-
CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark
Authors:
David Romero,
Chenyang Lyu,
Haryo Akbarianto Wibowo,
Teresa Lynn,
Injy Hamed,
Aditya Nanda Kishore,
Aishik Mandal,
Alina Dragonetti,
Artem Abzaliev,
Atnafu Lambebo Tonja,
Bontu Fufa Balcha,
Chenxi Whitehouse,
Christian Salamea,
Dan John Velasco,
David Ifeoluwa Adelani,
David Le Meur,
Emilio Villa-Cueva,
Fajri Koto,
Fauzan Farooqui,
Frederico Belcavello,
Ganzorig Batnasan,
Gisela Vallejo,
Grainne Caulfield,
Guido Ivetta,
Haiyue Song
, et al. (50 additional authors not shown)
Abstract:
Visual Question Answering (VQA) is an important task in multimodal AI, and it is often used to test the ability of vision-language models to understand and reason on knowledge present in both visual and textual data. However, most of the current VQA models use datasets that are primarily focused on English and a few major world languages, with images that are typically Western-centric. While recent efforts have tried to increase the number of languages covered on VQA datasets, they still lack diversity in low-resource languages. More importantly, although these datasets often extend their linguistic range via translation or some other approaches, they usually keep images the same, resulting in narrow cultural representation. To address these limitations, we construct CVQA, a new Culturally-diverse multilingual Visual Question Answering benchmark, designed to cover a rich set of languages and cultures, where we engage native speakers and cultural experts in the data collection process. As a result, CVQA includes culturally-driven images and questions from across 28 countries on four continents, covering 26 languages with 11 scripts, providing a total of 9k questions. We then benchmark several Multimodal Large Language Models (MLLMs) on CVQA, and show that the dataset is challenging for the current state-of-the-art models. This benchmark can serve as a probing evaluation suite for assessing the cultural capability and bias of multimodal models and hopefully encourage more research efforts toward increasing cultural awareness and linguistic diversity in this field.
Submitted 9 June, 2024;
originally announced June 2024.
-
Bi-VLA: Vision-Language-Action Model-Based System for Bimanual Robotic Dexterous Manipulations
Authors:
Koffivi Fidèle Gbagbe,
Miguel Altamirano Cabrera,
Ali Alabbas,
Oussama Alyunes,
Artem Lykov,
Dzmitry Tsetserukou
Abstract:
This research introduces the Bi-VLA (Vision-Language-Action) model, a novel system designed for bimanual robotic dexterous manipulations that seamlessly integrate vision, language understanding, and physical action. The system's functionality was evaluated through a set of household tasks, including the preparation of a desired salad upon human request. Bi-VLA demonstrates the ability to interpret complex human instructions, perceive and understand the visual context of ingredients, and execute precise bimanual actions to assemble the requested salad. Through a series of experiments, we evaluate the system's performance in terms of accuracy, efficiency, and adaptability to various salad recipes and human preferences. Our results indicate a high success rate of 100% in generating the correct executable code by the Language module from the user-requested tasks. The Vision Module achieved a success rate of 96.06% in detecting specific ingredients and an 83.4% success rate in detecting a list of multiple ingredients.
Submitted 9 May, 2024;
originally announced May 2024.
-
Robots Can Feel: LLM-based Framework for Robot Ethical Reasoning
Authors:
Artem Lykov,
Miguel Altamirano Cabrera,
Koffivi Fidèle Gbagbe,
Dzmitry Tsetserukou
Abstract:
This paper presents the development of a novel ethical reasoning framework for robots. "Robots Can Feel" is the first system for robots that utilizes a combination of logic and human-like emotion simulation to make decisions in morally complex situations akin to humans. The key feature of the approach is the management of the Emotion Weight Coefficient - a customizable parameter to assign the role of emotions in robot decision-making. The system aims to serve as a tool that can equip robots of any form and purpose with ethical behavior close to human standards. Besides the platform, the system is independent of the choice of the base model. During the evaluation, the system was tested on 8 top up-to-date LLMs (Large Language Models). This list included both commercial and open-source models developed by various companies and countries. The research demonstrated that regardless of the model choice, the Emotion Weight Coefficient influences the robot's decision similarly. According to ANOVA analysis, the use of different Emotion Weight Coefficients influenced the final decision in a range of situations, such as in a request for a dietary violation, F(4, 35) = 11.2, p = 0.0001, and in an animal compassion situation, F(4, 35) = 8.5441, p = 0.0001. A demonstration code repository is provided at: https://github.com/TemaLykov/robots_can_feel
Submitted 9 May, 2024;
originally announced May 2024.
-
MoveTouch: Robotic Motion Capturing System with Wearable Tactile Display to Achieve Safe HRI
Authors:
Ali Alabbas,
Miguel Altamirano Cabrera,
Mohamed Sayed,
Oussama Alyounes,
Qian Liu,
Dzmitry Tsetserukou
Abstract:
The collaborative robot market is flourishing as there is a trend towards simplification, modularity, and increased flexibility on the production line. But when humans and robots are collaborating in a shared environment, the safety of humans should be a priority. We introduce a novel wearable robotic system to enhance safety during Human-Robot Interaction (HRI). The proposed wearable robot is designed to hold a fiducial marker and maintain its visibility to a motion capture system, which, in turn, localizes the user's hand with good accuracy and low latency and provides vibrotactile feedback to the user's wrist. The vibrotactile feedback guides the user's hand movement during collaborative tasks in order to increase safety and enhance collaboration efficiency. A user study was conducted to assess the recognition and discriminability of ten designed vibration patterns applied to the upper (dorsal) and lower (volar) parts of the user's wrist. The results show that the pattern recognition rate on the volar side was higher, with an average of 75.64% among all users. Four patterns with a high recognition rate were chosen to be incorporated into our system. A second experiment was carried out to evaluate users' response to the chosen patterns in real-world collaborative tasks. Results show that all participants responded to the patterns correctly, and the average response time for the patterns was between 0.24 and 2.41 seconds.
Submitted 5 July, 2024; v1 submitted 8 May, 2024;
originally announced May 2024.
-
A Cross-Platform Execution Engine for the Quantum Intermediate Representation
Authors:
Elaine Wong,
Vicente Leyton Ortega,
Daniel Claudino,
Seth Johnson,
Sharmin Afrose,
Meenambika Gowrishankar,
Anthony M. Cabrera,
Travis S. Humble
Abstract:
Hybrid languages like the Quantum Intermediate Representation (QIR) are essential for programming systems that mix quantum and conventional computing models, while execution of these programs is often deferred to a system-specific implementation. Here, we describe and demonstrate the QIR Execution Engine (QIR-EE) for parsing, interpreting, and executing QIR across multiple hardware platforms. QIR-EE uses LLVM to execute hybrid instructions specifying quantum programs and, by design, presents extension points that support customized runtime and hardware environments. We demonstrate an implementation that uses the XACC quantum hardware-accelerator library to dispatch prototypical quantum programs on different commercial quantum platforms and numerical simulators, and we validate execution of QIR-EE on the IonQ Harmony and Quantinuum H1-1 hardware. Our results highlight the efficiency of hybrid executable architectures for handling mixed instructions, managing mixed data, and integrating with quantum computing frameworks to realize cross-platform execution.
Submitted 22 April, 2024;
originally announced April 2024.
-
DogSurf: Quadruped Robot Capable of GRU-based Surface Recognition for Blind Person Navigation
Authors:
Artem Bazhenov,
Vladimir Berman,
Sergei Satsevich,
Olga Shalopanova,
Miguel Altamirano Cabrera,
Artem Lykov,
Dzmitry Tsetserukou
Abstract:
This paper introduces DogSurf - a new approach of using quadruped robots to help visually impaired people navigate in the real world. The presented method allows the quadruped robot to detect slippery surfaces, and to use audio and haptic feedback to inform the user when to stop. A state-of-the-art GRU-based neural network architecture with mean accuracy of 99.925% was proposed for the task of multiclass surface classification for quadruped robots. A dataset was collected on a Unitree Go1 Edu robot. The dataset and code have been posted to the public domain.
Submitted 5 February, 2024;
originally announced February 2024.
-
CognitiveOS: Large Multimodal Model based System to Endow Any Type of Robot with Generative AI
Authors:
Artem Lykov,
Mikhail Konenkov,
Koffivi Fidèle Gbagbe,
Mikhail Litvinov,
Denis Davletshin,
Aleksey Fedoseev,
Miguel Altamirano Cabrera,
Robinroy Peter,
Dzmitry Tsetserukou
Abstract:
This paper introduces CognitiveOS, the first operating system designed for cognitive robots capable of functioning across diverse robotic platforms. CognitiveOS is structured as a multi-agent system comprising modules built upon a transformer architecture, facilitating communication through an internal monologue format. These modules collectively empower the robot to tackle intricate real-world tasks. The paper delineates the operational principles of the system along with descriptions of its nine distinct modules. The modular design endows the system with distinctive advantages over traditional end-to-end methodologies, notably in terms of adaptability and scalability. The system's modules are configurable, modifiable, or deactivatable depending on the task requirements, while new modules can be seamlessly integrated. This system serves as a foundational resource for researchers and developers in the cognitive robotics domain, alleviating the burden of constructing a cognitive robot system from scratch. Experimental findings demonstrate the system's advanced task comprehension and adaptability across varied tasks, robotic platforms, and module configurations, underscoring its potential for real-world applications. Moreover, in the category of Reasoning it outperformed CognitiveDog (by 15%) and RT2 (by 31%), achieving the highest to date rate of 77%. We provide a code repository and dataset for the replication of CognitiveOS: link will be provided in camera-ready submission.
Submitted 19 March, 2024; v1 submitted 29 January, 2024;
originally announced January 2024.
-
Trustworthy human-centric based Automated Decision-Making Systems
Authors:
Marcelino Cabrera,
Carlos Cruz,
Pavel Novoa-Hernández,
David A. Pelta,
José Luis Verdegay
Abstract:
Automated Decision-Making Systems (ADS) have become pervasive across various fields, activities, and occupations, to enhance performance. However, this widespread adoption introduces potential risks, including the misuse of ADS. Such misuse may manifest when ADS is employed in situations where it is unnecessary or when essential requirements, conditions, and terms are overlooked, leading to unintended consequences. This research paper presents a thorough examination of the implications, distinctions, and ethical considerations associated with digitalization, digital transformation, and the utilization of ADS in contemporary society and future contexts. Emphasis is placed on the imperative need for regulation, transparency, and ethical conduct in the deployment of ADS.
Submitted 22 December, 2023;
originally announced January 2024.
-
TeslaCharge: Smart Robotic Charger Driven by Impedance Control and Human Haptic Patterns
Authors:
Oussama Alyounes,
Miguel Altamirano Cabrera,
Dzmitry Tsetserukou
Abstract:
The growing demand for electric vehicles requires the development of automated car charging methods. At the moment, the process of charging an electric car is completely manual, and that requires physical effort to accomplish the task, which is not suitable for people with disabilities. Typically, the effort in the research is focused on detecting the position and orientation of the socket, which resulted in a relatively high accuracy, $\pm 5$ mm and $\pm 10^\circ$. However, this accuracy is not enough to complete the charging process. In this work, we focus on designing a novel methodology for robust robotic plug-in and plug-out based on human haptics, to overcome the error in the position and orientation of the socket. Participants were invited to perform the charging task, and their cognitive capabilities were recognized by measuring the applied forces along with the movement of the charger. Three controllers were designed based on impedance control to mimic the human patterns of charging an electric car. The recorded data from humans were used to calibrate the parameters of the impedance controllers: inertia $M_d$, damping $D_d$, and stiffness $K_d$. A robotic validation was performed, where the designed controllers were applied to the robot UR10. Using the proposed controllers and the human kinesthetic data, it was possible to successfully automate the operation of charging an electric car.
Submitted 18 October, 2023;
originally announced October 2023.
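The TeslaCharge abstract names three impedance parameters: inertia M_d, damping D_d, and stiffness K_d. As a rough illustration of what those parameters govern, the sketch below integrates the standard one-dimensional impedance relation M_d*a + D_d*v + K_d*x = F_ext under a constant external force. The function name, the numeric values, the time step, and the reduction to a single axis are all assumptions made for illustration; the abstract does not give the authors' controller design or calibrated values.

```python
def simulate_impedance(m_d, d_d, k_d, f_ext, dt=0.001, steps=5000):
    """Integrate M_d*a + D_d*v + K_d*x = f_ext with semi-implicit Euler.

    Hypothetical 1-DOF sketch: x is the displacement of the end-effector
    from its reference position along one axis, f_ext a constant contact force.
    """
    x, v = 0.0, 0.0
    trajectory = []
    for _ in range(steps):
        # Acceleration from the impedance relation, solved for a.
        a = (f_ext - d_d * v - k_d * x) / m_d
        v += a * dt   # update velocity first (semi-implicit Euler)
        x += v * dt   # then position, using the new velocity
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    # Illustrative values only: critically damped, since D_d = 2*sqrt(K_d*M_d).
    traj = simulate_impedance(m_d=1.0, d_d=20.0, k_d=100.0, f_ext=5.0)
    print(f"final displacement: {traj[-1]:.4f} m")
```

At steady state the velocity and acceleration terms vanish, so the displacement settles at F_ext / K_d. A lower stiffness therefore lets the end-effector yield further under the same contact force, which is the kind of compliance that helps absorb socket misalignment.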
-
HaptiCharger: Robotic Charging of Electric Vehicles Based on Human Haptic Patterns
Authors:
Oussama Alyounes,
Miguel Altamirano Cabrera,
Dzmitry Tsetserukou
Abstract:
The growing demand for electric vehicles requires the development of automated car charging methods. At the moment, the process of charging an electric car is completely manual, and that requires physical effort to accomplish the task, which is not suitable for people with disabilities. Typically, the effort in the automation of the charging task research is focused on detecting the position and orientation of the socket, which resulted in a relatively high accuracy, 5 mm, and 10 degrees. However, this accuracy is not enough to complete the charging process. In this work, we focus on designing a novel methodology for robust robotic plug-in and plug-out based on human haptics to overcome the error in the orientation of the socket. Participants were invited to perform the charging task, and their cognitive capabilities were recognized by measuring the applied forces along with the movements of the charger. Eventually, an algorithm was developed based on the human's best strategies to be applied to a robotic arm.
Submitted 10 November, 2023; v1 submitted 13 October, 2023;
originally announced October 2023.
-
ArUcoGlide: a Novel Wearable Robot for Position Tracking and Haptic Feedback to Increase Safety During Human-Robot Interaction
Authors:
Ali Alabbas,
Miguel Altamirano Cabrera,
Oussama Alyounes,
Dzmitry Tsetserukou
Abstract:
The current capabilities of robotic systems make human collaboration necessary to accomplish complex tasks effectively. In this work, we are introducing a framework to ensure safety in a human-robot collaborative environment. The system is composed of a wearable 2-DOF robot, a low-cost and easy-to-install tracking system, and a collision avoidance algorithm based on the Artificial Potential Field (APF). The wearable robot is designed to hold a fiducial marker and maintain its visibility to the tracking system, which, in turn, localizes the user's hand with good accuracy and low latency and provides haptic feedback to the user. The system is designed to enhance the performance of collaborative tasks while ensuring user safety. Three experiments were carried out to evaluate the performance of the proposed system. The first one evaluated the accuracy of the tracking system. The second experiment analyzed human-robot behavior during an imminent collision. The third experiment evaluated the system in a collaborative activity in a shared working environment. The results show that the implementation of the introduced system reduces the operation time by 16% and increases the average distance between the user's hand and the robot by 5 cm.
Submitted 17 July, 2023;
originally announced July 2023.
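The ArUcoGlide abstract mentions a collision avoidance algorithm based on the Artificial Potential Field (APF). The sketch below shows the textbook form of APF only: an attractive force pulling toward a goal plus a repulsive force that acts within an influence radius of an obstacle. The function name, the gains k_att and eta, the radius d0, and the 2D setting are assumptions for illustration, not the authors' implementation.

```python
import math


def apf_force(pos, goal, obstacle, k_att=1.0, eta=0.5, d0=0.3):
    """Return the (fx, fy) force from a classic artificial potential field.

    Hypothetical 2D sketch: pos could be the robot end-effector, obstacle
    the tracked hand position, and d0 the radius within which repulsion acts.
    """
    # Attractive term: negative gradient of 0.5 * k_att * |pos - goal|^2.
    fx = -k_att * (pos[0] - goal[0])
    fy = -k_att * (pos[1] - goal[1])

    dx, dy = pos[0] - obstacle[0], pos[1] - obstacle[1]
    d = math.hypot(dx, dy)
    if 0.0 < d < d0:
        # Repulsive term: negative gradient of 0.5 * eta * (1/d - 1/d0)^2,
        # directed away from the obstacle.
        mag = eta * (1.0 / d - 1.0 / d0) / d ** 2
        fx += mag * dx / d
        fy += mag * dy / d
    return fx, fy


if __name__ == "__main__":
    print(apf_force((0.0, 0.0), (1.0, 0.0), (5.0, 5.0)))   # obstacle out of range
    print(apf_force((0.0, 0.0), (1.0, 0.0), (0.1, 0.0)))   # obstacle blocks the path
```

Outside the influence radius only the attractive term acts. Inside it, the repulsive term grows rapidly as the distance shrinks, so it dominates and pushes the controlled point away from the obstacle.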
-
MorphoArms: Morphogenetic Teleoperation of Multimanual Robot
Authors:
Mikhail Martynov,
Zhanibek Darush,
Miguel Altamirano Cabrera,
Sausar Karaf,
Dzmitry Tsetserukou
Abstract:
Nowadays, there are few unmanned aerial vehicles (UAVs) capable of flying, walking and grasping. A drone with all these functionalities can significantly improve its performance in complex tasks such as monitoring and exploring different types of terrain, and rescue operations. This paper presents MorphoArms, a novel system that consists of a morphogenetic chassis and a hand gesture recognition teleoperation system. The mechanics, electronics, control architecture, and walking behavior of the morphogenetic chassis are described. This robot is capable of walking and grasping objects using four robotic limbs. Robotic limbs with four degrees-of-freedom are used as pedipulators when walking and as manipulators when performing actions in the environment. The robot control system is implemented using teleoperation, where commands are given by hand gestures. A motion capture system is used to track the user's hands and to recognize their gestures. The method of controlling the robot was experimentally tested in a study involving 10 users. The evaluation included three questionnaires (NASA TLX, SUS, and UEQ). The results showed that the proposed system was more user-friendly than 56% of the systems, and it was rated above average in terms of attractiveness, stimulation, and novelty.
Submitted 6 July, 2023;
originally announced July 2023.
-
DroneARchery: Human-Drone Interaction through Augmented Reality with Haptic Feedback and Multi-UAV Collision Avoidance Driven by Deep Reinforcement Learning
Authors:
Ekaterina Dorzhieva,
Ahmed Baza,
Ayush Gupta,
Aleksey Fedoseev,
Miguel Altamirano Cabrera,
Ekaterina Karmanova,
Dzmitry Tsetserukou
Abstract:
We propose a novel concept of augmented reality (AR) human-drone interaction driven by RL-based swarm behavior to achieve intuitive and immersive control of a swarm formation of unmanned aerial vehicles. The DroneARchery system developed by us allows the user to quickly deploy a swarm of drones, generating flight paths simulating archery. The haptic interface LinkGlide delivers a tactile stimulus of the bowstring tension to the forearm to increase the precision of aiming. The swarm of released drones dynamically avoids collisions between each other, the drone following the user, and external obstacles with behavior control based on deep reinforcement learning.
The developed concept was tested in the scenario with a human, where the user shoots from a virtual bow with a real drone to hit the target. The human operator observes the ballistic trajectory of the drone in an AR and achieves a realistic and highly recognizable experience of the bowstring tension through the haptic display.
The experimental results revealed that the system improves trajectory prediction accuracy by 63.3% through applying AR technology and conveying haptic feedback of pulling force. DroneARchery users highlighted the naturalness (4.3 on a 5-point Likert scale) and increased confidence (4.7 out of 5) when controlling the drone. We have designed the tactile patterns to present four sliding distances (tension) and three applied force levels (stiffness) of the haptic display. Users demonstrated the ability to distinguish tactile patterns produced by the haptic display representing varying bowstring tension (average recognition rate of 72.8%) and stiffness (average recognition rate of 94.2%).
The novelty of the research is the development of an AR-based approach for drone control that does not require special skills and training from the operator.
Submitted 14 October, 2022;
originally announced October 2022.
-
Exploring the Role of Electro-Tactile and Kinesthetic Feedback in Telemanipulation Task
Authors:
Daria Trinitatova,
Miguel Altamirano Cabrera,
Polina Ponomareva,
Aleksey Fedoseev,
Dzmitry Tsetserukou
Abstract:
Teleoperation of robotic systems for precise and delicate object grasping requires high-fidelity haptic feedback to obtain comprehensive real-time information about the grasp. In such cases, the most common approach is to use kinesthetic feedback. However, a single contact point information is insufficient to detect the dynamically changing shape of soft objects. This paper proposes a novel telemanipulation system that provides kinesthetic and cutaneous stimuli to the user's hand to achieve accurate liquid dispensing by dexterously manipulating the deformable object (i.e., pipette). The experimental results revealed that the proposed approach to provide the user with multimodal haptic feedback considerably improves the quality of dosing with a remote pipette. Compared with pure visual feedback, the relative dosing error decreased by 66% and task execution time decreased by 18% when users manipulated the deformable pipette with a multimodal haptic interface in combination with visual feedback. The proposed technology can be potentially implemented in delicate dosing procedures during the antibody tests for COVID-19, chemical experiments, operation with organic materials, and telesurgery.
Submitted 30 August, 2022;
originally announced August 2022.
-
LinkGlide-S: A Wearable Multi-Contact Tactile Display Aimed at Rendering Object Softness at the Palm with Impedance Control in VR and Telemanipulation
Authors:
Miguel Altamirano Cabrera,
Jonathan Tirado,
Juan Heredia,
Dzmitry Tsetserukou
Abstract:
LinkGlide-S is a novel wearable hand-worn tactile display to deliver multi-contact and multi-modal stimuli at the user's palm. The array of inverted five-bar linkages generates three independent contact points to cover the whole palm area. The independent contact points generate various tactile patterns at the user's hand, providing multi-contact tactile feedback. An impedance control delivers the stiffness of objects according to different parameters. Three experiments were performed to evaluate the perception of patterns, investigate the realistic perception of object interaction in Virtual Reality, and assess the users' softness perception by the impedance control. The experimental results revealed a high recognition rate for the generated patterns. These results confirm that the performance of LinkGlide-S is adequate to detect and manipulate virtual objects with different stiffness. This novel haptic device can potentially achieve a highly immersive VR experience and more interactive applications during telemanipulation.
Submitted 30 August, 2022;
originally announced August 2022.
-
DogTouch: CNN-based Recognition of Surface Textures by Quadruped Robot with High Density Tactile Sensors
Authors:
Nipun Dhananjaya Weerakkodi Mudalige,
Elena Nazarova,
Ildar Babataev,
Pavel Kopanev,
Aleksey Fedoseev,
Miguel Altamirano Cabrera,
Dzmitry Tsetserukou
Abstract:
The ability to perform locomotion in various terrains is critical for legged robots. However, the robot has to have a better understanding of the surface it is walking on to perform robust locomotion on different terrains. Animals and humans are able to recognize the surface with the help of the tactile sensation on their feet. Nevertheless, foot tactile sensation for legged robots has not been much explored. This paper presents research on a novel quadruped robot DogTouch with tactile sensing feet (TSF). TSF allows the recognition of different surface textures utilizing a tactile sensor and a convolutional neural network (CNN). The experimental results show a sufficient validation accuracy of 74.37% for our trained CNN-based model, with the highest recognition for line patterns of 90%. In the future, we plan to improve the prediction model by presenting surface samples with various depths of patterns and applying advanced Deep Learning and Shallow learning models for surface recognition.
Additionally, we propose a novel approach to navigation of quadruped and legged robots. We can arrange the tactile paving textured surface (similar to that used for blind or visually impaired people). Thus, DogTouch will be capable of locomotion in an unknown environment by just recognizing the specific tactile patterns which will indicate the straight path, left or right turn, pedestrian crossing, road, etc. That will allow robust navigation regardless of lighting conditions. Future quadruped robots equipped with visual and tactile perception systems will be able to safely and intelligently navigate and interact in unstructured indoor and outdoor environments.
Submitted 9 June, 2022;
originally announced June 2022.
-
CoboGuider: Haptic Potential Fields for Safe Human-Robot Interaction
Authors:
Viktor Rakhmatulin,
Miguel Altamirano Cabrera,
Fikre Hagos,
Oleg Sautenkov,
Jonathan Tirado,
Ighor Uzhinsky,
Dzmitry Tsetserukou
Abstract:
Modern industry still relies on manual manufacturing operations, and safe human-robot interaction is of great interest nowadays. Speed and Separation Monitoring (SSM) allows close and efficient collaborative scenarios by maintaining a protective separation distance during robot operation. The paper focuses on a novel approach to strengthen the SSM safety requirements by introducing haptic feedback to a robotic cell worker. Tactile stimuli provide early warning of dangerous movements and proximity to the robot, based on the human reaction time and the instantaneous velocities of the robot and operator. A preliminary experiment was performed to identify the reaction time of participants when they are exposed to tactile stimuli in a collaborative environment with controlled conditions. In a second experiment, we evaluated our approach in a case study in which a human worker and a cobot performed collaborative planetary gear assembly. Results show that the applied approach increased the average minimum distance between the robot's end-effector and the hand by 44% compared to the operator relying only on visual feedback. Moreover, the participants without haptic support failed several times to maintain the protective separation distance.
Submitted 16 December, 2021; v1 submitted 25 October, 2021;
originally announced October 2021.
-
CoHaptics: Development of Human-Robot Collaborative System with Forearm-worn Haptic Display to Increase Safety in Future Factories
Authors:
Miguel Altamirano Cabrera,
Juan Heredia,
Jonathan Tirado,
Vladislav Panov,
Fikre Hagos,
Dzmitry Tsetserukou
Abstract:
Complex tasks require human collaboration since robots do not have enough dexterity. However, robots are still used as instruments and not as collaborative systems. We introduce a framework to ensure safety in a human-robot collaborative environment. The system is composed of a haptic feedback display, low-cost wearable mocap, and a new collision avoidance algorithm based on Artificial Potential Fields (APF). The wearable optical motion capture system enables tracking of the human hand position with high accuracy and low latency over large working areas. This study evaluates whether haptic feedback improves safety in human-robot collaboration. Three experiments were carried out to evaluate the performance of the proposed system. The first one evaluated human responses to the haptic device during interaction with the Robot Tool Center Point (TCP). The second experiment analyzed human-robot behavior during an imminent collision. The third experiment evaluated the system in a collaborative activity in a shared working environment. The study showed that when haptic feedback was included in the control loop, the safe distance (minimum robot-obstacle distance) increased by 4.1 cm from 12.39 cm to 16.55 cm, and the robot's path, when the collision avoidance algorithm was activated, was reduced by 81%.
Submitted 13 September, 2021;
originally announced September 2021.
-
DroneTrap: Drone Catching in Midair by Soft Robotic Hand with Color-Based Force Detection and Hand Gesture Recognition
Authors:
Aleksey Fedoseev,
Valerii Serpiva,
Ekaterina Karmanova,
Miguel Altamirano Cabrera,
Vladimir Shirokun,
Iakov Vasilev,
Stanislav Savushkin,
Dzmitry Tsetserukou
Abstract:
The paper proposes a novel concept of docking drones to make this process as safe and fast as possible. The idea behind the project is that a robot with a soft gripper grasps the drone in midair. The human operator navigates the robotic arm with an ML-based gesture recognition interface. The 3-finger robot hand with soft fingers is equipped with touch sensors, making it possible to achieve safe drone catching and avoid inadvertent damage to the drone's propellers and motors. Additionally, the soft hand features a unique color-based force estimation technology based on a computer vision (CV) system. Moreover, the visual color-changing system makes it easier for the human operator to interpret the applied forces.
Without any additional programming, the operator has full real-time control of the robot's motion and task execution by wearing a mocap glove with gesture recognition, which was developed and applied for the high-level control of DroneTrap. The experimental results revealed that the developed color-based force estimation can be applied for rigid object capturing with high precision (95.3\%). The proposed technology can potentially revolutionize the landing and deployment of drones for parcel delivery on uneven ground, structure maintenance and inspection, risky operations, etc.
Submitted 4 August, 2021; v1 submitted 7 February, 2021;
originally announced February 2021.
-
Affordance-Aware Handovers with Human Arm Mobility Constraints
Authors:
Paola Ardón,
Maria E. Cabrera,
Èric Pairet,
Ronald P. A. Petrick,
Subramanian Ramamoorthy,
Katrin S. Lohan,
Maya Cakmak
Abstract:
Reasoning about object handover configurations allows an assistive agent to estimate the appropriateness of handover for a receiver with different arm mobility capacities. While there are existing approaches for estimating the effectiveness of handovers, their findings are limited to users without arm mobility impairments and to specific objects. Therefore, current state-of-the-art approaches are unable to hand over novel objects to receivers with different arm mobility capacities. We propose a method that generalises handover behaviours to previously unseen objects, subject to the constraint of a user's arm mobility levels and the task context. We propose a heuristic-guided hierarchically optimised cost whose optimisation adapts object configurations for receivers with low arm mobility. This also ensures that the robot grasps consider the context of the user's upcoming task, i.e., the usage of the object. To understand preferences over handover configurations, we report on the findings of an online study, wherein we presented different handover methods, including ours, to $259$ users with different levels of arm mobility. We find that people's preferences over handover methods are correlated to their arm mobility capacities. We encapsulate these preferences in a statistical relational model (SRL) that is able to reason about the most suitable handover configuration given a receiver's arm mobility and upcoming task. Using our SRL model, we obtained an average handover accuracy of $90.8\%$ when generalising handovers to novel objects.
Submitted 16 February, 2021; v1 submitted 29 October, 2020;
originally announced October 2020.
-
CobotGear: Interaction with Collaborative Robots using Wearable Optical Motion Capturing Systems
Authors:
Juan Heredia,
Miguel Altamirano Cabrera,
Jonathan Tirado,
Vladislav Panov,
Dzmitry Tsetserukou
Abstract:
In industrial applications, complex tasks require human collaboration since the robot doesn't have enough dexterity. However, robots are still implemented as tools and not as collaborative intelligent systems. To ensure safety in human-robot collaboration, we introduce a system that integrates low-cost wearable mocap and an improved collision avoidance algorithm based on artificial potential fields. Wearable optical motion capture allows tracking of the human hand position with high accuracy and low latency over large working areas. To increase the efficiency of the proposed algorithm, two obstacle types are discriminated according to their collision probability. A preliminary experiment was performed to analyze the algorithm behavior and to select the best values for the obstacle's threshold angle $\theta_{OBS}$ and for the avoidance threshold distance $d_{AT}$. The second experiment was carried out to evaluate the system performance with $d_{AT}$ = 0.2 m and $\theta_{OBS}$ = 45 degrees. The third experiment evaluated the system in a real collaborative task. The results demonstrate the robust performance of the robotic arm generating smooth collision-free trajectories. The proposed technology will allow consumer robots to safely collaborate with humans in cluttered environments, e.g., factories, kitchens, living rooms, and restaurants.
Submitted 20 July, 2020;
originally announced July 2020.
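The CobotGear abstract above mentions collision avoidance based on artificial potential fields with an avoidance threshold distance of 0.2 m. As a rough illustration only, the sketch below uses the textbook potential-field formulation (an attractive pull toward a goal plus a repulsive push that is active only inside a threshold distance). It is not the authors' improved algorithm: treating $d_{AT}$ as the influence radius of the repulsive term, the gains `k_att` and `eta`, and all function names are assumptions made for this example, and the obstacle threshold angle is not modelled.

```python
import math

# Avoidance threshold distance in metres. The value 0.2 is quoted in the
# abstract; using it as the repulsive influence radius is an assumption.
D_AT = 0.2


def attractive_force(pos, goal, k_att=1.0):
    """Linear pull toward the goal (k_att is a hypothetical gain)."""
    return tuple(k_att * (g - p) for p, g in zip(pos, goal))


def repulsive_force(pos, obstacle, d_at=D_AT, eta=0.01):
    """Classic repulsive term: zero outside d_at, growing as d shrinks."""
    diff = tuple(p - o for p, o in zip(pos, obstacle))
    d = math.sqrt(sum(c * c for c in diff))
    if d >= d_at or d == 0.0:
        # Outside the threshold (or exactly on the obstacle): no push.
        return tuple(0.0 for _ in pos)
    magnitude = eta * (1.0 / d - 1.0 / d_at) / (d * d)
    # Direct the push away from the obstacle along the unit vector diff / d.
    return tuple(magnitude * c / d for c in diff)


def total_force(pos, goal, obstacle):
    """Sum of the attractive and repulsive terms for a single obstacle."""
    fa = attractive_force(pos, goal)
    fr = repulsive_force(pos, obstacle)
    return tuple(a + r for a, r in zip(fa, fr))
```

In such a scheme the end-effector would be stepped along `total_force` at each control cycle, with the tracked hand position supplied as `obstacle`. The repulsive term vanishes as soon as the hand is farther than `d_at` from the tool.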
-
Tactile Perception of Objects by the User's Palm for the Development of Multi-contact Wearable Tactile Displays
Authors:
Miguel Altamirano Cabrera,
Juan Heredia,
Dzmitry Tsetserukou
Abstract:
The user's palm plays an important role in object detection and manipulation. The design of a robust multi-contact tactile display must consider the sensation and perception of the stimulated area, aiming to deliver the right stimuli at the correct location. To the best of our knowledge, there is no study that obtains human palm data for this purpose. The objective of this work is to introduce a method to investigate the user's palm sensations during interaction with objects. An array of fifteen Force Sensitive Resistors (FSRs) was placed on the user's palm to measure the area of interaction and the normal force delivered to four different convex surfaces. Experimental results showed the active areas of the palm during the interaction with each of the surfaces at different forces. The obtained results can be applied in the development of multi-contact wearable tactile and haptic displays for the palm, and in training a machine-learning algorithm to predict stimuli, aiming to achieve a highly immersive experience in Virtual Reality.
Submitted 22 June, 2020;
originally announced June 2020.
-
Communication Modalities for Supervised Teleoperation in Highly Dexterous Tasks - Does one size fit all?
Authors:
Tian Zhou,
Maria E. Cabrera,
Juan P. Wachs
Abstract:
This study tries to explain the connection between communication modalities and levels of supervision in teleoperation during a dexterous task, such as surgery. This concept is applied to two surgery-related tasks: incision and peg transfer. It was found that as the complexity of the task escalates, the combination linking human supervision with a more expressive modality shows better performance than other combinations of modalities and control. More specifically, in the peg transfer task, the combination of speech modality and action-level supervision achieves shorter task completion time (77.1 ± 3.4 s) with fewer mistakes (0.20 ± 0.17 pegs dropped).
Submitted 17 April, 2017;
originally announced April 2017.
-
Coherency in One-Shot Gesture Recognition
Authors:
Maria Cabrera,
Richard Voyles,
Juan Wachs
Abstract:
Users' intentions may be expressed through spontaneous gestures, which may have been seen only a few times or never before. Recognizing such gestures involves one-shot gesture learning. While most research has focused on the recognition of the gestures themselves, new approaches were recently proposed to deal with gesture perception and production as part of the same problem. The framework presented in this work focuses on learning the process that leads to gesture generation, rather than mining the gesture's associated features. This is achieved using kinematic, cognitive and biomechanical characteristics of human interaction. These factors enable the artificial production of realistic gesture samples originating from a single observation. The generated samples are then used as training sets for different state-of-the-art classifiers. Performance is first measured by observing the machines' gesture recognition percentages. Then, performance is computed from human recognition of gestures performed by robots. Based on these two scenarios, a new composite metric of coherency is proposed, relating to the amount of agreement between these two conditions. Experimental results provide an average recognition performance of 89.2% for the trained classifiers and 92.5% for the participants. Coherency in recognition was determined at 93.6%. While this new metric is not directly comparable to raw accuracy or other pure performance-based standard metrics, it provides a quantifier for validating how realistic the machine-generated samples are and how accurate the resulting mimicry is.
Submitted 20 January, 2017;
originally announced January 2017.
-
What makes a gesture a gesture? Neural signatures involved in gesture recognition
Authors:
Maria Cabrera,
Keisha Novak,
Daniel Foti,
Richard Voyles,
Juan Wachs
Abstract:
Previous work in the area of gesture production has made the assumption that machines can replicate "human-like" gestures by connecting a bounded set of salient points in the motion trajectory. Those inflection points were hypothesized to also display cognitive saliency. The purpose of this paper is to validate that claim using electroencephalography (EEG). That is, this paper attempts to find neural signatures of gestures (also referred to as placeholders) in human cognition, which facilitate the understanding, learning and repetition of gestures. Further, it is discussed whether there is a direct mapping between the placeholders and kinematic salient points in the gesture trajectories. These are expressed as relationships between inflection points in the gestures' trajectories and oscillatory mu rhythms (8-12 Hz) in the EEG. This is achieved by correlating fluctuations in mu power during gesture observation with salient motion points found for each gesture. Peaks in the EEG signal at central electrodes (motor cortex) and occipital electrodes (visual cortex) were used to isolate the salient events within each gesture. We found that a linear model predicting mu peaks from motion inflections fits the data well. Increases in EEG power were detected 380 and 500 ms after inflection points at occipital and central electrodes, respectively. These results suggest that coordinated activity in visual and motor cortices is sensitive to motion trajectories during gesture observation, and they are consistent with the proposal that inflection points operate as placeholders in gesture recognition.
Submitted 20 January, 2017;
originally announced January 2017.
-
Integration of Rule Based Expert Systems and Case Based Reasoning in an Acute Bacterial Meningitis Clinical Decision Support System
Authors:
Mariana Maceiras Cabrera,
Ernesto Ocampo Edye
Abstract:
This article presents the results of research on the development of a medical diagnostic system for Acute Bacterial Meningitis, using the Case Based Reasoning (CBR) methodology. The research focused on the implementation of the adaptation stage through the integration of Case Based Reasoning and Rule Based Expert Systems. In this adaptation stage we use a higher-level CBR that stores and allows reusing change experiences, combined with a classic rule-based inference engine. In order to take into account the most evident clinical situations, a pre-diagnosis stage is implemented using a rule engine that, given an evident situation, emits the corresponding diagnosis and avoids the complete process.
Submitted 7 March, 2010;
originally announced March 2010.