Search | arXiv e-print repository

Hands-Free VR

Authors: Jorge Askur Vazquez Fernandez, Jae Joong Lee, Santiago Andrés Serrano Vacca, Alejandra Magana, Bedrich Benes, Voicu Popescu

Abstract: The paper introduces Hands-Free VR, a voice-based natural-language interface for VR. The user gives a command using their voice, the speech audio data is converted to text using a speech-to-text deep learning model that is fine-tuned for robustness to word phonetic similarity and to spoken English accents, and the text is mapped to an executable VR command using a large language model that is robu… ▽ More The paper introduces Hands-Free VR, a voice-based natural-language interface for VR. The user gives a command using their voice, the speech audio data is converted to text using a speech-to-text deep learning model that is fine-tuned for robustness to word phonetic similarity and to spoken English accents, and the text is mapped to an executable VR command using a large language model that is robust to natural language diversity. Hands-Free VR was evaluated in a controlled within-subjects study (N = 22) that asked participants to find specific objects and to place them in various configurations. In the control condition participants used a conventional VR user interface to grab, carry, and position the objects using the handheld controllers. In the experimental condition participants used Hands-Free VR. The results confirm that: (1) Hands-Free VR is robust to spoken English accents, as for 20 of our participants English was not their first language, and to word phonetic similarity, correctly transcribing the voice command 96.71% of the time; (2) Hands-Free VR is robust to natural language diversity, correctly map** the transcribed command to an executable command in 97.83% of the time; (3) Hands-Free VR had a significant efficiency advantage over the conventional VR interface in terms of task completion time, total viewpoint translation, total view direction rotation, and total left and right hand translations; (4) Hands-Free VR received high user preference ratings in terms of ease of use, intuitiveness, ergonomics, reliability, and desirability. △ Less

Submitted 22 February, 2024; originally announced February 2024.

arXiv:2306.06969 [pdf, other]

Viewpoint Generation using Feature-Based Constrained Spaces for Robot Vision Systems

Authors: Alejandro Magaña, Jonas Dirr, Philipp Bauer, Gunther Reinhart

Abstract: The efficient computation of viewpoints under consideration of various system and process constraints is a common challenge that any robot vision system is confronted with when trying to execute a vision task. Although fundamental research has provided solid and sound solutions for tackling this problem, a holistic framework that poses its formal description, considers the heterogeneity of robot v… ▽ More The efficient computation of viewpoints under consideration of various system and process constraints is a common challenge that any robot vision system is confronted with when trying to execute a vision task. Although fundamental research has provided solid and sound solutions for tackling this problem, a holistic framework that poses its formal description, considers the heterogeneity of robot vision systems, and offers an integrated solution remains unaddressed. Hence, this publication outlines the generation of viewpoints as a geometrical problem and introduces a generalized theoretical framework based on Feature-Based Constrained Spaces ($\mathcal{C}$-spaces) as the backbone for solving it. A $\mathcal{C}$-space can be understood as the topological space that a viewpoint constraint spans, where the sensor can be positioned for acquiring a feature while fulfilling the regarded constraint. The present study demonstrates that many viewpoint constraints can be efficiently formulated as $\mathcal{C}$-spaces providing geometric, deterministic, and closed solutions. The introduced $\mathcal{C}$-spaces are characterized based on generic domain and viewpoint constraints models to ease the transferability of the present framework to different applications and robot vision systems. The effectiveness and efficiency of the concepts introduced are verified on a simulation-based scenario and validated on a real robot vision system comprising two different sensors. △ Less

Submitted 12 June, 2023; originally announced June 2023.

arXiv:2108.05955 [pdf]

Using Machine Learning to Predict Engineering Technology Students' Success with Computer Aided Design

Authors: Jasmine Singh, Viranga Perera, Alejandra J. Magana, Brittany Newell, ** Wei-Kocsis, Ying Ying Seah, Greg J. Strimel, Charles Xie

Abstract: Computer-aided design (CAD) programs are essential to engineering as they allow for better designs through low-cost iterations. While CAD programs are typically taught to undergraduate students as a job skill, such software can also help students learn engineering concepts. A current limitation of CAD programs (even those that are specifically designed for educational purposes) is that they are no… ▽ More Computer-aided design (CAD) programs are essential to engineering as they allow for better designs through low-cost iterations. While CAD programs are typically taught to undergraduate students as a job skill, such software can also help students learn engineering concepts. A current limitation of CAD programs (even those that are specifically designed for educational purposes) is that they are not capable of providing automated real-time help to students. To encourage CAD programs to build in assistance to students, we used data generated from students using a free, open source CAD software called Aladdin to demonstrate how student data combined with machine learning techniques can predict how well a particular student will perform in a design task. We challenged students to design a house that consumed zero net energy as part of an introductory engineering technology undergraduate course. Using data from 128 students, along with the scikit-learn Python machine learning library, we tested our models using both total counts of design actions and sequences of design actions as inputs. We found that our models using early design sequence actions are particularly valuable for prediction. Our logistic regression model achieved a >60% chance of predicting if a student would succeed in designing a zero net energy house. Our results suggest that it would be feasible for Aladdin to provide useful feedback to students when they are approximately halfway through their design. Further improvements to these models could lead to earlier predictions and thus provide students feedback sooner to enhance their learning. △ Less

Submitted 12 August, 2021; originally announced August 2021.

arXiv:2106.02280 [pdf, other]

Human-Adversarial Visual Question Answering

Authors: Sasha Sheng, Amanpreet Singh, Vedanuj Goswami, Jose Alberto Lopez Magana, Wojciech Galuba, Devi Parikh, Douwe Kiela

Abstract: Performance on the most commonly used Visual Question Answering dataset (VQA v2) is starting to approach human accuracy. However, in interacting with state-of-the-art VQA models, it is clear that the problem is far from being solved. In order to stress test VQA models, we benchmark them against human-adversarial examples. Human subjects interact with a state-of-the-art VQA model, and for each imag… ▽ More Performance on the most commonly used Visual Question Answering dataset (VQA v2) is starting to approach human accuracy. However, in interacting with state-of-the-art VQA models, it is clear that the problem is far from being solved. In order to stress test VQA models, we benchmark them against human-adversarial examples. Human subjects interact with a state-of-the-art VQA model, and for each image in the dataset, attempt to find a question where the model's predicted answer is incorrect. We find that a wide range of state-of-the-art models perform poorly when evaluated on these examples. We conduct an extensive analysis of the collected adversarial examples and provide guidance on future research directions. We hope that this Adversarial VQA (AdVQA) benchmark can help drive progress in the field and advance the state of the art. △ Less

Submitted 4 June, 2021; originally announced June 2021.

Comments: 22 pages, 13 figures. First two authors contributed equally

arXiv:2007.04495 [pdf, other]

Hack.VR: A Programming Game in Virtual Reality

Authors: Dominic Kao, Christos Mousas, Alejandra J. Magana, D. Fox Harrell, Rabindra Ratan, Edward F. Melcer, Brett Sherrick, Paul Parsons, Dmitri A. Gusev

Abstract: In this article we describe Hack.VR, an object-oriented programming game in virtual reality. Hack.VR uses a VR programming language in which nodes represent functions and node connections represent data flow. Using this programming framework, players reprogram VR objects such as elevators, robots, and switches. Hack.VR has been designed to be highly interactable both physically and semantically. In this article we describe Hack.VR, an object-oriented programming game in virtual reality. Hack.VR uses a VR programming language in which nodes represent functions and node connections represent data flow. Using this programming framework, players reprogram VR objects such as elevators, robots, and switches. Hack.VR has been designed to be highly interactable both physically and semantically. △ Less

Submitted 23 November, 2020; v1 submitted 8 July, 2020; originally announced July 2020.

Showing 1–5 of 5 results for author: Magana, A