-
Hands-Free VR
Authors:
Jorge Askur Vazquez Fernandez,
Jae Joong Lee,
Santiago Andrés Serrano Vacca,
Alejandra Magana,
Bedrich Benes,
Voicu Popescu
Abstract:
The paper introduces Hands-Free VR, a voice-based natural-language interface for VR. The user gives a command using their voice, the speech audio data is converted to text using a speech-to-text deep learning model that is fine-tuned for robustness to word phonetic similarity and to spoken English accents, and the text is mapped to an executable VR command using a large language model that is robust to natural language diversity. Hands-Free VR was evaluated in a controlled within-subjects study (N = 22) that asked participants to find specific objects and to place them in various configurations. In the control condition participants used a conventional VR user interface to grab, carry, and position the objects using the handheld controllers. In the experimental condition participants used Hands-Free VR. The results confirm that: (1) Hands-Free VR is robust to spoken English accents, as for 20 of our participants English was not their first language, and to word phonetic similarity, correctly transcribing the voice command 96.71% of the time; (2) Hands-Free VR is robust to natural language diversity, correctly mapping the transcribed command to an executable command 97.83% of the time; (3) Hands-Free VR had a significant efficiency advantage over the conventional VR interface in terms of task completion time, total viewpoint translation, total view direction rotation, and total left and right hand translations; (4) Hands-Free VR received high user preference ratings in terms of ease of use, intuitiveness, ergonomics, reliability, and desirability.
Submitted 22 February, 2024;
originally announced February 2024.
-
An Evaluation of OCR on Egocentric Data
Authors:
Valentin Popescu,
Dima Damen,
Toby Perrett
Abstract:
In this paper, we evaluate state-of-the-art OCR methods on Egocentric data. We annotate text in EPIC-KITCHENS images, and demonstrate that existing OCR methods struggle with rotated text, which is frequently observed on objects being handled. We introduce a simple rotate-and-merge procedure which can be applied to pre-trained OCR models that halves the normalized edit distance error. This suggests that future OCR attempts should incorporate rotation into model design and training procedures.
Submitted 11 June, 2022;
originally announced June 2022.
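The OCR abstract above reports its improvement as a reduction in normalized edit distance error. As a minimal sketch of that kind of metric, the block below computes Levenshtein distance and divides it by the length of the longer string. The normalization convention and the function names are assumptions for illustration; the abstract does not state which normalization the authors use.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a rolling one-row dynamic program."""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def normalized_edit_distance(pred: str, truth: str) -> float:
    """Edit distance divided by the longer length (assumed convention)."""
    longest = max(len(pred), len(truth))
    return edit_distance(pred, truth) / longest if longest else 0.0


if __name__ == "__main__":
    # Hypothetical OCR output compared against a ground-truth label.
    print(normalized_edit_distance("flovr", "flour"))
```

Under this convention a score of 0 means an exact transcription and 1 means no characters could be reused.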
-
Network modeling methods for precision medicine
Authors:
Elio Nushi,
Victor-Bogdan Popescu,
Jose Angel Sanchez Martin,
Sergiu Ivanov,
Eugen Czeizler,
Ion Petre
Abstract:
We discuss in this survey several network modeling methods and their applicability to precision medicine. We review several network centrality methods (degree centrality, closeness centrality, eccentricity centrality, betweenness centrality, and eigenvector-based prestige) and two systems controllability methods (minimum dominating sets and network structural controllability). We demonstrate their applicability to precision medicine on three multiple myeloma patient disease networks. Each network consists of protein-protein interactions built around a specific patient's mutated genes, around the targets of the drugs used in the standard of care in multiple myeloma, and around multiple myeloma-specific essential genes. For each network we demonstrate how the network methods we discuss can be used to identify personalized, targeted drug combinations uniquely suited to that patient.
Submitted 19 April, 2021;
originally announced April 2021.
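Three of the centrality measures named in the survey abstract above (degree, closeness, and eccentricity) can be illustrated on a toy graph. This is a minimal sketch, not the survey's implementation: the graph and its node names are hypothetical, the graph is assumed to be undirected, unweighted, and connected, and eccentricity is returned as the raw maximum distance rather than any normalized centrality score.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest-path hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def centralities(graph):
    """Degree, closeness, and eccentricity for each node.

    Assumes a connected undirected graph with at least two nodes.
    """
    n = len(graph)
    result = {}
    for node in graph:
        dist = bfs_distances(graph, node)
        others = [d for v, d in dist.items() if v != node]
        result[node] = {
            "degree": len(graph[node]) / (n - 1),  # fraction of possible neighbors
            "closeness": (n - 1) / sum(others),    # inverse of mean distance
            "eccentricity": max(others),           # distance to the farthest node
        }
    return result


# Hypothetical interaction network as an adjacency list.
toy_graph = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B"],
    "D": ["B", "E"],
    "E": ["D"],
}

if __name__ == "__main__":
    for node, scores in centralities(toy_graph).items():
        print(node, scores)
```

In this toy graph, node B has the highest degree and closeness, which matches the intuition that it sits between the A-B-C triangle and the D-E tail.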
-
Representation range needs for 16-bit neural network training
Authors:
Valentina Popescu,
Abhinav Venigalla,
Di Wu,
Robert Schreiber
Abstract:
Deep learning has grown rapidly thanks to its state-of-the-art performance across a wide range of real-world applications. While neural networks have been trained using IEEE-754 binary32 arithmetic, the rapid growth of computational demands in deep learning has boosted interest in faster, low-precision training. Mixed-precision training that combines IEEE-754 binary16 with IEEE-754 binary32 has been tried, and other 16-bit formats, for example Google's bfloat16, have become popular. In floating-point arithmetic there is a tradeoff between precision and representation range as the number of exponent bits changes; denormal numbers extend the representation range. This raises questions of how much exponent range is needed, of whether there is a format between binary16 (5 exponent bits) and bfloat16 (8 exponent bits) that works better than either of them, and of whether or not denormals are necessary.
In the current paper we study the need for denormal numbers for mixed-precision training, and we propose a 1/6/9 format, i.e., 6-bit exponent and 9-bit explicit mantissa, that offers a better range-precision tradeoff. We show that 1/6/9 mixed-precision training is able to speed up training on hardware that incurs a performance slowdown on denormal operations or eliminates the need for denormal numbers altogether. And, for a number of fully connected and convolutional neural networks in computer vision and natural language processing, 1/6/9 achieves numerical parity to standard mixed-precision.
Submitted 6 April, 2021; v1 submitted 29 March, 2021;
originally announced March 2021.
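The range-precision tradeoff described in the abstract above can be made concrete by comparing binary16 (5 exponent bits, 10 mantissa bits), bfloat16 (8 and 7), and the proposed 1/6/9 layout. The sketch below assumes IEEE-754-style conventions for all three: a bias of 2^(e-1) - 1, the all-ones exponent reserved for infinities and NaNs, and an implicit leading bit. Whether the paper's 1/6/9 format follows exactly these conventions is an assumption, and `format_range` is a name chosen here for illustration.

```python
from typing import NamedTuple


class FormatRange(NamedTuple):
    max_normal: float    # largest finite value
    min_normal: float    # smallest positive normal value
    min_denormal: float  # smallest positive denormal value
    epsilon: float       # spacing between 1.0 and the next representable value


def format_range(exp_bits: int, man_bits: int) -> FormatRange:
    """Range and precision of a sign/exponent/mantissa float format.

    Assumes IEEE-754-style encoding (see the lead-in); not taken from the paper.
    """
    bias = 2 ** (exp_bits - 1) - 1
    max_exp = (2 ** exp_bits - 2) - bias  # top exponent code is reserved
    min_exp = 1 - bias                    # exponent code 0 holds denormals
    return FormatRange(
        max_normal=(2 - 2.0 ** -man_bits) * 2.0 ** max_exp,
        min_normal=2.0 ** min_exp,
        min_denormal=2.0 ** (min_exp - man_bits),
        epsilon=2.0 ** -man_bits,
    )


if __name__ == "__main__":
    for name, (e, m) in {"binary16": (5, 10),
                         "1/6/9": (6, 9),
                         "bfloat16": (8, 7)}.items():
        print(name, format_range(e, m))
```

Under these assumptions, moving one bit from the mantissa to the exponent roughly squares the representable range of binary16 while doubling the spacing between adjacent values, which is the tradeoff the abstract refers to.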
-
Identifying efficient controls of complex interaction networks using genetic algorithms
Authors:
Victor-Bogdan Popescu,
Krishna Kanhaiya,
Iulian Năstac,
Eugen Czeizler,
Ion Petre
Abstract:
Control theory has recently seen impactful applications in network science, especially in connection with applications in network medicine. A key topic of research is that of finding minimal external interventions that offer control over the dynamics of a given network, a problem known as network controllability. We propose in this article a new solution for this problem based on genetic algorithms. We tailor our solution for applications in computational drug repurposing, seeking to maximise its use of FDA-approved drug targets in a given disease-specific protein-protein interaction network. We show how our algorithm identifies a number of potentially efficient drugs for breast, ovarian, and pancreatic cancer. We demonstrate our algorithm on several benchmark networks from cancer medicine, social networks, electronic circuits, and several random networks with their edges distributed according to the Erdős-Rényi, the small-world, and the scale-free properties. Overall, we show that our new algorithm is more efficient in identifying relevant drug targets in a disease network, advancing the computational solutions needed for new therapeutic and drug repurposing approaches.
Submitted 9 July, 2020;
originally announced July 2020.