Gemini: A Family of Highly Capable Multimodal Models

Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.

Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

arXiv:2305.10400 [pdf, other]

What You See is What You Read? Improving Text-Image Alignment Evaluation

Authors: Michal Yarom, Yonatan Bitton, Soravit Changpinyo, Roee Aharoni, Jonathan Herzig, Oran Lang, Eran Ofek, Idan Szpektor

Abstract: Automatically determining whether a text and a corresponding image are semantically aligned is a significant challenge for vision-language models, with applications in generative text-to-image and image-to-text tasks. In this work, we study methods for automatic text-image alignment evaluation. We first introduce SeeTRUE: a comprehensive evaluation set, spanning multiple datasets from both text-to-image and image-to-text generation tasks, with human judgements for whether a given text-image pair is semantically aligned. We then describe two automatic methods to determine alignment: the first involving a pipeline based on question generation and visual question answering models, and the second employing an end-to-end classification approach by finetuning multimodal pretrained models. Both methods surpass prior approaches in various text-image alignment tasks, with significant improvements in challenging cases that involve complex composition or unnatural images. Finally, we demonstrate how our approaches can localize specific misalignments between an image and a given text, and how they can be used to automatically re-rank candidates in text-to-image generation.

Submitted 26 December, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

Comments: Accepted to NeurIPS 2023. Website: https://wysiwyr-itm.github.io/

arXiv:2301.02336 [pdf, other]

doi 10.1145/3568162.3578630

Exploring Levels of Control for a Navigation Assistant for Blind Travelers

Authors: Vinitha Ranganeni, Mike Sinclair, Eyal Ofek, Amos Miller, Jonathan Campbell, Andrey Kolobov, Edward Cutrell

Abstract: Only a small percentage of blind and low-vision people use traditional mobility aids such as a cane or a guide dog. Various assistive technologies have been proposed to address the limitations of traditional mobility aids. These devices often give either the user or the device majority of the control. In this work, we explore how varying levels of control affect the users' sense of agency, trust in the device, confidence, and successful navigation. We present Glide, a novel mobility aid with two modes for control: Glide-directed and User-directed. We employ Glide in a study (N=9) in which blind or low-vision participants used both modes to navigate through an indoor environment. Overall, participants found that Glide was easy to use and learn. Most participants trusted Glide despite its current limitations, and their confidence and performance increased as they continued to use Glide. Users' control mode preference varied in different situations; no single mode "won" in all situations.

Submitted 5 January, 2023; originally announced January 2023.

Comments: 9 pages, 6 figures, Human-Robot Interaction 2023

arXiv:2206.09038 [pdf, other]

doi 10.1145/1463434.1463464

Validation of Vector Data using Oblique Images

Authors: Pragyana Mishra, Eyal Ofek, Gur Kimchi

Abstract: Oblique images are aerial photographs taken at oblique angles to the earth's surface. Projections of vector and other geospatial data in these images depend on camera parameters, positions of the geospatial entities, surface terrain, occlusions, and visibility. This paper presents a robust and scalable algorithm to detect inconsistencies in vector data using oblique images. The algorithm uses image descriptors to encode the local appearance of a geospatial entity in images. These image descriptors combine color, pixel-intensity gradients, texture, and steerable filter responses. A Support Vector Machine classifier is trained to detect image descriptors that are not consistent with underlying vector data, digital elevation maps, building models, and camera parameters. In this paper, we train the classifier on visible road segments and non-road data. Thereafter, the trained classifier detects inconsistencies in vectors, which include both occluded and misaligned road segments. The consistent road segments validate our vector, DEM, and 3-D model data for those areas while inconsistent segments point out errors. We further show that a search for descriptors that are consistent with visible road segments in the neighborhood of a misaligned road yields the desired road alignment that is consistent with pixels in the image.

Submitted 17 June, 2022; originally announced June 2022.

Comments: In Proceedings of 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM GIS'08)

Journal ref: Proceedings of the 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM GIS '08), pp. 1-10. 2008

arXiv:2206.03189 [pdf, other]

Quantifying the Effects of Working in VR for One Week

Authors: Verena Biener, Snehanjali Kalamkar, Negar Nouri, Eyal Ofek, Michel Pahud, John J. Dudley, Jinghui Hu, Per Ola Kristensson, Maheshya Weerasinghe, Klen Čopič Pucihar, Matjaž Kljun, Stephan Streuber, Jens Grubert

Abstract: Virtual Reality (VR) provides new possibilities for modern knowledge work. However, the potential advantages of virtual work environments can only be used if it is feasible to work in them for an extended period of time. Until now, there are limited studies of long-term effects when working in VR. This paper addresses the need for understanding such long-term effects. Specifically, we report on a comparative study (n=16), in which participants were working in VR for an entire week -- for five days, eight hours each day -- as well as in a baseline physical desktop environment. This study aims to quantify the effects of exchanging a desktop-based work environment with a VR-based environment. Hence, during this study, we do not present the participants with the best possible VR system but rather a setup delivering a comparable experience to working in the physical desktop environment. The study reveals that, as expected, VR results in significantly worse ratings across most measures. Among other results, we found concerning levels of simulator sickness, below average usability ratings and two participants dropped out on the first day using VR, due to migraine, nausea and anxiety. Nevertheless, there is some indication that participants gradually overcame negative first impressions and initial discomfort. Overall, this study helps lay the groundwork for subsequent research, by clearly highlighting current shortcomings and identifying opportunities for improving the experience of working in VR.

Submitted 8 June, 2022; v1 submitted 7 June, 2022; originally announced June 2022.

Comments: 11 pages

ACM Class: I.3.7

arXiv:2201.06337 [pdf, other]

PoVRPoint: Authoring Presentations in Mobile Virtual Reality

Authors: Verena Biener, Travis Gesslein, Daniel Schneider, Felix Kawala, Alexander Otte, Per Ola Kristensson, Michel Pahud, Eyal Ofek, Cuauhtli Campos, Matjaž Kljun, Klen Čopič Pucihar, Jens Grubert

Abstract: Virtual Reality (VR) has the potential to support mobile knowledge workers by complementing traditional input devices with a large three-dimensional output space and spatial input. Previous research on supporting VR knowledge work explored domains such as text entry using physical keyboards and spreadsheet interaction using combined pen and touch input. Inspired by such work, this paper probes the VR design space for authoring presentations in mobile settings. We propose PoVRPoint -- a set of tools coupling pen- and touch-based editing of presentations on mobile devices, such as tablets, with the interaction capabilities afforded by VR. We study the utility of extended display space to, for example, assist users in identifying target slides, supporting spatial manipulation of objects on a slide, creating animations, and facilitating arrangements of multiple, possibly occluded, shapes. Among other things, our results indicate that 1) the wide field of view afforded by VR results in significantly faster target slide identification times compared to a tablet-only interface for visually salient targets; and 2) the three-dimensional view in VR enables significantly faster object reordering in the presence of occlusion compared to two baseline interfaces. A user study further confirmed that the interaction techniques were found to be usable and enjoyable.

Submitted 17 January, 2022; originally announced January 2022.

Comments: IEEE VR 2022; to appear in IEEE transactions on visualization and computer graphics, 2022

ACM Class: I.3.7

Journal ref: In IEEE transactions on visualization and computer graphics, 2022

arXiv:2111.03942 [pdf, other]

Extended Reality for Knowledge Work in Everyday Environments

Authors: Verena Biener, Eyal Ofek, Michel Pahud, Per Ola Kristensson, Jens Grubert

Abstract: Virtual and Augmented Reality have the potential to change information work. The ability to modify the workers' senses can transform everyday environments into a productive office, using portable head-mounted displays combined with conventional interaction devices, such as keyboards and tablets. While a stream of better, cheaper and lighter HMDs has been introduced for consumers in recent years, there are still many challenges to be addressed to allow this vision to become reality. This chapter summarizes the state of the art in the field of extended reality for knowledge work in everyday environments and proposes steps to address the open challenges.

Submitted 6 November, 2021; originally announced November 2021.

ACM Class: H.5.2

arXiv:2109.10607 [pdf, other]

Accuracy Evaluation of Touch Tasks in Commodity Virtual and Augmented Reality Head-Mounted Displays

Authors: Daniel Schneider, Verena Biener, Alexander Otte, Travis Gesslein, Philipp Gagel, Cuauhtli Campos, Klen Čopič Pucihar, Matjaž Kljun, Eyal Ofek, Michel Pahud, Per Ola Kristensson, Jens Grubert

Abstract: An increasing number of consumer-oriented head-mounted displays (HMD) for augmented and virtual reality (AR/VR) are capable of finger and hand tracking. We report on the accuracy of off-the-shelf VR and AR HMDs when used for touch-based tasks such as pointing or drawing. Specifically, we report on the finger tracking accuracy of the VR head-mounted displays Oculus Quest, Vive Pro and the Leap Motion controller, when attached to a VR HMD, as well as the finger tracking accuracy of the AR head-mounted displays Microsoft HoloLens 2 and Magic Leap. We present the results of two experiments in which we compare the accuracy for absolute and relative pointing tasks using both human participants and a robot. The results suggest that HTC Vive has a lower spatial accuracy than the Oculus Quest and Leap Motion and that the Microsoft HoloLens 2 provides higher spatial accuracy than Magic Leap One. These findings can serve as decision support for researchers and practitioners in choosing which systems to use in the future.

Submitted 22 September, 2021; originally announced September 2021.

Comments: To appear in SUI 2021, November 09-10, Virtual Conference

ACM Class: I.3.7

arXiv:2108.12390 [pdf]

doi 10.1145/3510463

Two-In-One: A Design Space for Mapping Unimanual Input into Bimanual Interactions in VR for Users with Limited Movement

Authors: Momona Yamagami, Sasa Junuzovic, Mar Gonzalez-Franco, Eyal Ofek, Edward Cutrell, John R. Porter, Andrew D. Wilson, Martez E. Mott

Abstract: Virtual Reality (VR) applications often require users to perform actions with two hands when performing tasks and interacting with objects in virtual environments. Although bimanual interactions in VR can resemble real-world interactions -- thus increasing realism and improving immersion -- they can also pose significant accessibility challenges to people with limited mobility, such as for people who have full use of only one hand. An opportunity exists to create accessible techniques that take advantage of users' abilities, but designers currently lack structured tools to consider alternative approaches. To begin filling this gap, we propose Two-in-One, a design space that facilitates the creation of accessible methods for bimanual interactions in VR from unimanual input. Our design space comprises two dimensions, bimanual interactions and computer assistance, and we provide a detailed examination of issues to consider when creating new unimanual input techniques that map to bimanual interactions in VR. We used our design space to create three interaction techniques that we subsequently implemented for a subset of bimanual interactions and received user feedback through a video elicitation study with 17 people with limited mobility. Our findings explore complex tradeoffs associated with autonomy and agency and highlight the need for additional settings and methods to make VR accessible to people with limited mobility.

Submitted 20 April, 2024; v1 submitted 27 August, 2021; originally announced August 2021.

Comments: 26 pages, 3 figures, 6 tables

arXiv:2108.10829 [pdf, other]

doi 10.1145/3472749.3474821

HapticBots: Distributed Encountered-type Haptics for VR with Multiple Shape-changing Mobile Robots

Authors: Ryo Suzuki, Eyal Ofek, Mike Sinclair, Daneil Leithinger, Mar Gonzalez-Franco

Abstract: HapticBots introduces a novel encountered-type haptic approach for Virtual Reality (VR) based on multiple tabletop-size shape-changing robots. These robots move on a tabletop and change their height and orientation to haptically render various surfaces and objects on-demand. Compared to previous encountered-type haptic approaches like shape displays or robotic arms, our proposed approach has an advantage in deployability, scalability, and generalizability -- these robots can be easily deployed due to their compact form factor. They can support multiple concurrent touch points in a large area thanks to the distributed nature of the robots. We propose and evaluate a novel set of interactions enabled by these robots which include: 1) rendering haptics for VR objects by providing just-in-time touch-points on the user's hand, 2) simulating continuous surfaces with the concurrent height and position change, and 3) enabling the user to pick up and move VR objects through graspable proxy objects. Finally, we demonstrate HapticBots with various applications, including remote collaboration, education and training, design and 3D modeling, and gaming and entertainment.

Submitted 24 August, 2021; originally announced August 2021.

Comments: UIST 2021

arXiv:2009.02947 [pdf, other]

Towards a Practical Virtual Office for Mobile Knowledge Workers

Authors: Eyal Ofek, Jens Grubert, Michel Pahud, Mark Phillips, Per Ola Kristensson

Abstract: As more people work from home or during travel, new opportunities and challenges arise around mobile office work. On one hand, people may work at flexible hours, independent of traffic limitations, but on the other hand, they may need to work at makeshift spaces, with less than optimal working conditions and decoupled from co-workers. Virtual Reality (VR) has the potential to change the way information workers work: it enables personal bespoke working environments even on the go and allows new collaboration approaches that can help mitigate the effects of physical distance. In this paper, we investigate opportunities and challenges for realizing mobile VR office environments and discuss implications from recent findings of mixing standard off-the-shelf equipment, such as tablets, laptops or desktops, with VR to enable effective, efficient, ergonomic, and rewarding mobile knowledge work. Further, we investigate the role of conceptual and physical spaces in a mobile VR office.

Submitted 7 September, 2020; originally announced September 2020.

Comments: https://www.microsoft.com/en-us/research/event/new-future-of-work/#!publications

ACM Class: H.5.2

Journal ref: Microsoft New Future of Work 2020 Symposium

arXiv:2009.02927 [pdf, other]

Back to the Future: Revisiting Mouse and Keyboard Interaction for HMD-based Immersive Analytics

Authors: Jens Grubert, Eyal Ofek, Michel Pahud, Per Ola Kristensson

Abstract: With the rise of natural user interfaces, immersive analytics applications often focus on novel forms of interaction modalities such as mid-air gestures, gaze or tangible interaction utilizing input devices such as depth-sensors, touch screens and eye-trackers. At the same time, traditional input devices such as the physical keyboard and mouse are used to a lesser extent. We argue that, for certain work scenarios, such as conducting analytic tasks at stationary desktop settings, it can be valuable to combine the benefits of novel and established input devices as well as input modalities to create productive immersive analytics environments.

Submitted 7 September, 2020; originally announced September 2020.

ACM Class: H.5.2

Journal ref: In ACM CHI 2020 4th Workshop on Immersive Analytics: Envisioning Future Productivity for Immersive Analytics

arXiv:2008.04559 [pdf, other]

Breaking the Screen: Interaction Across Touchscreen Boundaries in Virtual Reality for Mobile Knowledge Workers

Authors: Verena Biener, Daniel Schneider, Travis Gesslein, Alexander Otte, Bastian Kuth, Per Ola Kristensson, Eyal Ofek, Michel Pahud, Jens Grubert

Abstract: Virtual Reality (VR) has the potential to transform knowledge work. One advantage of VR knowledge work is that it allows extending 2D displays into the third dimension, enabling new operations, such as selecting overlapping objects or displaying additional layers of information. On the other hand, mobile knowledge workers often work on established mobile devices, such as tablets, limiting interaction with those devices to a small input space. This challenge of a constrained input space is intensified in situations when VR knowledge work is situated in cramped environments, such as airplanes and touchdown spaces. In this paper, we investigate the feasibility of interacting jointly between an immersive VR head-mounted display and a tablet within the context of knowledge work. Specifically, we 1) design, implement and study how to interact with information that reaches beyond a single physical touchscreen in VR; 2) design and evaluate a set of interaction concepts; and 3) build example applications and gather user feedback on those applications.

Submitted 11 August, 2020; originally announced August 2020.

Comments: 10 pages, 8 figures, ISMAR 2020

ACM Class: I.3.7

Journal ref: In IEEE transactions on visualization and computer graphics, 2020

arXiv:2008.04543 [pdf, other]

Pen-based Interaction with Spreadsheets in Mobile Virtual Reality

Authors: Travis Gesslein, Verena Biener, Philipp Gagel, Daniel Schneider, Per Ola Kristensson, Eyal Ofek, Michel Pahud, Jens Grubert

Abstract: Virtual Reality (VR) can enhance the display and interaction of mobile knowledge work and in particular, spreadsheet applications. While spreadsheets are widely used yet are challenging to interact with, especially on mobile devices, using them in VR has not been explored in depth. A distinctive feature of the domain is the contrast between the immersive and large display space afforded by VR and the very limited interaction space that may be afforded for the information worker on the go, such as an airplane seat or a small work-space. To close this gap, we present a tool-set for enhancing spreadsheet interaction on tablets using immersive VR headsets and pen-based input. This combination opens up many possibilities for enhancing the productivity for spreadsheet interaction. We propose to use the space around and in front of the tablet for enhanced visualization of spreadsheet data and meta-data, for example, extending sheet display beyond the bounds of the physical screen, or easier debugging by uncovering hidden dependencies between a sheet's cells. Combining the precise on-screen input of a pen with spatial sensing around the tablet, we propose tools for the efficient creation and editing of spreadsheet functions such as off-the-screen layered menus, visualization of sheet dependencies, and gaze-and-touch-based switching between spreadsheet tabs. We study the feasibility of the proposed tool-set using a video-based online survey and an expert-based assessment of indicative human performance potential.

Submitted 11 August, 2020; originally announced August 2020.

Comments: 10 pages, 11 figures, ISMAR 2020

ACM Class: I.3.7

Journal ref: In 2020 IEEE International Symposium on Mixed and Augmented Reality (ISMAR) 2020

arXiv:1907.08153 [pdf, other]

ReconViguRation: Reconfiguring Physical Keyboards in Virtual Reality

Authors: Daniel Schneider, Alexander Otte, Travis Gesslein, Philipp Gagel, Bastian Kuth, Mohamad Shahm Damlakhi, Oliver Dietz, Eyal Ofek, Michel Pahud, Per Ola Kristensson, Jörg Müller, Jens Grubert

Abstract: Physical keyboards are common peripherals for personal computers and are efficient standard text entry devices. Recent research has investigated how physical keyboards can be used in immersive head-mounted display-based Virtual Reality (VR). So far, the physical layout of keyboards has typically been transplanted into VR for replicating typing experiences in a standard desktop environment. In this paper, we explore how to fully leverage the immersiveness of VR to change the input and output characteristics of physical keyboard interaction within a VR environment. This allows individual physical keys to be reconfigured to the same or different actions and visual output to be distributed in various ways across the VR representation of the keyboard. We explore a set of input and output mappings for reconfiguring the virtual presentation of physical keyboards and probe the resulting design space by specifically designing, implementing and evaluating nine VR-relevant applications: emojis, languages and special characters, application shortcuts, virtual text processing macros, a window manager, a photo browser, a whack-a-mole game, secure password entry and a virtual touch bar. We investigate the feasibility of the applications in a user study with 20 participants and find that, among other things, they are usable in VR. We discuss the limitations and possibilities of remapping the input and output characteristics of physical keyboards in VR based on empirical findings and analysis and suggest future research directions in this area.

Submitted 18 July, 2019; originally announced July 2019.

Comments: to appear

ACM Class: H.5.2

Journal ref: In IEEE Transactions of Visualization and Computer Graphics (TVCG), 2019

arXiv:1812.02197 [pdf]

doi 10.1109/MCG.2018.2875609

The Office of the Future: Virtual, Portable and Global

Authors: Jens Grubert, Eyal Ofek, Michel Pahud, Per Ola Kristensson

Abstract: Virtual Reality has the potential to change the way we work. We envision the future office worker to be able to work productively everywhere solely using portable standard input devices and immersive head-mounted displays. Virtual Reality has the potential to enable this, by allowing users to create working environments of their choice and by relieving them from physical world limitations such as constrained space or noisy environments. In this article, we investigate opportunities and challenges for realizing this vision and discuss implications from recent findings of text entry in virtual reality as a core office task.

Submitted 5 December, 2018; originally announced December 2018.

arXiv:1804.03211 [pdf]

Mobiles as Portals for Interacting with Virtual Data Visualizations

Authors: Michel Pahud, Eyal Ofek, Nathalie Henry Riche, Christophe Hurter, Jens Grubert

Abstract: We propose a set of techniques leveraging mobile devices as lenses to explore, interact and annotate n-dimensional data visualizations. The democratization of mobile devices, with their arrays of integrated sensors, opens up opportunities to create experiences for anyone to explore and interact with large information spaces anywhere. In this paper, we propose to revisit ideas behind the Chameleon prototype of Fitzmaurice et al. initially envisioned in the 90s for navigation, before spatially-aware devices became mainstream. We also take advantage of other input modalities such as pen and touch to not only navigate the space using the mobile as a lens, but interact and annotate it by adding toolglasses.

Submitted 9 April, 2018; originally announced April 2018.

arXiv:1802.00626 [pdf, other]

Text Entry in Immersive Head-Mounted Display-based Virtual Reality using Standard Keyboards

Authors: Jens Grubert, Lukas Witzani, Eyal Ofek, Michel Pahud, Matthias Kranz, Per Ola Kristensson

Abstract: We study the performance and user experience of two popular mainstream text entry devices, desktop keyboards and touchscreen keyboards, for use in Virtual Reality (VR) applications. We discuss the limitations arising from limited visual feedback, and examine the efficiency of different strategies of use. We analyze a total of 24 hours of typing data in VR from 24 participants and find that novice users are able to retain about 60% of their typing speed on a desktop keyboard and about 40-45% of their typing speed on a touchscreen keyboard. We also find no significant learning effects, indicating that users can transfer their typing skills fast into VR. Besides investigating baseline performances, we study the position in which keyboards and hands are rendered in space. We find that this does not adversely affect performance for desktop keyboard typing and results in a performance trade-off for touchscreen keyboard typing.

Submitted 2 February, 2018; originally announced February 2018.

Comments: IEEE VR 2018. arXiv admin note: text overlap with arXiv:1802.00613

ACM Class: H.5.2

arXiv:1802.00613 [pdf, other]

Effects of Hand Representations for Typing in Virtual Reality

Authors: Jens Grubert, Lukas Witzani, Eyal Ofek, Michel Pahud, Matthias Kranz, Per Ola Kristensson

Abstract: Alphanumeric text entry is a challenge for Virtual Reality (VR) applications. VR enables new capabilities, impossible in the real world, such as an unobstructed view of the keyboard, without occlusion by the user's physical hands. Several hand representations have been proposed for typing in VR on standard physical keyboards. However, to date, these hand representations have not been compared regarding their performance and effects on presence for VR text entry. Our work addresses this gap by comparing existing hand representations with minimalistic fingertip visualization. We study the effects of four hand representations (no hand representation, inverse kinematic model, fingertip visualization using spheres and video inlay) on typing in VR using a standard physical keyboard with 24 participants. We found that the fingertip visualization and video inlay both resulted in statistically significantly lower text entry error rates compared to no hand or inverse kinematic model representations. We found no statistically significant differences in text entry speed.

Submitted 2 February, 2018; originally announced February 2018.

Comments: IEEE VR 2018 publication

ACM Class: H.5.2

arXiv:1701.03963 [pdf, other]

Towards Interaction Around Unmodified Camera-equipped Mobile Devices

Authors: Jens Grubert, Eyal Ofek, Michel Pahud, Matthias Kranz, Dieter Schmalstieg

Abstract: Around-device interaction promises to extend the input space of mobile and wearable devices beyond the common but restricted touchscreen. So far, most around-device interaction approaches rely on instrumenting the device or the environment with additional sensors. We believe that the full potential of ordinary cameras, specifically user-facing cameras, which are integrated in most mobile devices today, has not yet been realized. To this end, we present a novel approach for extending the input space around unmodified handheld devices using their built-in front-facing cameras. Our approach estimates hand poses and gestures through reflections in sunglasses, ski goggles or visors. Thereby, GlassHands creates an enlarged input space, rivaling input reach on large touch displays. We discuss the idea, its limitations and future work.

Submitted 14 January, 2017; originally announced January 2017.

ACM Class: H.5.2

arXiv:cs/0603084 [pdf, ps, other]

Random 3CNF formulas elude the Lovasz theta function

Authors: Uriel Feige, Eran Ofek

Abstract: Let $φ$ be a 3CNF formula with n variables and m clauses. A simple nonconstructive argument shows that when m is sufficiently large compared to n, most 3CNF formulas are not satisfiable. It is an open question whether there is an efficient refutation algorithm that for most such formulas proves that they are not satisfiable. A possible approach to refute a formula $φ$ is: first, translate it into a graph $G_φ$ using a generic reduction from 3-SAT to max-IS, then bound the maximum independent set of $G_φ$ using the Lovasz $\vartheta$ function. If the $\vartheta$ function returns a value $< m$, this is a certificate for the unsatisfiability of $φ$. We show that for random formulas with $m < n^{3/2 -o(1)}$ clauses, the above approach fails, i.e. the $\vartheta$ function is likely to return a value of m.

Submitted 22 March, 2006; originally announced March 2006.

Comments: 14 pages

Showing 1–21 of 21 results for author: Ofek, E