Search | arXiv e-print repository

Aria Everyday Activities Dataset

Authors: Zhaoyang Lv, Nicholas Charron, Pierre Moulon, Alexander Gamino, Cheng Peng, Chris Sweeney, Edward Miller, Huixuan Tang, Jeff Meissner, Jing Dong, Kiran Somasundaram, Luis Pesqueira, Mark Schwesinger, Omkar Parkhi, Qiao Gu, Renzo De Nardi, Shangyi Cheng, Steve Saarinen, Vijay Baiyya, Yuyang Zou, Richard Newcombe, Jakob Julian Engel, Xiaqing Pan, Carl Ren

Abstract: We present the Aria Everyday Activities (AEA) Dataset, an egocentric multimodal open dataset recorded using Project Aria glasses. AEA contains 143 daily activity sequences recorded by multiple wearers in five geographically diverse indoor locations. Each recording contains multimodal sensor data recorded through the Project Aria glasses. In addition, AEA provides machine perception data including high-frequency globally aligned 3D trajectories, scene point clouds, per-frame 3D eye gaze vectors and time-aligned speech transcription. In this paper, we demonstrate a few exemplar research applications enabled by this dataset, including neural scene reconstruction and prompted segmentation. AEA is an open source dataset that can be downloaded from https://www.projectaria.com/datasets/aea/. We are also providing open-source implementations and examples of how to use the dataset in Project Aria Tools https://github.com/facebookresearch/projectaria_tools.

Submitted 21 February, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

Comments: Dataset website: https://www.projectaria.com/datasets/aea/

arXiv:2308.13561 [pdf, other]

Project Aria: A New Tool for Egocentric Multi-Modal AI Research

Authors: Jakob Engel, Kiran Somasundaram, Michael Goesele, Albert Sun, Alexander Gamino, Andrew Turner, Arjang Talattof, Arnie Yuan, Bilal Souti, Brighid Meredith, Cheng Peng, Chris Sweeney, Cole Wilson, Dan Barnes, Daniel DeTone, David Caruso, Derek Valleroy, Dinesh Ginjupalli, Duncan Frost, Edward Miller, Elias Mueggler, Evgeniy Oleinik, Fan Zhang, Guruprasad Somasundaram, Gustavo Solaira, et al. (49 additional authors not shown)

Abstract: Egocentric, multi-modal data as available on future augmented reality (AR) devices provides unique challenges and opportunities for machine perception. These future devices will need to be all-day wearable in a socially acceptable form-factor to support always available, context-aware and personalized AI applications. Our team at Meta Reality Labs Research built the Aria device, an egocentric, multi-modal data recording and streaming device with the goal to foster and accelerate research in this area. In this paper, we describe the Aria device hardware including its sensor configuration and the corresponding software tools that enable recording and processing of such data.

Submitted 1 October, 2023; v1 submitted 24 August, 2023; originally announced August 2023.

arXiv:2308.13093 [pdf, other]

EgoBlur: Responsible Innovation in Aria

Authors: Nikhil Raina, Guruprasad Somasundaram, Kang Zheng, Sagar Miglani, Steve Saarinen, Jeff Meissner, Mark Schwesinger, Luis Pesqueira, Ishita Prasad, Edward Miller, Prince Gupta, Mingfei Yan, Richard Newcombe, Carl Ren, Omkar M Parkhi

Abstract: Project Aria pushes the frontiers of Egocentric AI with large-scale real-world data collection using purposely designed glasses with a privacy-first approach. To protect the privacy of bystanders being recorded by the glasses, our research protocols are designed to ensure recorded video is processed by an AI anonymization model that removes bystander faces and vehicle license plates. Detected face and license plate regions are processed with a Gaussian blur such that these personal identification information (PII) regions are obscured. This process helps to ensure that only anonymized versions of the video are retained for research purposes. In Project Aria, we have developed a state-of-the-art anonymization system, EgoBlur. In this paper, we present extensive analysis of EgoBlur on challenging datasets, comparing its performance with other state-of-the-art systems from industry and academia, including extensive Responsible AI analysis on the recently released Casual Conversations V2 dataset.

Submitted 6 September, 2023; v1 submitted 24 August, 2023; originally announced August 2023.

arXiv:2211.01677 [pdf]

doi 10.22152/programming-journal.org/2023/7/7

Little Tricky Logic: Misconceptions in the Understanding of LTL

Authors: Ben Greenman, Sam Saarinen, Tim Nelson, Shriram Krishnamurthi

Abstract: Context: Linear Temporal Logic (LTL) has been used widely in verification. Its importance and popularity have only grown with the revival of temporal logic synthesis, and with new uses of LTL in robotics and planning activities. All these uses demand that the user have a clear understanding of what an LTL specification means. Inquiry: Despite the growing use of LTL, no studies have investigated the misconceptions users actually have in understanding LTL formulas. This paper addresses the gap with a first study of LTL misconceptions. Approach: We study researchers' and learners' understanding of LTL in four rounds (three written surveys, one talk-aloud) spread across a two-year timeframe. Concretely, we decompose "understanding LTL" into three questions. A person reading a spec needs to understand what it is saying, so we study the mapping from LTL to English. A person writing a spec needs to go in the other direction, so we study English to LTL. However, misconceptions could arise from two sources: a misunderstanding of LTL's syntax or of its underlying semantics. Therefore, we also study the relationship between formulas and specific traces. Knowledge: We find several misconceptions that have consequences for learners, tool builders, and designers of new property languages. These findings are already resulting in changes to the Alloy modeling language. We also find that the English to LTL direction was the most common source of errors; unfortunately, this is the critical "authoring" direction in which a subtle mistake can lead to a faulty system. We contribute study instruments that are useful for training learners (whether academic or industrial) who are getting acquainted with LTL, and we provide a code book to assist in the analysis of responses to similar-style questions. Grounding: Our findings are grounded in the responses to our survey rounds. Round 1 used Quizius to identify misconceptions among learners in a way that reduces the threat of expert blind spots. Rounds 2 and 3 confirm that both additional learners and researchers (who work in formal methods, robotics, and related fields) make similar errors. Round 4 adds deep support for our misconceptions via talk-aloud surveys. Importance: This work provides useful answers to two critical but unexplored questions: in what ways is LTL tricky and what can be done about it? Our survey instruments can serve as a starting point for other studies.

Submitted 3 November, 2022; originally announced November 2022.

Journal ref: The Art, Science, and Engineering of Programming, 2023, Vol. 7, Issue 2, Article 7

arXiv:2206.11169 [pdf, other]

doi 10.1364/OPTICA.468590

Laser cooling a membrane-in-the-middle system close to the quantum ground state from room temperature

Authors: Sampo A. Saarinen, Nenad Kralj, Eric C. Langman, Yeghishe Tsaturyan, Albert Schliesser

Abstract: Many protocols in quantum science and technology require initializing a system in a pure quantum state. In the context of the motional state of massive resonators, this enables studying fundamental physics at the elusive quantum-classical transition, and measuring force and acceleration with enhanced sensitivity. Laser cooling has been a method of choice to prepare mechanical resonators in the quantum ground state, one of the simplest pure states. However, in order to overcome the heating and decoherence by the thermal bath, this usually has to be combined with cryogenic cooling. Here, we laser-cool an ultracoherent, soft-clamped mechanical resonator close to the quantum ground state directly from room temperature. To this end, we implement the versatile membrane-in-the-middle setup with one fiber mirror and one phononic crystal mirror, which reaches a quantum cooperativity close to unity already at room temperature. We furthermore introduce a powerful combination of coherent and measurement-based quantum control techniques, which allows us to mitigate thermal intermodulation noise. The lowest occupancy we reach is 30 phonons, limited by measurement imprecision. Doing away with the necessity for cryogenic cooling should further facilitate the spread of optomechanical quantum technologies.

Submitted 30 January, 2023; v1 submitted 22 June, 2022; originally announced June 2022.

Comments: New in v2: a) improved fitting routine led to new sideband-cooled linewidths and calibration of mechanical spectra, both of which result in amended: 1. Fig. 2; 2. feedback-cooled occupancies (lowest occupancy is 30, instead of 20); 3. imprecision level limited by excess classical noise; 4. Fig. 4. b) SI now includes new sections 1.B and 8, and expanded section 4.A. c) minor corrections throughout

Journal ref: Optica 10(3), 364-372 (2023)

arXiv:2107.05552 [pdf, other]

doi 10.1038/s41467-022-29115-9

Ground State Cooling of an Ultracoherent Electromechanical System

Authors: Yannick Seis, Thibault Capelle, Eric Langman, Sampo Saarinen, Eric Planz, Albert Schliesser

Abstract: Cavity electromechanics relies on parametric coupling between microwave and mechanical modes to manipulate the mechanical quantum state, and provide a coherent interface between different parts of hybrid quantum systems. High coherence of the mechanical mode is of key importance in such applications, in order to protect the quantum states it hosts from thermal decoherence. Here, we introduce an electromechanical system based around a soft-clamped mechanical resonator with an extremely high Q-factor ($>10^9$) held at very low (30 mK) temperatures. This ultracoherent mechanical resonator is capacitively coupled to a microwave mode, strong enough to enable ground-state-cooling of the mechanics ($\bar{n}_\mathrm{min}= 0.76\pm 0.16$). This paves the way towards exploiting the extremely long coherence times ($t_\mathrm{coh}>100$ ms) offered by such systems for quantum information processing and state conversion.

Submitted 14 February, 2022; v1 submitted 12 July, 2021; originally announced July 2021.

Journal ref: Nat Commun 13, 1507 (2022)

arXiv:1906.12260 [pdf, other]

doi 10.1038/s41598-019-54200-3

Magnetic resonance imaging with optical preamplification and detection

Authors: Anders Simonsen, Juan Diego Sanchez, Sampo Antero Saarinen, Jan Henrik Ardenkjær-Larsen, Albert Schliesser, Eugene Simon Polzik

Abstract: Magnetic resonance (MR) imaging relies on conventional electronics that is increasingly challenged by the push for stronger magnetic fields and higher channel count. These problems can be avoided by utilizing optical technologies. As a replacement for the standard low-noise preamplifier, we have implemented a new transduction principle that upconverts an MR signal to the optical domain and imaged a phantom in a clinical 3 T scanner with signal-to-noise comparable to classical induction detection.

Submitted 19 June, 2019; originally announced June 2019.

Comments: 6 pages, 4 figures

Journal ref: Sci Rep 9, 18173 (2019)

arXiv:1809.10025 [pdf, ps, other]

Personalized Education at Scale

Authors: Sam Saarinen, Evan Cater, Michael Littman

Abstract: Tailoring the presentation of information to the needs of individual students leads to massive gains in student outcomes~\cite{bloom19842}. This finding is likely due to the fact that different students learn differently, perhaps as a result of variation in ability, interest or other factors~\cite{schiefele1992interest}. Adapting presentations to the educational needs of an individual has traditionally been the domain of experts, making it expensive and logistically challenging to do at scale, and also leading to inequity in educational outcomes. Increased course sizes and large MOOC enrollments provide unprecedented access to student data. We propose that emerging technologies in reinforcement learning (RL), as well as semi-supervised learning, natural language processing, and computer vision, are critical to leveraging this data to provide personalized education at scale.

Submitted 24 September, 2018; originally announced September 2018.

Showing 1–8 of 8 results for author: Saarinen, S