-
Aria Everyday Activities Dataset
Authors:
Zhaoyang Lv,
Nicholas Charron,
Pierre Moulon,
Alexander Gamino,
Cheng Peng,
Chris Sweeney,
Edward Miller,
Huixuan Tang,
Jeff Meissner,
Jing Dong,
Kiran Somasundaram,
Luis Pesqueira,
Mark Schwesinger,
Omkar Parkhi,
Qiao Gu,
Renzo De Nardi,
Shangyi Cheng,
Steve Saarinen,
Vijay Baiyya,
Yuyang Zou,
Richard Newcombe,
Jakob Julian Engel,
Xiaqing Pan,
Carl Ren
Abstract:
We present Aria Everyday Activities (AEA) Dataset, an egocentric multimodal open dataset recorded using Project Aria glasses. AEA contains 143 daily activity sequences recorded by multiple wearers in five geographically diverse indoor locations. Each recording contains multimodal sensor data captured through the Project Aria glasses. In addition, AEA provides machine perception data including high-frequency globally aligned 3D trajectories, scene point clouds, per-frame 3D eye gaze vectors, and time-aligned speech transcription. In this paper, we demonstrate a few exemplar research applications enabled by this dataset, including neural scene reconstruction and prompted segmentation. AEA is an open source dataset that can be downloaded from https://www.projectaria.com/datasets/aea/. We are also providing open-source implementations and examples of how to use the dataset in Project Aria Tools https://github.com/facebookresearch/projectaria_tools.
Submitted 21 February, 2024; v1 submitted 20 February, 2024;
originally announced February 2024.
-
Lih Wang and Dittert Conjectures on Permanents
Authors:
Divya. K. U,
K. Somasundaram
Abstract:
Let $Ω_n$ denote the set of all doubly stochastic matrices of order $n$. Lih and Wang conjectured that for $n\geq 3$, $\operatorname{per}(tJ_n+(1-t)A)\leq t\operatorname{per}J_n+(1-t)\operatorname{per}A$ for all $A\in Ω_n$ and all $t \in [0.5,1]$, where $J_n$ is the $n \times n$ matrix with each entry equal to $\frac{1}{n}$. This conjecture was proved partially for $n \leq 5$. Let $K_n$ denote the set of non-negative $n\times n$ matrices whose elements have sum $n$. Let $φ$ be a real-valued function defined on $K_n$ by $φ(X)=\prod_{i=1}^{n}r_i+\prod_{j=1}^{n}c_j-\operatorname{per}X$ for $X\in K_n$ with row sum vector $(r_1,r_2,\ldots,r_n)$ and column sum vector $(c_1,c_2,\ldots,c_n)$. A matrix $A\in K_n$ is called a $φ$-maximizing matrix if $φ(A)\geq φ(X)$ for all $X\in K_n$. Dittert conjectured that $J_n$ is the unique $φ$-maximizing matrix on $K_n$. Sinkhorn proved the conjecture for $n=2$ and Hwang proved it for $n=3$. In this paper, we prove the Lih and Wang conjecture for $n=6$ and the Dittert conjecture for $n=4$.
Submitted 1 December, 2023;
originally announced December 2023.
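The two quantities defined in the abstract above can be evaluated numerically for small matrices. The following is a minimal sketch, not the authors' method: it computes the permanent by brute force over permutations (practical only for small $n$), and the function names `permanent`, `uniform_matrix`, `lih_wang_gap` and `dittert_phi` are hypothetical names chosen for this illustration.

```python
from itertools import permutations
from math import prod


def permanent(m):
    """Permanent of a square matrix, summed over all permutations (O(n!))."""
    n = len(m)
    return sum(prod(m[i][s[i]] for i in range(n)) for s in permutations(range(n)))


def uniform_matrix(n):
    """J_n: the n x n matrix with every entry equal to 1/n."""
    return [[1.0 / n] * n for _ in range(n)]


def lih_wang_gap(a, t):
    """t*per(J_n) + (1-t)*per(A) - per(t*J_n + (1-t)*A).

    The Lih-Wang inequality holds for this (A, t) when the gap is >= 0.
    """
    n = len(a)
    j = uniform_matrix(n)
    mix = [[t * j[r][c] + (1 - t) * a[r][c] for c in range(n)] for r in range(n)]
    return t * permanent(j) + (1 - t) * permanent(a) - permanent(mix)


def dittert_phi(x):
    """Dittert function: product of row sums + product of column sums - per(X)."""
    n = len(x)
    row_sums = [sum(row) for row in x]
    col_sums = [sum(x[r][c] for r in range(n)) for c in range(n)]
    return prod(row_sums) + prod(col_sums) - permanent(x)


identity3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(lih_wang_gap(identity3, 0.5))
print(dittert_phi(uniform_matrix(3)), dittert_phi(identity3))
```

A spot check like this only tests individual matrices; it says nothing about the conjectures in general, which require the proofs given in the paper.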
-
Every Elementary Graph is Chromatic Choosable
Authors:
Nandana K Vasudevan,
K Somasundaram,
J Geetha
Abstract:
Elementary graphs are graphs whose edges can be colored using two colors in such a way that the edges in any induced $P_3$ get distinct colors. They constitute a subclass of the class of claw-free perfect graphs. In this paper, we show that for any elementary graph, its list chromatic number and chromatic number are equal.
Submitted 1 December, 2023;
originally announced December 2023.
-
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives
Authors:
Kristen Grauman,
Andrew Westbury,
Lorenzo Torresani,
Kris Kitani,
Jitendra Malik,
Triantafyllos Afouras,
Kumar Ashutosh,
Vijay Baiyya,
Siddhant Bansal,
Bikram Boote,
Eugene Byrne,
Zach Chavis,
Joya Chen,
Feng Cheng,
Fu-Jen Chu,
Sean Crane,
Avijit Dasgupta,
Jing Dong,
Maria Escobar,
Cristhian Forigua,
Abrham Gebreselasie,
Sanjay Haresh,
Jing Huang,
Md Mohaiminul Islam,
Suyog Jain,
et al. (76 additional authors not shown)
Abstract:
We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset and benchmark challenge. Ego-Exo4D centers around simultaneously-captured egocentric and exocentric video of skilled human activities (e.g., sports, music, dance, bike repair). 740 participants from 13 cities worldwide performed these activities in 123 different natural scene contexts, yielding long-form captures from 1 to 42 minutes each and 1,286 hours of video combined. The multimodal nature of the dataset is unprecedented: the video is accompanied by multichannel audio, eye gaze, 3D point clouds, camera poses, IMU, and multiple paired language descriptions -- including a novel "expert commentary" done by coaches and teachers and tailored to the skilled-activity domain. To push the frontier of first-person video understanding of skilled human activity, we also present a suite of benchmark tasks and their annotations, including fine-grained activity understanding, proficiency estimation, cross-view translation, and 3D hand/body pose. All resources are open sourced to fuel new research in the community. Project page: http://ego-exo4d-data.org/
Submitted 29 April, 2024; v1 submitted 30 November, 2023;
originally announced November 2023.
-
FedAIoT: A Federated Learning Benchmark for Artificial Intelligence of Things
Authors:
Samiul Alam,
Tuo Zhang,
Tiantian Feng,
Hui Shen,
Zhichao Cao,
Dong Zhao,
JeongGil Ko,
Kiran Somasundaram,
Shrikanth S. Narayanan,
Salman Avestimehr,
Mi Zhang
Abstract:
There is a significant relevance of federated learning (FL) in the realm of Artificial Intelligence of Things (AIoT). However, most existing FL works do not use datasets collected from authentic IoT devices and thus do not capture unique modalities and inherent challenges of IoT data. To fill this critical gap, in this work, we introduce FedAIoT, an FL benchmark for AIoT. FedAIoT includes eight datasets collected from a wide range of IoT devices. These datasets cover unique IoT modalities and target representative applications of AIoT. FedAIoT also includes a unified end-to-end FL framework for AIoT that simplifies benchmarking the performance of the datasets. Our benchmark results shed light on the opportunities and challenges of FL for AIoT. We hope FedAIoT could serve as an invaluable resource to foster advancements in the important field of FL for AIoT. The repository of FedAIoT is maintained at https://github.com/AIoT-MLSys-Lab/FedAIoT.
Submitted 19 June, 2024; v1 submitted 29 September, 2023;
originally announced October 2023.
-
Project Aria: A New Tool for Egocentric Multi-Modal AI Research
Authors:
Jakob Engel,
Kiran Somasundaram,
Michael Goesele,
Albert Sun,
Alexander Gamino,
Andrew Turner,
Arjang Talattof,
Arnie Yuan,
Bilal Souti,
Brighid Meredith,
Cheng Peng,
Chris Sweeney,
Cole Wilson,
Dan Barnes,
Daniel DeTone,
David Caruso,
Derek Valleroy,
Dinesh Ginjupalli,
Duncan Frost,
Edward Miller,
Elias Mueggler,
Evgeniy Oleinik,
Fan Zhang,
Guruprasad Somasundaram,
Gustavo Solaira,
et al. (49 additional authors not shown)
Abstract:
Egocentric, multi-modal data as available on future augmented reality (AR) devices provides unique challenges and opportunities for machine perception. These future devices will need to be all-day wearable in a socially acceptable form-factor to support always available, context-aware and personalized AI applications. Our team at Meta Reality Labs Research built the Aria device, an egocentric, multi-modal data recording and streaming device with the goal to foster and accelerate research in this area. In this paper, we describe the Aria device hardware including its sensor configuration and the corresponding software tools that enable recording and processing of such data.
Submitted 1 October, 2023; v1 submitted 24 August, 2023;
originally announced August 2023.
-
Total Colorings of Some Classes of Four Regular Circulant Graphs
Authors:
R. Navaneeth,
J. Geetha,
K. Somasundaram,
Hung-Lin Fu
Abstract:
The total chromatic number, $χ''(G)$, is the minimum number of colors that need to be assigned to obtain a total coloring of the graph $G$. The Total Coloring Conjecture (TCC), made independently by Behzad and Vizing, states that for any graph, $χ''(G) \leq Δ(G)+2$, where $Δ(G)$ represents the maximum degree of $G$. In this paper we obtain the total chromatic number for some classes of four regular circulant graphs.
Submitted 29 October, 2021;
originally announced October 2021.
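To make the definition concrete: a total coloring assigns colors to vertices and edges so that adjacent vertices, edges sharing an endpoint, and each edge and its endpoints all receive different colors. The sketch below checks a candidate coloring on one small four regular circulant graph, $C_5(1,2)$, which is the complete graph $K_5$. The helper names and the specific coloring are assumptions made for this illustration and are not taken from the paper.

```python
from itertools import combinations


def circulant_edges(n, jumps):
    """Edges of the circulant graph C_n(jumps) as sorted vertex pairs."""
    return {tuple(sorted((v, (v + s) % n))) for v in range(n) for s in jumps}


def is_total_coloring(edges, vertex_color, edge_color):
    """Return True if the vertex and edge colors form a proper total coloring."""
    for u, v in edges:
        # Adjacent vertices must differ.
        if vertex_color[u] == vertex_color[v]:
            return False
        # An edge must differ from both of its endpoints.
        if edge_color[(u, v)] in (vertex_color[u], vertex_color[v]):
            return False
    # Edges that share an endpoint must differ.
    for e, f in combinations(edges, 2):
        if set(e) & set(f) and edge_color[e] == edge_color[f]:
            return False
    return True


n = 5
edges = circulant_edges(n, [1, 2])  # C_5(1, 2) is 4-regular and equals K_5
vertex_color = {v: v for v in range(n)}
# Illustrative coloring: edge {u, v} gets 3 * (u + v) mod 5.
edge_color = {(u, v): (3 * (u + v)) % n for u, v in edges}

colors_used = set(vertex_color.values()) | set(edge_color.values())
print(is_total_coloring(edges, vertex_color, edge_color), len(colors_used))
```

This coloring uses 5 colors on a graph with $Δ(G)=4$, which is within the $Δ(G)+2$ bound of the conjecture.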
-
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Authors:
Kristen Grauman,
Andrew Westbury,
Eugene Byrne,
Zachary Chavis,
Antonino Furnari,
Rohit Girdhar,
Jackson Hamburger,
Hao Jiang,
Miao Liu,
Xingyu Liu,
Miguel Martin,
Tushar Nagarajan,
Ilija Radosavovic,
Santhosh Kumar Ramakrishnan,
Fiona Ryan,
Jayant Sharma,
Michael Wray,
Mengmeng Xu,
Eric Zhongcong Xu,
Chen Zhao,
Siddhant Bansal,
Dhruv Batra,
Vincent Cartillier,
Sean Crane,
Tien Do,
et al. (60 additional authors not shown)
Abstract:
We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It offers 3,670 hours of daily-life activity video spanning hundreds of scenarios (household, outdoor, workplace, leisure, etc.) captured by 931 unique camera wearers from 74 worldwide locations and 9 different countries. The approach to collection is designed to uphold rigorous privacy and ethics standards with consenting participants and robust de-identification procedures where relevant. Ego4D dramatically expands the volume of diverse egocentric video footage publicly available to the research community. Portions of the video are accompanied by audio, 3D meshes of the environment, eye gaze, stereo, and/or synchronized videos from multiple egocentric cameras at the same event. Furthermore, we present a host of new benchmark challenges centered around understanding the first-person visual experience in the past (querying an episodic memory), present (analyzing hand-object manipulation, audio-visual conversation, and social interactions), and future (forecasting activities). By publicly sharing this massive annotated dataset and benchmark suite, we aim to push the frontier of first-person perception. Project page: https://ego4d-data.org/
Submitted 11 March, 2022; v1 submitted 13 October, 2021;
originally announced October 2021.
-
Total, Equitable Total and Neighborhood sum distinguishing Total Colorings of Some Classes of Circulant Graphs
Authors:
S. Prajnanaswaroopa,
J. Geetha,
K. Somasundaram
Abstract:
In this paper, we have obtained the total chromatic as well as equitable and neighborhood sum distinguishing total chromatic numbers of some classes of the circulant graphs.
Submitted 4 June, 2021; v1 submitted 26 May, 2021;
originally announced May 2021.
-
Egocentric Activity Recognition and Localization on a 3D Map
Authors:
Miao Liu,
Lingni Ma,
Kiran Somasundaram,
Yin Li,
Kristen Grauman,
James M. Rehg,
Chao Li
Abstract:
Given a video captured from a first-person perspective and the environment context of where the video is recorded, can we recognize what the person is doing and identify where the action occurs in 3D space? We address this challenging problem of jointly recognizing and localizing actions of a mobile user on a known 3D map from egocentric videos. To this end, we propose a novel deep probabilistic model. Our model takes as input a Hierarchical Volumetric Representation (HVR) of the 3D environment and an egocentric video, infers the 3D action location as a latent variable, and recognizes the action based on the video and contextual cues surrounding its potential locations. To evaluate our model, we conduct extensive experiments on a subset of the Ego4D dataset, in which both naturalistic human actions and photo-realistic 3D environment reconstructions are captured. Our method demonstrates strong results on both action recognition and 3D action localization across seen and unseen environments. We believe our work points to an exciting research direction at the intersection of egocentric vision and 3D scene understanding.
Submitted 12 August, 2022; v1 submitted 20 May, 2021;
originally announced May 2021.
-
Total Coloring of some classes of Powers of Cycles
Authors:
Prajnanaswaroopa S,
J Geetha,
K Somasundaram,
Hung-Lin Fu,
N Narayanan
Abstract:
In this paper, we have obtained the total chromatic number of some classes of Cayley graphs, odd graphs and mock threshold graphs.
Submitted 27 January, 2020; v1 submitted 9 October, 2019;
originally announced October 2019.
-
Total Colourings - A survey
Authors:
Geetha Jayabalan,
Narayanan N,
K Somasundaram
Abstract:
The smallest integer $k$ needed for the assignment of colors to the elements (vertices and edges) so that the coloring is proper is called the total chromatic number of a graph. Vizing and Behzad conjectured that the total coloring can be done using at most $Δ(G)+2$ colors, where $Δ(G)$ is the maximum degree of $G$. The conjecture is not settled even for planar graphs. In this paper we give a survey on total coloring of graphs.
Submitted 14 December, 2018;
originally announced December 2018.
-
A Partially Supervised Bayesian Image Classification Model with Applications in Diagnosis of Sentinel Lymph Node Metastases in Breast Cancer
Authors:
Ying Zhu,
Tom Fearn,
D. Wayne Chicken,
Martin R. Austwick,
Santosh K. Somasundaram,
Charles A. Mosse,
Benjamin Clark,
Irving J. Bigio,
Mohammed R. S. Keshtgar,
Stephen G. Bown
Abstract:
A method has been developed for the analysis of images of sentinel lymph nodes generated by a spectral scanning device. The aim is to classify the nodes, excised during surgery for breast cancer, as normal or metastatic. The data from one node constitute spectra at 86 wavelengths for each pixel of a 20×20 grid. For the analysis, the spectra are reduced to scores on two factors, one derived externally from a linear discriminant analysis using spectra taken manually from known normal and metastatic tissue, and one derived from the node under investigation to capture variability orthogonal to the external factor. Then a three-group mixture model (normal, metastatic, non-nodal background) using multivariate t distributions is fitted to the scores, with external data being used to specify informative prior distributions for the parameters of the three distributions. A Markov random field prior imposes smoothness on the image generated by the model. Finally, the node is classified as metastatic if any one pixel in this smoothed image is classified as metastatic. The model parameters were tuned on a training set of nodes, and then the tuned model was tested on a separate validation set of nodes, achieving satisfactory sensitivity and specificity. The aim in developing the analysis was to allow flexibility in the way each node is modelled whilst still using external information. The Bayesian framework employed is ideal for this.
Submitted 28 December, 2017;
originally announced December 2017.
-
An End-to-End System for Crowdsourced 3d Maps for Autonomous Vehicles: The Mapping Component
Authors:
Onkar Dabeer,
Radhika Gowaikar,
Slawomir K. Grzechnik,
Mythreya J. Lakshman,
Gerhard Reitmayr,
Kiran Somasundaram,
Ravi Teja Sukhavasi,
Xinzhou Wu
Abstract:
Autonomous vehicles rely on precise high definition (HD) 3d maps for navigation. This paper presents the mapping component of an end-to-end system for crowdsourcing precise 3d maps with semantically meaningful landmarks such as traffic signs (6 dof pose, shape and size) and traffic lanes (3d splines). The system uses consumer grade parts, and in particular, relies on a single front facing camera and a consumer grade GPS. Using real-time sign and lane triangulation on-device in the vehicle, with offline sign/lane clustering across multiple journeys and offline Bundle Adjustment across multiple journeys in the backend, we construct maps with mean absolute accuracy at sign corners of less than 20 cm from 25 journeys. To the best of our knowledge, this is the first end-to-end HD mapping pipeline in global coordinates in the automotive context using cost effective sensors.
Submitted 31 March, 2017; v1 submitted 29 March, 2017;
originally announced March 2017.
-
A comparative study of clusterhead selection algorithms in wireless sensor networks
Authors:
K. Ramesh,
Dr. K. Somasundaram
Abstract:
In wireless sensor networks, the lifetime of the sensor nodes is the most critical parameter. Much research on lifetime extension is motivated by the LEACH scheme, which rotates the cluster head role among the sensor nodes in an attempt to distribute energy consumption over all nodes in the network. The selection of the clusterhead for such rotation greatly affects the energy efficiency of the network. Different communication protocols and algorithms are investigated to find ways to reduce power consumption. In this paper, a brief survey of many proposals that suggest different clusterhead selection strategies is given, and a global view is presented. A comparison of their clusterhead selection costs in different rounds, their transmission methods, and other effects such as cluster formation, distribution of clusterheads and creation of clusters shows the need for a combined strategy for better results.
Submitted 8 May, 2012;
originally announced May 2012.
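The rotation of the cluster head role mentioned in the abstract above can be sketched with the threshold rule commonly given for LEACH, $T(n) = \frac{p}{1 - p\,(r \bmod \frac{1}{p})}$ for nodes that have not yet served as cluster head in the current epoch of $1/p$ rounds, and $0$ otherwise. The abstract itself does not state this formula, so the code below is an illustration under that assumption; the function names, the epoch bookkeeping and the parameter values are all hypothetical.

```python
import random


def leach_threshold(p, round_index, eligible):
    """Election threshold for one node in the given round.

    p is the desired fraction of cluster heads; eligible is False for nodes
    that already served as cluster head in the current epoch of 1/p rounds.
    """
    if not eligible:
        return 0.0
    period = round(1 / p)
    # Clamp to 1.0 to absorb floating-point error in the last round of an epoch.
    return min(1.0, p / (1 - p * (round_index % period)))


def elect_cluster_heads(node_ids, last_head_round, p, round_index, rng):
    """Pick this round's cluster heads and record them in last_head_round."""
    period = round(1 / p)
    heads = []
    for node in node_ids:
        last = last_head_round.get(node)
        # Eligible if the node has not been a head in the current epoch.
        eligible = last is None or last // period < round_index // period
        if rng.random() < leach_threshold(p, round_index, eligible):
            heads.append(node)
            last_head_round[node] = round_index
    return heads


rng = random.Random(0)
nodes = list(range(20))
last_head_round = {}
for r in range(5):  # one epoch when p = 0.2
    print(r, elect_cluster_heads(nodes, last_head_round, 0.2, r, rng))
```

Because the threshold reaches 1 in the last round of an epoch, every node serves as cluster head exactly once per epoch, which is how the scheme spreads the energy cost of the role.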
-
Improved Fair-Zone technique using Mobility Prediction in WSN
Authors:
K. Ramesh,
Dr. K. Somasundaram
Abstract:
The self-organizational ability of ad-hoc Wireless Sensor Networks (WSNs) has made them the most popular choice in ubiquitous computing. Clustering sensor nodes and organizing them hierarchically has proven to be an effective method to provide better data aggregation and scalability for the sensor network while conserving limited energy. However, this approach is limited by the energy and mobility of the nodes. In this paper we propose a mobility prediction technique that aims to overcome these problems and improve the lifetime of the network. The technique uses an Exponential Moving Average for online updates of nodal contact probability in a cluster-based network.
Submitted 8 May, 2012;
originally announced May 2012.
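An exponential moving average updates an estimate online by blending each new observation with the previous value. The sketch below applies that generic update to a contact probability; the smoothing factor `alpha`, the encoding of an observation as 1 (contact) or 0 (no contact), and the function name are assumptions for illustration, since the abstract does not specify them.

```python
def update_contact_probability(previous, contact_observed, alpha):
    """One exponential-moving-average step for a nodal contact probability.

    previous: current estimate in [0, 1]
    contact_observed: True if the two nodes were in contact in this interval
    alpha: smoothing factor in (0, 1]; larger values weight recent samples more
    """
    sample = 1.0 if contact_observed else 0.0
    return alpha * sample + (1 - alpha) * previous


estimate = 0.0
for observed in [True, True, False]:
    estimate = update_contact_probability(estimate, observed, alpha=0.5)
    print(estimate)
```

Only the previous estimate has to be stored per neighbor, which suits memory-constrained sensor nodes.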
-
Multi-Level Coding Efficiency with Improved Quality for Image Compression based on AMBTC
Authors:
K. Somasundaram,
S. Vimala
Abstract:
In this paper, we propose an extended version of Absolute Moment Block Truncation Coding (AMBTC) to compress images. Generally, the elements of a bitplane used in the variants of Block Truncation Coding (BTC) are 1 bit in size; the proposed method extends them to two bits. The number of statistical moments preserved to reconstruct the compressed image has also been raised from 2 to 4. Hence, the quality of the reconstructed images is improved significantly, from 33.62 to 38.12, with an increase in bpp of 1. The increased bpp (3) is then reduced to 1.75 in multiple levels: in level one, by dropping 4 elements of the bitplane in such a way that the pixel values of the dropped elements can easily be interpolated without much loss in quality; in level two, eight elements are dropped and reconstructed later; and in level three, the size of the statistical moments is reduced. The experiments were carried out on standard images of varying intensities. In all cases, the proposed method outperforms the existing AMBTC technique in terms of both PSNR and bpp.
Submitted 7 April, 2012;
originally announced April 2012.
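For context, the baseline that the paper extends can be sketched as follows. This is plain AMBTC with a 1-bit bitplane and two preserved values per block (the mean of the pixels below the block mean and the mean of the pixels at or above it), not the proposed two-bit, four-moment method. The function names and the rounding of the two means to integers are assumptions for this illustration.

```python
def ambtc_encode(block):
    """Encode one image block as (low_mean, high_mean, bitplane)."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    high = [p for p in pixels if p >= mean]  # never empty: the max is >= mean
    low = [p for p in pixels if p < mean]    # empty for a flat block
    high_mean = round(sum(high) / len(high))
    low_mean = round(sum(low) / len(low)) if low else high_mean
    # 1 marks pixels reconstructed with high_mean, 0 those with low_mean.
    bitplane = [[1 if p >= mean else 0 for p in row] for row in block]
    return low_mean, high_mean, bitplane


def ambtc_decode(low_mean, high_mean, bitplane):
    """Rebuild the block by replacing each bit with the matching mean."""
    return [[high_mean if bit else low_mean for bit in row] for row in bitplane]


block = [[0, 2], [4, 6]]
low_mean, high_mean, bitplane = ambtc_encode(block)
print(low_mean, high_mean, bitplane)
print(ambtc_decode(low_mean, high_mean, bitplane))
```

Assuming 4×4 blocks and 8 bits for each of the two means, this baseline costs 16 + 16 = 32 bits per 16 pixels, or 2 bpp, which is consistent with the abstract's figure of 3 bpp after an increase of 1.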