Realtime Dynamic Gaze Target Tracking and Depth-Level Estimation

Authors: Esmaeil Seraj, Harsh Bhate, Walter Talamonti

Abstract: The integration of Transparent Displays (TD) in various applications, such as Heads-Up Displays (HUDs) in vehicles, is a burgeoning field, poised to revolutionize user experiences. However, this innovation brings forth significant challenges in realtime human-device interaction, particularly in accurately identifying and tracking a user's gaze on dynamically changing TDs. In this paper, we present a two-fold robust and efficient systematic solution for realtime gaze monitoring, comprised of: (1) a tree-based algorithm for identifying and dynamically tracking gaze targets (i.e., moving, size-changing, and overlapping 2D content) projected on a transparent display, in realtime; (2) a multi-stream self-attention architecture to estimate the depth-level of human gaze from eye tracking data, to account for the display's transparency and prevent undesired interactions with the TD. We collected a real-world eye-tracking dataset to train and test our gaze monitoring system. We present extensive results and ablation studies, including inference experiments on System on Chip (SoC) evaluation boards, demonstrating our model's scalability, precision, and realtime feasibility in both static and dynamic contexts. Our solution marks a significant stride in enhancing next-generation user-device interaction and experience, setting a new benchmark for algorithmic gaze monitoring technology in dynamic transparent displays.

Submitted 9 June, 2024; originally announced June 2024.

arXiv:2403.06088 [pdf, other]

Towards In-Vehicle Multi-Task Facial Attribute Recognition: Investigating Synthetic Data and Vision Foundation Models

Authors: Esmaeil Seraj, Walter Talamonti

Abstract: In the burgeoning field of intelligent transportation systems, enhancing vehicle-driver interaction through facial attribute recognition, such as facial expression, eye gaze, age, etc., is of paramount importance for safety, personalization, and overall user experience. However, the scarcity of comprehensive large-scale, real-world datasets poses a significant challenge for training robust multi-task models. Existing literature often overlooks the potential of synthetic datasets and the comparative efficacy of state-of-the-art vision foundation models in such constrained settings. This paper addresses these gaps by investigating the utility of synthetic datasets for training complex multi-task models that recognize facial attributes of passengers of a vehicle, such as gaze plane, age, and facial expression. Utilizing transfer learning techniques with both pre-trained Vision Transformer (ViT) and Residual Network (ResNet) models, we explore various training and adaptation methods to optimize performance, particularly when data availability is limited. We provide extensive post-evaluation analysis, investigating the effects of synthetic data distributions on model performance in in-distribution data and out-of-distribution inference. Our study unveils counter-intuitive findings, notably the superior performance of ResNet over ViTs in our specific multi-task context, which is attributed to the mismatch in model complexity relative to task complexity. Our results highlight the challenges and opportunities for enhancing the use of synthetic data and vision foundation models in practical applications.

Submitted 9 March, 2024; originally announced March 2024.

Comments: Manuscript under peer review

arXiv:2212.14403 [pdf, other]

Utilizing Human Feedback for Primitive Optimization in Wheelchair Tennis

Authors: Arjun Krishna, Zulfiqar Zaidi, Letian Chen, Rohan Paleja, Esmaeil Seraj, Matthew Gombolay

Abstract: Agile robotics presents a difficult challenge with robots moving at high speeds requiring precise and low-latency sensing and control. Creating agile motion that accomplishes the task at hand while being safe to execute is a key requirement for agile robots to gain human trust. This requires designing new approaches that are flexible and maintain knowledge over world constraints. In this paper, we consider the problem of building a flexible and adaptive controller for a challenging agile mobile manipulation task of hitting ground strokes on a wheelchair tennis robot. We propose and evaluate an extension to work done on learning striking behaviors using a probabilistic movement primitive (ProMP) framework by (1) demonstrating the safe execution of learned primitives on an agile mobile manipulator setup, and (2) proposing an online primitive refinement procedure that utilizes evaluative feedback from humans on the executed trajectories.

Submitted 29 December, 2022; originally announced December 2022.

Comments: Workshop paper at Learning for Agile Robotics Workshop, CoRL 2022

arXiv:2210.02517 [pdf, other]

Athletic Mobile Manipulator System for Robotic Wheelchair Tennis

Authors: Zulfiqar Zaidi, Daniel Martin, Nathaniel Belles, Viacheslav Zakharov, Arjun Krishna, Kin Man Lee, Peter Wagstaff, Sumedh Naik, Matthew Sklar, Sugju Choi, Yoshiki Kakehi, Ruturaj Patil, Divya Mallemadugula, Florian Pesce, Peter Wilson, Wendell Hom, Matan Diamond, Bryan Zhao, Nina Moorman, Rohan Paleja, Letian Chen, Esmaeil Seraj, Matthew Gombolay

Abstract: Athletics are a quintessential and universal expression of humanity. From French monks who in the 12th century invented jeu de paume, the precursor to modern lawn tennis, back to the K'iche' people who played the Maya Ballgame as a form of religious expression over three thousand years ago, humans have sought to train their minds and bodies to excel in sporting contests. Advances in robotics are opening up the possibility of robots in sports. Yet, key challenges remain, as most prior works in robotics for sports are limited to pristine sensing environments, do not require significant force generation, or are on miniaturized scales unsuited for joint human-robot play. In this paper, we propose the first open-source, autonomous robot for playing regulation wheelchair tennis. We demonstrate the performance of our full-stack system in executing ground strokes and evaluate each of the system's hardware and software components. The goal of this paper is to (1) inspire more research in human-scale robot athletics and (2) establish the first baseline for a reproducible wheelchair tennis robot for regulation singles play. Our paper contributes to the science of systems design and poses a set of key challenges for the robotics community to address in striving towards robots that can match human capabilities in sports.

Submitted 7 February, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

Comments: 8 pages, accepted at RA-L, will also be presented at IROS 2023

arXiv:2206.10544 [pdf, other]

Multi-UAV Planning for Cooperative Wildfire Coverage and Tracking with Quality-of-Service Guarantees

Authors: Esmaeil Seraj, Andrew Silva, Matthew Gombolay

Abstract: In recent years, teams of robots and Unmanned Aerial Vehicles (UAVs) have been commissioned by researchers to enable accurate, online wildfire coverage and tracking. While the majority of prior work focuses on the coordination and control of such multi-robot systems, to date, these UAV teams have not been given the ability to reason about a fire's track (i.e., location and propagation dynamics) to provide a performance guarantee over a time horizon. Motivated by the problem of aerial wildfire monitoring, we propose a predictive framework which enables cooperation in multi-UAV teams towards collaborative field coverage and fire tracking with a probabilistic performance guarantee. Our approach enables UAVs to infer the latent fire propagation dynamics for time-extended coordination in safety-critical conditions. We derive a set of novel, analytical temporal, and tracking-error bounds to enable the UAV-team to distribute their limited resources and cover the entire fire area according to the case-specific estimated states and provide a probabilistic performance guarantee. Our results are not limited to the aerial wildfire monitoring case-study and are generally applicable to problems, such as search-and-rescue, target tracking and border patrol. We evaluate our approach in simulation and provide demonstrations of the proposed framework on a physical multi-robot testbed to account for real robot dynamics and restrictions. Our quantitative evaluations validate the performance of our method accumulating 7.5x and 9.0x smaller tracking-error than state-of-the-art model-based and reinforcement learning benchmarks, respectively.

Submitted 21 June, 2022; originally announced June 2022.

Comments: To appear in the journal of Autonomous Agents and Multi-Agent Systems (AAMAS)

arXiv:2201.08484 [pdf, other]

Iterated Reasoning with Mutual Information in Cooperative and Byzantine Decentralized Teaming

Authors: Sachin Konan, Esmaeil Seraj, Matthew Gombolay

Abstract: Information sharing is key in building team cognition and enables coordination and cooperation. High-performing human teams also benefit from acting strategically with hierarchical levels of iterated communication and rationalizability, meaning a human agent can reason about the actions of their teammates in their decision-making. Yet, the majority of prior work in Multi-Agent Reinforcement Learning (MARL) does not support iterated rationalizability and only encourages inter-agent communication, resulting in a suboptimal equilibrium cooperation strategy. In this work, we show that reformulating an agent's policy to be conditional on the policies of its neighboring teammates inherently maximizes Mutual Information (MI) lower-bound when optimizing under Policy Gradient (PG). Building on the idea of decision-making under bounded rationality and cognitive hierarchy theory, we show that our modified PG approach not only maximizes local agent rewards but also implicitly reasons about MI between agents without the need for any explicit ad-hoc regularization terms. Our approach, InfoPG, outperforms baselines in learning emergent collaborative behaviors and sets the state-of-the-art in decentralized cooperative MARL tasks. Our experiments validate the utility of InfoPG by achieving higher sample efficiency and significantly larger cumulative reward in several complex cooperative multi-agent domains.

Submitted 24 June, 2022; v1 submitted 20 January, 2022; originally announced January 2022.

Comments: The first two authors contributed equally to this work (Published in ICLR 2022)

Journal ref: International Conference on Learning Representations 2022

arXiv:2108.09568 [pdf, other]

Heterogeneous Graph Attention Networks for Learning Diverse Communication

Authors: Esmaeil Seraj, Zheyuan Wang, Rohan Paleja, Matthew Sklar, Anirudh Patel, Matthew Gombolay

Abstract: Multi-agent teaming achieves better performance when there is communication among participating agents allowing them to coordinate their actions for maximizing shared utility. However, when collaborating a team of agents with different action and observation spaces, information sharing is not straightforward and requires customized communication protocols, depending on sender and receiver types. Without properly modeling such heterogeneity in agents, communication becomes less helpful and could even deteriorate the multi-agent cooperation performance. We propose heterogeneous graph attention networks, called HetNet, to learn efficient and diverse communication models for coordinating heterogeneous agents towards accomplishing tasks that are of collaborative nature. We propose a Multi-Agent Heterogeneous Actor-Critic (MAHAC) learning paradigm to obtain collaborative per-class policies and effective communication protocols for composite robot teams. Our proposed framework is evaluated against multiple baselines in a complex environment in which agents of different types must communicate and cooperate to satisfy the objectives. Experimental results show that HetNet outperforms the baselines in learning sophisticated multi-agent communication protocols by achieving approximately 10% improvements in performance metrics.

Submitted 28 October, 2021; v1 submitted 21 August, 2021; originally announced August 2021.

arXiv:2011.00165 [pdf, other]

FireCommander: An Interactive, Probabilistic Multi-agent Environment for Heterogeneous Robot Teams

Authors: Esmaeil Seraj, Xiyang Wu, Matthew Gombolay

Abstract: The purpose of this tutorial is to help individuals use the FireCommander game environment for research applications. The FireCommander is an interactive, probabilistic joint perception-action reconnaissance environment in which a composite team of agents (e.g., robots) cooperate to fight dynamic, propagating firespots (e.g., targets). In FireCommander game, a team of agents must be tasked to optimally deal with a wildfire situation in an environment with propagating fire areas and some facilities such as houses, hospitals, power stations, etc. The team of agents can accomplish their mission by first sensing (e.g., estimating fire states), communicating the sensed fire-information among each other and then taking action to put the firespots out based on the sensed information (e.g., dropping water on estimated fire locations). The FireCommander environment can be useful for research topics spanning a wide range of applications from Reinforcement Learning (RL) and Learning from Demonstration (LfD), to Coordination, Psychology, Human-Robot Interaction (HRI) and Teaming. There are four important facets of the FireCommander environment that overall, create a non-trivial game: (1) Complex Objectives: Multi-objective Stochastic Environment, (2) Probabilistic Environment: Agents' actions result in probabilistic performance, (3) Hidden Targets: Partially Observable Environment and, (4) Uni-task Robots: Perception-only and Action-only agents. The FireCommander environment is first-of-its-kind in terms of including Perception-only and Action-only agents for coordination. It is a general multi-purpose game that can be useful in a variety of combinatorial optimization problems and stochastic games, such as applications of Reinforcement Learning (RL), Learning from Demonstration (LfD) and Inverse RL (iRL).

Submitted 27 October, 2021; v1 submitted 30 October, 2020; originally announced November 2020.

arXiv:2006.07969 [pdf, other]

Coordinated Control of UAVs for Human-Centered Active Sensing of Wildfires

Authors: Esmaeil Seraj, Matthew Gombolay

Abstract: Fighting wildfires is a precarious task, imperiling the lives of engaging firefighters and those who reside in the fire's path. Firefighters need online and dynamic observation of the firefront to anticipate a wildfire's unknown characteristics, such as size, scale, and propagation velocity, and to plan accordingly. In this paper, we propose a distributed control framework to coordinate a team of unmanned aerial vehicles (UAVs) for a human-centered active sensing of wildfires. We develop a dual-criterion objective function based on Kalman uncertainty residual propagation and weighted multi-agent consensus protocol, which enables the UAVs to actively infer the wildfire dynamics and parameters, track and monitor the fire transition, and safely manage human firefighters on the ground using acquired information. We evaluate our approach relative to prior work, showing significant improvements by reducing the environment's cumulative uncertainty residual by more than $10^2$ and $10^5$ times in firefront coverage performance to support human-robot teaming for firefighting. We also demonstrate our method on physical robots in a mock firefighting exercise.

Submitted 14 June, 2020; originally announced June 2020.

arXiv:1907.02862 [pdf, other]

Essential Motor Cortex Signal Processing: an ERP and functional connectivity MATLAB toolbox -- user guide version 2.0

Authors: Esmaeil Seraj, Karthiga Mahalingam

Abstract: The purpose of this document is to help individuals use the "Essential Motor Cortex Signal Processing MATLAB Toolbox". The toolbox implements various methods for three major aspects of investigating human motor cortex from Neuroscience view point: (1) ERP estimation and quantification, (2) Cortical Functional Connectivity analysis and (3) EMG quantification. The toolbox -- which is distributed under the terms of the GNU GENERAL PUBLIC LICENSE as a set of MATLAB routines -- can be downloaded directly at the address: http://oset.ir/category.php?dir=Tools or from the public repository on GitHub, at address below: https://github.com/EsiSeraj/ERP Connectivity EMG Analysis. The purpose of this toolbox is threefold: 1. Extract the event-related-potential (ERP) from preprocessed cerebral signals (i.e. EEG, MEG, etc.), identify and then quantify the event-related synchronization/desynchronization (ERS/ERD) events. Both time-course dynamics and time-frequency (TF) analyses are included. 2. Measure, quantify and demonstrate the cortical functional connectivity (CFC) across scalp electrodes. This set of functions can also be applied to various types of cerebral signals (i.e. electric and magnetic). 3. Quantify electromyogram (EMG) recorded from active muscles during performing motor tasks.

Submitted 22 July, 2020; v1 submitted 14 June, 2019; originally announced July 2019.

Comments: 37 pages, 12 figures
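
Note on the ERS/ERD quantification mentioned in the abstract above: a common convention expresses ERD/ERS as the percentage change of band power in an active interval relative to a reference (baseline) interval, with negative values read as desynchronization (ERD) and positive values as synchronization (ERS). The Python sketch below illustrates that convention only. It is not taken from the toolbox, which is written in MATLAB, and the function names `mean_power` and `erd_ers_percent` are hypothetical.

```python
def mean_power(samples):
    """Mean of squared samples: a simple band-power estimate.

    Assumes the samples were already band-pass filtered (an assumption
    of this sketch, not a statement about the toolbox).
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    return sum(x * x for x in samples) / len(samples)


def erd_ers_percent(reference_power, active_power):
    """Percentage power change relative to the reference interval.

    Negative result: power decrease (ERD). Positive result: power
    increase (ERS).
    """
    if reference_power <= 0:
        raise ValueError("reference power must be positive")
    return (active_power - reference_power) / reference_power * 100.0


# Hypothetical toy data: amplitude halves during the active interval,
# so power drops to a quarter of the baseline.
baseline = [2.0, -2.0, 2.0, -2.0]
active = [1.0, -1.0, 1.0, -1.0]
change = erd_ers_percent(mean_power(baseline), mean_power(active))
```

In this toy case `change` is negative, which under the convention above would be reported as an ERD.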

arXiv:1904.07441 [pdf, other]

fMRI Based Cerebral Instantaneous Parameters for Automatic Alzheimer's, Mild Cognitive Impairment and Healthy Subject Classification

Authors: Esmaeil Seraj, Mehran Yazdi, Nastaran Shahparian

Abstract: Automatic identification and categorization of Alzheimer's patients and the ability to distinguish between different levels of this disease can be very helpful to the research community in this field, since other non-automatic approaches are very time-consuming and are highly dependent on experts' experience. Herein, we propose the utility of cerebral instantaneous phase and envelope information in order to discriminate between Alzheimer's patients, MCI subjects and healthy normal individuals from functional magnetic resonance imaging (fMRI) data. To this end, after performing the region-of-interest (ROI) analysis on fMRI data, different features covering power, entropy and coherency aspects of data are derived from instantaneous phase and envelope sequences of ROI signals. Various sets of features are calculated and fed to a sequential forward floating feature selection (SFFFS) to choose the most discriminative and informative sets of features. A Student's t-test has been used to select the most relevant features from chosen sets. Finally, a K-NN classifier is used to distinguish between classes in a three-class categorization problem. The reported performance in overall accuracy using fMRI data of 111 combined subjects, is 80.1% with 80.0% Sensitivity to both Alzheimer's and Normal categories distinction and is comparable to the state-of-the-art approaches recently proposed in this regard. The significance of obtained results was statistically confirmed by evaluating through standard classification performance indicators. The obtained results illustrate that introduced analytic phase and envelope feature indexes derived from the ROI signals are significantly discriminative in distinguishing between Alzheimer's patients and Normal healthy subjects.

Submitted 15 April, 2019; originally announced April 2019.

arXiv:1903.06847 [pdf, other]

Safe Coordination of Human-Robot Firefighting Teams

Authors: Esmaeil Seraj, Andrew Silva, Matthew Gombolay

Abstract: Wildfires are destructive and inflict massive, irreversible harm to victims' lives and natural resources. Researchers have proposed commissioning unmanned aerial vehicles (UAVs) to provide firefighters with real-time tracking information; yet, these UAVs are not able to reason about a fire's track, including current location, measurement, and uncertainty, as well as propagation. We propose a model-predictive, probabilistically safe distributed control algorithm for human-robot collaboration in wildfire fighting. The proposed algorithm overcomes the limitations of prior work by explicitly estimating the latent fire propagation dynamics to enable intelligent, time-extended coordination of the UAVs in support of on-the-ground human firefighters. We derive a novel, analytical bound that enables UAVs to distribute their resources and provides a probabilistic guarantee of the humans' safety while preserving the UAVs' ability to cover an entire fire.

Submitted 15 March, 2019; originally announced March 2019.

arXiv:1612.04295 [pdf, ps, other]

Cerebral Synchrony Assessment Tutorial: A General Review on Cerebral Signals' Synchronization Estimation Concepts and Methods

Authors: Esmaeil Seraj

Abstract: The human brain is ultimately responsible for all thoughts and movements that the body produces. This allows humans to successfully interact with their environment. If the brain is not functioning properly, many human abilities can be damaged. The goal of cerebral signal analysis is to learn about brain function. The idea that distinct areas of the brain are responsible for specific tasks, the functional segregation, is a key aspect of brain function. Functional integration is an important feature of brain function; it is the concordance of multiple segregated brain areas to produce a unified response. There is an amplified feedback mechanism in the brain called reentry which requires specific timing relations. This specific timing requires neurons within an assembly to synchronize their firing rates. This has led to increased interest and use of phase variables, particularly their synchronization, to measure connectivity in cerebral signals. Herein, we propose a comprehensive review on concepts and methods previously presented for assessing cerebral synchrony, with focus on phase synchronization, as a tool for brain connectivity evaluation.

Submitted 5 July, 2018; v1 submitted 12 December, 2016; originally announced December 2016.

arXiv:1610.02249 [pdf, other]

Cerebral Signal Instantaneous Parameters Estimation MATLAB Toolbox - User Guide Version 2.3

Authors: Esmaeil Seraj

Abstract: This document is meant to help individuals use the Cerebral Signal Phase Analysis toolbox which implements different methods for estimating the instantaneous phase and frequency of a signal and calculating some related popular quantities. The toolbox -- which is distributed under the terms of the GNU GENERAL PUBLIC LICENSE as a set of MATLAB routines -- can be downloaded at the address http://oset.ir/category.php?dir=Tools. The purpose of this toolbox is to calculate the instantaneous phase and frequency sequences of cerebral signals (EEG, MEG, etc.) and some related popular features and quantities in brain studies and Neuroscience such as Phase Shift, Phase Resetting, Phase Locking Value (PLV), Phase Difference and more, to help researchers in these fields.

Submitted 5 July, 2018; v1 submitted 7 October, 2016; originally announced October 2016.
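
Note on the Phase Locking Value (PLV) named in the abstract above: PLV is commonly defined as the magnitude of the average unit phasor of the phase difference between two signals, PLV = |(1/N) Σ exp(j(φ_a[n] − φ_b[n]))|, giving 1 for a constant phase lag and values near 0 when the phase difference is spread evenly around the circle. The Python sketch below illustrates that standard definition only. It is not the toolbox's MATLAB code, it assumes the instantaneous phases (in radians) have already been estimated, and the function name `phase_locking_value` is hypothetical.

```python
import cmath
import math


def phase_locking_value(phase_a, phase_b):
    """PLV of two instantaneous-phase sequences given in radians.

    Averages the unit phasors exp(j * (phase_a - phase_b)) and returns
    the magnitude of that average, a value between 0 and 1.
    """
    if len(phase_a) != len(phase_b) or not phase_a:
        raise ValueError("phase sequences must be non-empty and equal length")
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phase_a, phase_b))
    return abs(total) / len(phase_a)


# Hypothetical toy phases: one full cycle sampled at 100 points.
phases = [2 * math.pi * k / 100 for k in range(100)]

# Constant lag of 0.5 rad: the phase difference never changes.
locked = phase_locking_value(phases, [p - 0.5 for p in phases])

# Second signal runs at twice the rate: the phase difference sweeps
# the whole circle evenly.
unlocked = phase_locking_value(phases, [2 * p for p in phases])
```

With these toy inputs `locked` comes out at 1 and `unlocked` at 0, up to floating-point error.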

Showing 1–14 of 14 results for author: Seraj, E