-
Activity-Biometrics: Person Identification from Daily Activities
Authors:
Shehreen Azad,
Yogesh Singh Rawat
Abstract:
In this work, we study a novel problem which focuses on person identification while performing daily activities. Learning biometric features from RGB videos is challenging due to spatio-temporal complexity and presence of appearance biases such as clothing color and background. We propose ABNet, a novel framework which leverages disentanglement of biometric and non-biometric features to perform effective person identification from daily activities. ABNet relies on a bias-less teacher to learn biometric features from RGB videos and explicitly disentangle non-biometric features with the help of biometric distortion. In addition, ABNet also exploits activity prior for biometrics which is enabled by joint biometric and activity learning. We perform comprehensive evaluation of the proposed approach across five different datasets which are derived from existing activity recognition benchmarks. Furthermore, we extensively compare ABNet with existing works in person identification and demonstrate its effectiveness for activity-based biometrics across all five datasets. The code and dataset can be accessed at: https://github.com/sacrcv/Activity-Biometrics/
Submitted 25 March, 2024;
originally announced March 2024.
-
RENOVI: A Benchmark Towards Remediating Norm Violations in Socio-Cultural Conversations
Authors:
Haolan Zhan,
Zhuang Li,
Xiaoxi Kang,
Tao Feng,
Yuncheng Hua,
Lizhen Qu,
Yi Ying,
Mei Rianto Chandra,
Kelly Rosalin,
Jureynolds Jureynolds,
Suraj Sharma,
Shilin Qu,
Linhao Luo,
Lay-Ki Soon,
Zhaleh Semnani Azad,
Ingrid Zukerman,
Gholamreza Haffari
Abstract:
Norm violations occur when individuals fail to conform to culturally accepted behaviors, which may lead to potential conflicts. Remediating norm violations requires social awareness and cultural sensitivity of the nuances at play. To equip interactive AI systems with a remediation ability, we offer ReNoVi - a large-scale corpus of 9,258 multi-turn dialogues annotated with social norms, as well as define a sequence of tasks to help understand and remediate norm violations step by step. ReNoVi consists of two parts: 512 human-authored dialogues (real data), and 8,746 synthetic conversations generated by ChatGPT through prompt learning. While collecting sufficient human-authored data is costly, synthetic conversations provide suitable amounts of data to help mitigate the scarcity of training data, as well as the chance to assess the alignment between LLMs and humans in the awareness of social norms. We thus harness the power of ChatGPT to generate synthetic training data for our task. To ensure the quality of both human-authored and synthetic data, we follow a quality control protocol during data collection. Our experimental results demonstrate the importance of remediating norm violations in socio-cultural conversations, as well as the improvement in performance obtained from synthetic data.
Submitted 16 February, 2024;
originally announced February 2024.
-
Let's Negotiate! A Survey of Negotiation Dialogue Systems
Authors:
Haolan Zhan,
Yufei Wang,
Tao Feng,
Yuncheng Hua,
Suraj Sharma,
Zhuang Li,
Lizhen Qu,
Zhaleh Semnani Azad,
Ingrid Zukerman,
Gholamreza Haffari
Abstract:
Negotiation is a crucial ability in human communication. Recently, there has been a resurgent research interest in negotiation dialogue systems, whose goal is to create intelligent agents that can assist people in resolving conflicts or reaching agreements. Although there have been many explorations into negotiation dialogue systems, a systematic review of this task has not been performed to date. We aim to fill this gap by investigating recent studies in the field of negotiation dialogue systems, and covering benchmarks, evaluations and methodologies within the literature. We also discuss potential future directions, including multi-modal, multi-party and cross-cultural negotiation scenarios. Our goal is to provide the community with a systematic overview of negotiation dialogue systems and to inspire future research.
Submitted 1 February, 2024;
originally announced February 2024.
-
Enhancing the security of image transmission in Quantum era: A Chaos-Assisted QKD Approach using entanglement
Authors:
Raiyan Rahman,
Md Shawmoon Azad,
Mohammed Rakibul Hasan,
Syed Emad Uddin Shubha,
M. R. C. Mahdy
Abstract:
The emergence of quantum computing has introduced unprecedented security challenges to conventional cryptographic systems, particularly in the domain of optical communications. This research addresses these challenges by innovatively combining quantum key distribution (QKD), specifically the E91 protocol, with logistic chaotic maps to establish a secure image transmission scheme. Our approach utilizes the unpredictability of chaotic systems alongside the robust security mechanisms inherent in quantum entanglement. The scheme is further fortified with an eavesdropping detection mechanism based on CHSH inequality, thereby enhancing its resilience against unauthorized access. Through quantitative simulations, we demonstrate the effectiveness of this scheme in encrypting images, achieving high entropy and sensitivity to the original images. The results indicate a significant improvement in encryption and decryption efficiency, showcasing the potential of the scheme as a viable solution against the vulnerabilities posed by quantum computing advancements. Our research offers a novel perspective in secure optical communications, blending the principles of chaos theory with QKD to create a more robust cryptographic framework.
Submitted 30 November, 2023;
originally announced November 2023.
-
Robustness Analysis on Foundational Segmentation Models
Authors:
Madeline Chantry Schiappa,
Shehreen Azad,
Sachidanand VS,
Yunhao Ge,
Ondrej Miksik,
Yogesh S. Rawat,
Vibhav Vineet
Abstract:
Due to the increase in computational resources and accessibility of data, a growing number of large deep learning models trained on copious amounts of multi-modal data using self-supervised or semi-supervised learning have emerged. These "foundation" models are often adapted to a variety of downstream tasks like classification, object detection, and segmentation with little-to-no training on the target dataset. In this work, we perform a robustness analysis of Visual Foundation Models (VFMs) for segmentation tasks and focus on robustness against real-world distribution shift inspired perturbations. We benchmark seven state-of-the-art segmentation architectures using 2 different perturbed datasets, MS COCO-P and ADE20K-P, with 17 different perturbations with 5 severity levels each. Our findings reveal several key insights: (1) VFMs exhibit vulnerabilities to compression-induced corruptions, (2) despite not outpacing all unimodal models in robustness, multimodal models show competitive resilience in zero-shot scenarios, and (3) VFMs demonstrate enhanced robustness for certain object categories. These observations suggest that our robustness evaluation framework sets new requirements for foundational models, encouraging further advancements to bolster their adaptability and performance. The code and dataset are available at: https://tinyurl.com/fm-robust.
Submitted 26 April, 2024; v1 submitted 15 June, 2023;
originally announced June 2023.
-
Probing Conceptual Understanding of Large Visual-Language Models
Authors:
Madeline Schiappa,
Raiyaan Abdullah,
Shehreen Azad,
Jared Claypoole,
Michael Cogswell,
Ajay Divakaran,
Yogesh Rawat
Abstract:
In recent years large visual-language (V+L) models have achieved great success in various downstream tasks. However, it is not well studied whether these models have a conceptual grasp of the visual content. In this work we focus on conceptual understanding of these large V+L models. To facilitate this study, we propose novel benchmarking datasets for probing three different aspects of content understanding: 1) relations, 2) composition, and 3) context. Our probes are grounded in cognitive science and help determine if a V+L model can, for example, determine if snow garnished with a man is implausible, or if it can identify beach furniture by knowing it is located on a beach. We experimented with many recent state-of-the-art V+L models and observe that these models mostly fail to demonstrate a conceptual understanding. This study reveals several interesting insights such as that cross-attention helps learning conceptual understanding, and that CNNs are better with texture and patterns, while Transformers are better at color and shape. We further utilize some of these insights and investigate a simple finetuning technique that rewards the three conceptual understanding measures with promising initial results. The proposed benchmarks will drive the community to delve deeper into conceptual understanding and foster advancements in the capabilities of large V+L models. The code and dataset are available at: https://tinyurl.com/vlm-robustness
Submitted 26 April, 2024; v1 submitted 7 April, 2023;
originally announced April 2023.
-
Safe Imitation Learning of Nonlinear Model Predictive Control for Flexible Robots
Authors:
Shamil Mamedov,
Rudolf Reiter,
Seyed Mahdi Basiri Azad,
Joschka Boedecker,
Moritz Diehl,
Jan Swevers
Abstract:
Flexible robots may overcome some of the industry's major challenges, such as enabling intrinsically safe human-robot collaboration and achieving a higher load-to-mass ratio. However, controlling flexible robots is complicated due to their complex dynamics, which include oscillatory behavior and a high-dimensional state space. NMPC offers an effective means to control such robots, but its extensive computational demands often limit its application in real-time scenarios. To enable fast control of flexible robots, we propose a framework for a safe approximation of NMPC using imitation learning and a predictive safety filter. Our framework significantly reduces computation time while incurring a slight loss in performance. Compared to NMPC, our framework shows more than an eightfold improvement in computation time when controlling a three-dimensional flexible robot arm in simulation, all while guaranteeing safety constraints. Notably, our approach outperforms conventional reinforcement learning methods. The development of fast and safe approximate NMPC holds the potential to accelerate the adoption of flexible robots in industry.
Submitted 28 September, 2023; v1 submitted 6 December, 2022;
originally announced December 2022.
-
CLUTR: Curriculum Learning via Unsupervised Task Representation Learning
Authors:
Abdus Salam Azad,
Izzeddin Gur,
Jasper Emhoff,
Nathaniel Alexis,
Aleksandra Faust,
Pieter Abbeel,
Ion Stoica
Abstract:
Reinforcement Learning (RL) algorithms are often known for sample inefficiency and difficult generalization. Recently, Unsupervised Environment Design (UED) emerged as a new paradigm for zero-shot generalization by simultaneously learning a task distribution and agent policies on the generated tasks. This is a non-stationary process where the task distribution evolves along with agent policies, creating an instability over time. While past works demonstrated the potential of such approaches, sampling effectively from the task space remains an open challenge, bottlenecking these approaches. To this end, we introduce CLUTR: a novel unsupervised curriculum learning algorithm that decouples task representation and curriculum learning into a two-stage optimization. It first trains a recurrent variational autoencoder on randomly generated tasks to learn a latent task manifold. Next, a teacher agent creates a curriculum by maximizing a minimax REGRET-based objective on a set of latent tasks sampled from this manifold. Using the fixed-pretrained task manifold, we show that CLUTR successfully overcomes the non-stationarity problem and improves stability. Our experimental results show CLUTR outperforms PAIRED, a principled and popular UED method, in the challenging CarRacing and navigation environments: achieving 10.6X and 45% improvement in zero-shot generalization, respectively. CLUTR also performs comparably to the non-UED state-of-the-art for CarRacing, while requiring 500X fewer environment interactions.
Submitted 7 March, 2023; v1 submitted 18 October, 2022;
originally announced October 2022.
-
T3VIP: Transformation-based 3D Video Prediction
Authors:
Iman Nematollahi,
Erick Rosete-Beas,
Seyed Mahdi B. Azad,
Raghu Rajan,
Frank Hutter,
Wolfram Burgard
Abstract:
For autonomous skill acquisition, robots have to learn about the physical rules governing the 3D world dynamics from their own past experience to predict and reason about plausible future outcomes. To this end, we propose a transformation-based 3D video prediction (T3VIP) approach that explicitly models the 3D motion by decomposing a scene into its object parts and predicting their corresponding rigid transformations. Our model is fully unsupervised, captures the stochastic nature of the real world, and the observational cues in image and point cloud domains constitute its learning signals. To fully leverage all the 2D and 3D observational signals, we equip our model with automatic hyperparameter optimization (HPO) to interpret the best way of learning from them. To the best of our knowledge, our model is the first generative model that provides an RGB-D video prediction of the future for a static camera. Our extensive evaluation with simulated and real-world datasets demonstrates that our formulation leads to interpretable 3D models that predict future depth videos while achieving on-par performance with 2D models on RGB video prediction. Moreover, we demonstrate that our model outperforms 2D baselines on visuomotor control. Videos, code, dataset, and pre-trained models are available at http://t3vip.cs.uni-freiburg.de.
Submitted 19 September, 2022;
originally announced September 2022.
-
Harmony Search: Current Studies and Uses on Healthcare Systems
Authors:
Maryam T. Abdulkhaleq,
Tarik A. Rashid,
Abeer Alsadoon,
Bryar A. Hassan,
Mokhtar Mohammadi,
Jaza M. Abdullah,
Amit Chhabra,
Sazan L. Ali,
Rawshan N. Othman,
Hadil A. Hasan,
Sara Azad,
Naz A. Mahmood,
Sivan S. Abdalrahman,
Hezha O. Rasul,
Nebojsa Bacanin,
S. Vimal
Abstract:
One of the popular metaheuristic search algorithms is Harmony Search (HS). It has been verified that HS can find solutions to optimization problems due to its balanced exploratory and convergence behavior and its simple and flexible structure. This capability makes the algorithm preferable to be applied in several real-world applications in various fields, including healthcare systems, different engineering fields, and computer science. The popularity of HS urges us to provide a comprehensive survey of the literature on HS and its variants on health systems, analyze its strengths and weaknesses, and suggest future research directions. In this review paper, the current studies and uses of harmony search are studied in four main domains. (i) The variants of HS, including its modifications and hybridization. (ii) Summary of the previous review works. (iii) Applications of HS in healthcare systems. (iv) And finally, an operational framework is proposed for the applications of HS in healthcare systems. The main contribution of this review is intended to provide a thorough examination of HS in healthcare systems while also serving as a valuable resource for prospective scholars who want to investigate or implement this method.
Submitted 19 July, 2022;
originally announced July 2022.
-
Scenic4RL: Programmatic Modeling and Generation of Reinforcement Learning Environments
Authors:
Abdus Salam Azad,
Edward Kim,
Qiancheng Wu,
Kimin Lee,
Ion Stoica,
Pieter Abbeel,
Sanjit A. Seshia
Abstract:
The capability of a reinforcement learning (RL) agent heavily depends on the diversity of the learning scenarios generated by the environment. Generation of diverse realistic scenarios is challenging for real-time strategy (RTS) environments. The RTS environments are characterized by intelligent entities/non-RL agents cooperating and competing with the RL agents with large state and action spaces over a long period of time, resulting in an infinite space of feasible, but not necessarily realistic, scenarios involving complex interaction among different RL and non-RL agents. Yet, most of the existing simulators rely on randomly generating the environments based on predefined settings/layouts and offer limited flexibility and control over the environment dynamics for researchers to generate diverse, realistic scenarios as per their demand. To address this issue, for the first time, we formally introduce the benefits of adopting an existing formal scenario specification language, SCENIC, to assist researchers to model and generate diverse scenarios in an RTS environment in a flexible, systematic, and programmatic manner. To showcase the benefits, we interfaced SCENIC to an existing RTS environment, the Google Research Football (GRF) simulator, and introduced a benchmark consisting of 32 realistic scenarios, encoded in SCENIC, to train RL agents and test their generalization capabilities. We also show how researchers/RL practitioners can incorporate their domain knowledge to expedite the training process by intuitively modeling stochastic programmatic policies with SCENIC.
Submitted 28 March, 2023; v1 submitted 18 June, 2021;
originally announced June 2021.
-
Hybrid Henry Gas Solubility Optimization Algorithm with Dynamic Cluster-to-Algorithm Mapping for Search-based Software Engineering Problems
Authors:
Kamal Z. Zamli,
Md. Abdul Kader,
Saiful Azad,
Bestoun S. Ahmed
Abstract:
This paper discusses a new variant of the Henry Gas Solubility Optimization (HGSO) Algorithm, called Hybrid HGSO (HHGSO). Unlike its predecessor, HHGSO allows multiple clusters serving different individual meta-heuristic algorithms (i.e., with its own defined parameters and local best) to coexist within the same population. Exploiting the dynamic cluster-to-algorithm mapping via penalized and reward model with adaptive switching factor, HHGSO offers a novel approach for meta-heuristic hybridization consisting of Jaya Algorithm, Sooty Tern Optimization Algorithm, Butterfly Optimization Algorithm, and Owl Search Algorithm, respectively. The acquired results from the selected two case studies (i.e., involving team formation problem and combinatorial test suite generation) indicate that the hybridization has notably improved the performance of HGSO and gives superior performance against other competing meta-heuristic and hyper-heuristic algorithms.
Submitted 31 May, 2021;
originally announced May 2021.
-
A Common Framework for Audience Interactivity
Authors:
Alina Striner,
Sasha Azad,
Chris Martens
Abstract:
Audience interactivity is interpreted differently across domains. This research develops a framework to describe audience interactivity across a broad range of experiences. We build on early work characterizing child audience interactivity experiences, expanding on these findings with an extensive review of literature in theater, games, and theme parks, paired with expert interviews in those domains. The framework scaffolds interactivity as nested spheres of audience influence, and comprises a series of dimensions of audience interactivity including a Spectrum of Audience Interactivity. This framework aims to develop a common taxonomy for researchers and practitioners working with audience interactivity experiences.
Submitted 27 March, 2018; v1 submitted 9 October, 2017;
originally announced October 2017.
-
Holistic Approach for Fault-Tolerant Network-on-Chip based Many-Core Systems
Authors:
Siavoosh Payandeh Azad,
Behrad Niazmand,
Jaan Raik,
Gert Jervan,
Thomas Hollstein
Abstract:
In this paper we describe a holistic approach for Fault-Tolerant Network-on-Chip (NoC) based many-core systems that incorporates a System Health Monitoring Unit (SHMU) which collects all the fault information from the system, classifies them and provides different solutions for different fault classes. A Mapper/Scheduler Unit (MSU) is used for online generation of different mapping and scheduling solutions based on the current fault configuration of the system. For detection of faults, we have leveraged concurrent online checkers, able to capture faults with low detection latency and providing the fault information for SHMU, which can be later used for the recovery process. The experimentation setup is performed in an open source tool, able to perform the mapping, scheduling and simulation of the system.
Submitted 26 January, 2016;
originally announced January 2016.