-
RILe: Reinforced Imitation Learning
Authors:
Mert Albaba,
Sammy Christen,
Christoph Gebhardt,
Thomas Langarek,
Michael J. Black,
Otmar Hilliges
Abstract:
Reinforcement Learning has achieved significant success in generating complex behavior but often requires extensive reward function engineering. Adversarial variants of Imitation Learning and Inverse Reinforcement Learning offer an alternative by learning policies from expert demonstrations via a discriminator. Employing discriminators increases their data- and computational efficiency over the st…
▽ More
Reinforcement Learning has achieved significant success in generating complex behavior but often requires extensive reward function engineering. Adversarial variants of Imitation Learning and Inverse Reinforcement Learning offer an alternative by learning policies from expert demonstrations via a discriminator. Employing discriminators increases their data- and computational efficiency over the standard approaches; however, results in sensitivity to imperfections in expert data. We propose RILe, a teacher-student system that achieves both robustness to imperfect data and efficiency. In RILe, the student learns an action policy while the teacher dynamically adjusts a reward function based on the student's performance and its alignment with expert demonstrations. By tailoring the reward function to both performance of the student and expert similarity, our system reduces dependence on the discriminator and, hence, increases robustness against data imperfections. Experiments show that RILe outperforms existing methods by 2x in settings with limited or noisy expert data.
△ Less
Submitted 12 June, 2024;
originally announced June 2024.
-
A Transformer-Based Model for the Prediction of Human Gaze Behavior on Videos
Authors:
Suleyman Ozdel,
Yao Rong,
Berat Mert Albaba,
Yen-Ling Kuo,
Xi Wang,
Enkelejda Kasneci
Abstract:
Eye-tracking applications that utilize the human gaze in video understanding tasks have become increasingly important. To effectively automate the process of video analysis based on eye-tracking data, it is important to accurately replicate human gaze behavior. However, this task presents significant challenges due to the inherent complexity and ambiguity of human gaze patterns. In this work, we i…
▽ More
Eye-tracking applications that utilize the human gaze in video understanding tasks have become increasingly important. To effectively automate the process of video analysis based on eye-tracking data, it is important to accurately replicate human gaze behavior. However, this task presents significant challenges due to the inherent complexity and ambiguity of human gaze patterns. In this work, we introduce a novel method for simulating human gaze behavior. Our approach uses a transformer-based reinforcement learning algorithm to train an agent that acts as a human observer, with the primary role of watching videos and simulating human gaze behavior. We employed an eye-tracking dataset gathered from videos generated by the VirtualHome simulator, with a primary focus on activity recognition. Our experimental results demonstrate the effectiveness of our gaze prediction method by highlighting its capability to replicate human gaze behavior and its applicability for downstream tasks where real human-gaze is used as input.
△ Less
Submitted 10 April, 2024;
originally announced April 2024.
-
Gaze-Guided Graph Neural Network for Action Anticipation Conditioned on Intention
Authors:
Suleyman Ozdel,
Yao Rong,
Berat Mert Albaba,
Yen-Ling Kuo,
Xi Wang,
Enkelejda Kasneci
Abstract:
Humans utilize their gaze to concentrate on essential information while perceiving and interpreting intentions in videos. Incorporating human gaze into computational algorithms can significantly enhance model performance in video understanding tasks. In this work, we address a challenging and innovative task in video understanding: predicting the actions of an agent in a video based on a partial v…
▽ More
Humans utilize their gaze to concentrate on essential information while perceiving and interpreting intentions in videos. Incorporating human gaze into computational algorithms can significantly enhance model performance in video understanding tasks. In this work, we address a challenging and innovative task in video understanding: predicting the actions of an agent in a video based on a partial video. We introduce the Gaze-guided Action Anticipation algorithm, which establishes a visual-semantic graph from the video input. Our method utilizes a Graph Neural Network to recognize the agent's intention and predict the action sequence to fulfill this intention. To assess the efficiency of our approach, we collect a dataset containing household activities generated in the VirtualHome environment, accompanied by human gaze data of viewing videos. Our method outperforms state-of-the-art techniques, achieving a 7\% improvement in accuracy for 18-class intention recognition. This highlights the efficiency of our method in learning important features from human gaze data.
△ Less
Submitted 10 April, 2024;
originally announced April 2024.
-
MARLUI: Multi-Agent Reinforcement Learning for Adaptive UIs
Authors:
Thomas Langerak,
Sammy Christen,
Mert Albaba,
Christoph Gebhardt,
Otmar Hilliges
Abstract:
Adaptive user interfaces (UIs) automatically change an interface to better support users' tasks. Recently, machine learning techniques have enabled the transition to more powerful and complex adaptive UIs. However, a core challenge for adaptive user interfaces is the reliance on high-quality user data that has to be collected offline for each task. We formulate UI adaptation as a multi-agent reinf…
▽ More
Adaptive user interfaces (UIs) automatically change an interface to better support users' tasks. Recently, machine learning techniques have enabled the transition to more powerful and complex adaptive UIs. However, a core challenge for adaptive user interfaces is the reliance on high-quality user data that has to be collected offline for each task. We formulate UI adaptation as a multi-agent reinforcement learning problem to overcome this challenge. In our formulation, a user agent mimics a real user and learns to interact with a UI. Simultaneously, an interface agent learns UI adaptations to maximize the user agent's performance. The interface agent learns the task structure from the user agent's behavior and, based on that, can support the user agent in completing its task. Our method produces adaptation policies that are learned in simulation only and, therefore, does not need real user data. Our experiments show that learned policies generalize to real users and achieve on par performance with data-driven supervised learning baselines.
△ Less
Submitted 27 October, 2023; v1 submitted 26 September, 2022;
originally announced September 2022.
-
SyNet: An Ensemble Network for Object Detection in UAV Images
Authors:
Berat Mert Albaba,
Sedat Ozer
Abstract:
Recent advances in camera equipped drone applications and their widespread use increased the demand on vision based object detection algorithms for aerial images. Object detection process is inherently a challenging task as a generic computer vision problem, however, since the use of object detection algorithms on UAVs (or on drones) is relatively a new area, it remains as a more challenging probl…
▽ More
Recent advances in camera equipped drone applications and their widespread use increased the demand on vision based object detection algorithms for aerial images. Object detection process is inherently a challenging task as a generic computer vision problem, however, since the use of object detection algorithms on UAVs (or on drones) is relatively a new area, it remains as a more challenging problem to detect objects in aerial images. There are several reasons for that including: (i) the lack of large drone datasets including large object variance, (ii) the large orientation and scale variance in drone images when compared to the ground images, and (iii) the difference in texture and shape features between the ground and the aerial images. Deep learning based object detection algorithms can be classified under two main categories: (a) single-stage detectors and (b) multi-stage detectors. Both single-stage and multi-stage solutions have their advantages and disadvantages over each other. However, a technique to combine the good sides of each of those solutions could yield even a stronger solution than each of those solutions individually. In this paper, we propose an ensemble network, SyNet, that combines a multi-stage method with a single-stage one with the motivation of decreasing the high false negative rate of multi-stage detectors and increasing the quality of the single-stage detector proposals. As building blocks, CenterNet and Cascade R-CNN with pretrained feature extractors are utilized along with an ensembling strategy. We report the state of the art results obtained by our proposed solution on two different datasets: namely MS-COCO and visDrone with \%52.1 $mAP_{IoU = 0.75}$ is obtained on MS-COCO $val2017$ dataset and \%26.2 $mAP_{IoU = 0.75}$ is obtained on VisDrone $test-set$.
△ Less
Submitted 23 December, 2020;
originally announced December 2020.
-
Driver Modeling through Deep Reinforcement Learning and Behavioral Game Theory
Authors:
Berat Mert Albaba,
Yildiray Yildiz
Abstract:
In this paper, a synergistic combination of deep reinforcement learning and hierarchical game theory is proposed as a modeling framework for behavioral predictions of drivers in highway driving scenarios. The need for a modeling framework that can address multiple human-human and human-automation interactions, where all the agents can be modeled as decision makers simultaneously, is the main motiv…
▽ More
In this paper, a synergistic combination of deep reinforcement learning and hierarchical game theory is proposed as a modeling framework for behavioral predictions of drivers in highway driving scenarios. The need for a modeling framework that can address multiple human-human and human-automation interactions, where all the agents can be modeled as decision makers simultaneously, is the main motivation behind this work. Such a modeling framework may be utilized for the validation and verification of autonomous vehicles: It is estimated that for an autonomous vehicle to reach the same safety level of cars with drivers, millions of miles of driving tests are required. The modeling framework presented in this paper may be used in a high-fidelity traffic simulator consisting of multiple human decision makers to reduce the time and effort spent for testing by allowing safe and quick assessment of self-driving algorithms. To demonstrate the fidelity of the proposed modeling framework, game theoretical driver models are compared with real human driver behavior patterns extracted from traffic data.
△ Less
Submitted 24 March, 2020;
originally announced March 2020.
-
Modeling Cyber-Physical Human Systems via an Interplay Between Reinforcement Learning and Game Theory
Authors:
Mert Albaba,
Yildiray Yildiz
Abstract:
Predicting the outcomes of cyber-physical systems with multiple human interactions is a challenging problem. This article reviews a game theoretical approach to address this issue, where reinforcement learning is employed to predict the time-extended interaction dynamics. We explain that the most attractive feature of the method is proposing a computationally feasible approach to simultaneously mo…
▽ More
Predicting the outcomes of cyber-physical systems with multiple human interactions is a challenging problem. This article reviews a game theoretical approach to address this issue, where reinforcement learning is employed to predict the time-extended interaction dynamics. We explain that the most attractive feature of the method is proposing a computationally feasible approach to simultaneously model multiple humans as decision makers, instead of determining the decision dynamics of the intelligent agent of interest and forcing the others to obey certain kinematic and dynamic constraints imposed by the environment. We present two recent exploitations of the method to model 1) unmanned aircraft integration into the National Airspace System and 2) highway traffic. We conclude the article by providing ongoing and future work about employing, improving and validating the method. We also provide related open problems and research opportunities.
△ Less
Submitted 11 October, 2019;
originally announced October 2019.