-
On Learning the Tail Quantiles of Driving Behavior Distributions via Quantile Regression and Flows
Authors:
Jia Yu Tee,
Oliver De Candido,
Wolfgang Utschick,
Philipp Geiger
Abstract:
Towards safe autonomous driving (AD), we consider the problem of learning models that accurately capture the diversity and tail quantiles of human driver behavior probability distributions, in interaction with an AD vehicle. Such models, which predict drivers' continuous actions from their states, are particularly relevant for closing the gap between AD agent simulations and reality. To this end, we adapt two flexible quantile learning frameworks for this setting that avoid strong distributional assumptions: (1) quantile regression (based on the tilted absolute loss), and (2) autoregressive quantile flows (a version of normalizing flows). Training is done in a behavior-cloning fashion. We use the highD dataset consisting of driver trajectories on several highways. We evaluate our approach in a one-step acceleration prediction task, and in multi-step driver simulation rollouts. We report quantitative results using the tilted absolute loss as the metric, give qualitative examples showing that realistic extremal behavior can be learned, and discuss the main insights.
Submitted 27 July, 2023; v1 submitted 22 May, 2023;
originally announced May 2023.
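The tilted absolute loss named in the abstract above is the standard quantile-regression (pinball) loss. The following is a minimal sketch of that loss only, not the authors' implementation; the function name `tilted_absolute_loss` and its arguments are hypothetical and chosen for illustration.

```python
def tilted_absolute_loss(y_true, y_pred, tau):
    """Mean tilted absolute (pinball) loss at quantile level tau.

    Hypothetical helper for illustration; not taken from the paper's code.
    y_true: observed values (e.g. accelerations), y_pred: predicted tau-quantiles.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction (diff > 0) costs tau * diff;
        # over-prediction (diff < 0) costs (1 - tau) * |diff|.
        total += max(tau * diff, (tau - 1.0) * diff)
    return total / len(y_true)
```

For a high quantile level such as tau = 0.9, under-predicting an observation is penalized nine times as heavily as over-predicting it by the same amount, which is what pushes the minimizer towards the upper tail.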
-
Fail-Safe Adversarial Generative Imitation Learning
Authors:
Philipp Geiger,
Christoph-Nikolas Straehle
Abstract:
For flexible yet safe imitation learning (IL), we propose theory and a modular method, with a safety layer that enables a closed-form probability density/gradient of the safe generative continuous policy, end-to-end generative adversarial training, and worst-case safety guarantees. The safety layer maps all actions into a set of safe actions, and uses the change-of-variables formula plus additivity of measures for the density. The set of safe actions is inferred by first checking safety of a finite sample of actions via adversarial reachability analysis of fallback maneuvers, and then concluding on the safety of these actions' neighborhoods using, e.g., Lipschitz continuity. We provide theoretical analysis showing the robustness advantage of using the safety layer already during training (imitation error linear in the horizon) compared to only using it at test time (up to quadratic error). In an experiment on real-world driver interaction data, we empirically demonstrate tractability, safety and imitation performance of our approach.
Submitted 28 July, 2023; v1 submitted 3 March, 2022;
originally announced March 2022.
-
Learning Game-Theoretic Models of Multiagent Trajectories Using Implicit Layers
Authors:
Philipp Geiger,
Christoph-Nikolas Straehle
Abstract:
For prediction of interacting agents' trajectories, we propose an end-to-end trainable architecture that hybridizes neural nets with game-theoretic reasoning, has interpretable intermediate representations, and transfers to downstream decision making. It uses a net that reveals preferences from the agents' past joint trajectory, and a differentiable implicit layer that maps these preferences to local Nash equilibria, forming the modes of the predicted future trajectory. Additionally, it learns an equilibrium refinement concept. For tractability, we introduce a new class of continuous potential games and an equilibrium-separating partition of the action space. We provide theoretical results for explicit gradients and soundness. In experiments, we evaluate our approach on two real-world data sets, where we predict highway driver merging trajectories, and on a simple decision-making transfer task.
Submitted 18 February, 2022; v1 submitted 17 August, 2020;
originally announced August 2020.
-
Causal Transfer for Imitation Learning and Decision Making under Sensor-shift
Authors:
Jalal Etesami,
Philipp Geiger
Abstract:
Learning from demonstrations (LfD) is an efficient paradigm to train AI agents. But major issues arise when there are differences between (a) the demonstrator's own sensory input, (b) our sensors that observe the demonstrator and (c) the sensory input of the agent we train. In this paper, we propose a causal model-based framework for transfer learning under such "sensor-shifts", for two common LfD tasks: (1) inferring the effect of the demonstrator's actions and (2) imitation learning. First we rigorously analyze, on the population level, to what extent the relevant underlying mechanisms (the action effects and the demonstrator policy) can be identified and transferred from the available observations together with prior knowledge of sensor characteristics. We also devise an algorithm to infer these mechanisms. Then we introduce several proxy methods which are easier to calculate, estimate from finite data and interpret than the exact solutions, alongside theoretical bounds on their closeness to the exact ones. We validate our two main methods on simulated and semi-real-world data.
Submitted 2 March, 2020;
originally announced March 2020.
-
Coordinating users of shared facilities via data-driven predictive assistants and game theory
Authors:
Philipp Geiger,
Michel Besserve,
Justus Winkelmann,
Claudius Proissl,
Bernhard Schölkopf
Abstract:
We study data-driven assistants that provide congestion forecasts to users of shared facilities (roads, cafeterias, etc.), to support coordination between them, and increase efficiency of such collective systems. Key questions are: (1) when and how much can (accurate) predictions help for coordination, and (2) which assistant algorithms reach optimal predictions?
First we lay conceptual ground for this setting where user preferences are a priori unknown and predictions influence outcomes. Addressing (1), we establish conditions under which self-fulfilling prophecies, i.e., "perfect" (probabilistic) predictions of what will happen, solve the coordination problem in the game-theoretic sense of selecting a Bayesian Nash equilibrium (BNE). Next we prove that such prophecies exist even in large-scale settings where only aggregated statistics about users are available. This entails a new (nonatomic) BNE existence result. Addressing (2), we propose two assistant algorithms that sequentially learn from users' reactions, together with optimality/convergence guarantees. We validate one of them in a large real-world experiment.
Submitted 29 July, 2021; v1 submitted 16 March, 2018;
originally announced March 2018.
-
Notes on socio-economic transparency mechanisms
Authors:
Philipp Geiger
Abstract:
Clearly, socio-economic freedom requires some extent of transparency regarding the implications of choices. In this paper, we review some established mechanisms for achieving such transparency, without any claim to completeness, and briefly discuss potential future directions. Our investigation is structured by four "challenges" under which we subsume the various requirements on, and approaches to, socio-economic transparency mechanisms. One main focus is on the inference, i.e., statistical, aspect of such mechanisms.
Submitted 15 June, 2016;
originally announced June 2016.
-
Experimental and causal view on information integration in autonomous agents
Authors:
Philipp Geiger,
Katja Hofmann,
Bernhard Schölkopf
Abstract:
The amount of digitally available but heterogeneous information about the world is remarkable, and new technologies such as self-driving cars, smart homes, or the internet of things may further increase it. In this paper we present preliminary ideas about certain aspects of the problem of how such heterogeneous information can be harnessed by autonomous agents. After discussing potentials and limitations of some existing approaches, we investigate how experiments can help to obtain a better understanding of the problem. Specifically, we present a simple agent that integrates video data from a different agent, and implement and evaluate a version of it on the novel experimentation platform Malmo. The focus of a second investigation is on how information about the hardware of different agents, the agents' sensory data, and causal information can be utilized for knowledge transfer between agents and subsequently more data-efficient decision making. Finally, we discuss potential future steps w.r.t. theory and experimentation, and formulate open questions.
Submitted 13 March, 2018; v1 submitted 14 June, 2016;
originally announced June 2016.
-
Causal inference for data-driven debugging and decision making in cloud computing
Authors:
Philipp Geiger,
Lucian Carata,
Bernhard Schoelkopf
Abstract:
Cloud computing involves complex technical and economic systems and interactions. This brings about various challenges, two of which are: (1) debugging and control to optimize the performance of computing systems, with the help of sandbox experiments, and (2) privacy-preserving prediction of the cost of "spot" resources for decision making of cloud clients. In this paper, we formalize debugging by counterfactual probabilities and control by post-(soft-)interventional probabilities. We prove that counterfactuals can approximately be calculated from a "stochastic" graphical causal model (while they are originally defined only for "deterministic" functional causal models), and, based on this, sketch a data-driven approach to address problem (1). To address problem (2), we formalize bidding by post-(soft-)interventional probabilities and present a simple mathematical result on approximate integration of "incomplete" conditional probability distributions. We show how this can be used by cloud clients to trade off privacy against predictability of the outcome of their bidding actions in a toy scenario. We report experiments on simulated and real data.
Submitted 10 March, 2020; v1 submitted 4 March, 2016;
originally announced March 2016.