
Showing 1–10 of 10 results for author: Treven, L

  1. arXiv:2406.01175

    cs.LG

    NeoRL: Efficient Exploration for Nonepisodic RL

    Authors: Bhavya Sukhija, Lenart Treven, Florian Dörfler, Stelian Coros, Andreas Krause

    Abstract: We study the problem of nonepisodic reinforcement learning (RL) for nonlinear dynamical systems, where the system dynamics are unknown and the RL agent has to learn from a single trajectory, i.e., without resets. We propose Nonepisodic Optimistic RL (NeoRL), an approach based on the principle of optimism in the face of uncertainty. NeoRL uses well-calibrated probabilistic models and plans optimist…

    Submitted 4 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  2. arXiv:2406.01163

    cs.LG

    When to Sense and Control? A Time-adaptive Approach for Continuous-Time RL

    Authors: Lenart Treven, Bhavya Sukhija, Yarden As, Florian Dörfler, Andreas Krause

    Abstract: Reinforcement learning (RL) excels in optimizing policies for discrete-time Markov decision processes (MDP). However, various systems are inherently continuous in time, making discrete-time MDPs an inexact modeling choice. In many applications, such as greenhouse control or medical treatments, each interaction (measurement or switching of action) involves manual intervention and thus is inherently…

    Submitted 4 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  3. arXiv:2403.16644

    cs.RO cs.LG

    Bridging the Sim-to-Real Gap with Bayesian Inference

    Authors: Jonas Rothfuss, Bhavya Sukhija, Lenart Treven, Florian Dörfler, Stelian Coros, Andreas Krause

    Abstract: We present SIM-FSVGD for learning robot dynamics from data. As opposed to traditional methods, SIM-FSVGD leverages low-fidelity physical priors, e.g., in the form of simulators, to regularize the training of neural network models. While learning accurate dynamics already in the low data regime, SIM-FSVGD scales and excels also when more data is available. We empirically show that learning with imp…

    Submitted 25 March, 2024; originally announced March 2024.

  4. arXiv:2402.15898

    cs.LG cs.AI

    Transductive Active Learning: Theory and Applications

    Authors: Jonas Hübotter, Bhavya Sukhija, Lenart Treven, Yarden As, Andreas Krause

    Abstract: We generalize active learning to address real-world settings with concrete prediction targets where sampling is restricted to an accessible region of the domain, while prediction targets may lie outside this region. We analyze a family of decision rules that sample adaptively to minimize uncertainty about prediction targets. We are the first to show, under general regularity assumptions, that such…

    Submitted 22 May, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2402.15441

  5. arXiv:2402.15441

    cs.LG cs.AI

    Active Few-Shot Fine-Tuning

    Authors: Jonas Hübotter, Bhavya Sukhija, Lenart Treven, Yarden As, Andreas Krause

    Abstract: We study the question: How can we select the right data for fine-tuning to a specific task? We call this data selection problem active fine-tuning and show that it is an instance of transductive active learning, a novel generalization of classical active learning. We propose ITL, short for information-based transductive learning, an approach which samples adaptively to maximize information gained…

    Submitted 21 June, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

  6. arXiv:2310.19848

    cs.LG cs.RO math.OC

    Efficient Exploration in Continuous-time Model-based Reinforcement Learning

    Authors: Lenart Treven, Jonas Hübotter, Bhavya Sukhija, Florian Dörfler, Andreas Krause

    Abstract: Reinforcement learning algorithms typically consider discrete-time dynamics, even though the underlying systems are often continuous in time. In this paper, we introduce a model-based reinforcement learning algorithm that represents continuous-time dynamics using nonlinear ordinary differential equations (ODEs). We capture epistemic uncertainty using well-calibrated probabilistic models, and use t…

    Submitted 30 October, 2023; originally announced October 2023.

  7. arXiv:2306.12371

    cs.LG cs.RO eess.SY

    Optimistic Active Exploration of Dynamical Systems

    Authors: Bhavya Sukhija, Lenart Treven, Cansu Sancaktar, Sebastian Blaes, Stelian Coros, Andreas Krause

    Abstract: Reinforcement learning algorithms commonly seek to optimize policies for solving one particular task. How should we explore an unknown dynamical system such that the estimated model globally approximates the dynamics and allows us to solve multiple downstream tasks in a zero-shot manner? In this paper, we address this challenge, by developing an algorithm -- OPAX -- for active exploration. OPAX us…

    Submitted 30 October, 2023; v1 submitted 21 June, 2023; originally announced June 2023.

  8. arXiv:2106.11609

    cs.LG math.DS stat.ML

    Distributional Gradient Matching for Learning Uncertain Neural Dynamics Models

    Authors: Lenart Treven, Philippe Wenk, Florian Dörfler, Andreas Krause

    Abstract: Differential equations in general and neural ODEs in particular are an essential technique in continuous-time system identification. While many deterministic learning algorithms have been designed based on numerical integration via the adjoint method, many downstream tasks such as active learning, exploration in reinforcement learning, robust control, or filtering require accurate estimates of pre…

    Submitted 15 October, 2021; v1 submitted 22 June, 2021; originally announced June 2021.

    Comments: Published at NeurIPS 2021

    Journal ref: Advances in Neural Information Processing Systems, 2021

  9. arXiv:2009.03091

    cs.LG cs.IR stat.ML

    Iterative Correction of Sensor Degradation and a Bayesian Multi-Sensor Data Fusion Method

    Authors: Luka Kolar, Rok Šikonja, Lenart Treven

    Abstract: We present a novel method for inferring ground-truth signal from multiple degraded signals, affected by different amounts of sensor exposure. The algorithm learns a multiplicative degradation effect by performing iterative corrections of two signals solely from the ratio between them. The degradation function d should be continuous, satisfy monotonicity, and d(0) = 1. We use smoothed monotonic reg…

    Submitted 7 September, 2020; originally announced September 2020.

  10. arXiv:2006.11022

    eess.SY cs.LG cs.RO

    Learning Stabilizing Controllers for Unstable Linear Quadratic Regulators from a Single Trajectory

    Authors: Lenart Treven, Sebastian Curi, Mojmir Mutny, Andreas Krause

    Abstract: The principal task to control dynamical systems is to ensure their stability. When the system is unknown, robust approaches are promising since they aim to stabilize a large set of plausible systems simultaneously. We study linear controllers under quadratic costs model also known as linear quadratic regulators (LQR). We present two different semi-definite programs (SDP) which results in a control…

    Submitted 23 November, 2020; v1 submitted 19 June, 2020; originally announced June 2020.