-
Inverse Reinforcement Learning via Matching of Optimality Profiles
Authors:
Luis Haug,
Ivan Ovinnikov,
Eugene Bykovets
Abstract:
The goal of inverse reinforcement learning (IRL) is to infer a reward function that explains the behavior of an agent performing a task. The assumption that most approaches make is that the demonstrated behavior is near-optimal. In many real-world scenarios, however, examples of truly optimal behavior are scarce, and it is desirable to effectively leverage sets of demonstrations of suboptimal or heterogeneous performance, which are easier to obtain. We propose an algorithm that learns a reward function from such demonstrations together with a weak supervision signal in the form of a distribution over rewards collected during the demonstrations (or, more generally, a distribution over cumulative discounted future rewards). We view such distributions, which we also refer to as optimality profiles, as summaries of the degree of optimality of the demonstrations that may, for example, reflect the opinion of a human expert. Given an optimality profile and a small amount of additional supervision, our algorithm fits a reward function, modeled as a neural network, by essentially minimizing the Wasserstein distance between the corresponding induced distribution and the optimality profile. We show that our method is capable of learning reward functions such that policies trained to optimize them outperform the demonstrations used for fitting the reward functions.
Submitted 19 November, 2020; v1 submitted 18 November, 2020;
originally announced November 2020.
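A minimal sketch of the quantity that the abstract above says the algorithm essentially minimizes: the Wasserstein distance between the distribution of returns induced by a reward function and a target optimality profile. It is an illustration only, not the authors' implementation. The function names `discounted_return` and `wasserstein_1d`, the toy trajectories, and the restriction to equal-size one-dimensional samples are all assumptions made here.

```python
def discounted_return(rewards, gamma=0.99):
    """Cumulative discounted reward of one trajectory (hypothetical helper)."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def wasserstein_1d(samples_a, samples_b):
    """1-Wasserstein distance between two equal-size empirical
    distributions on the real line.

    In one dimension the optimal coupling pairs sorted samples,
    so the distance is the mean absolute difference of order statistics.
    """
    if not samples_a or len(samples_a) != len(samples_b):
        raise ValueError("need two non-empty samples of equal size")
    a, b = sorted(samples_a), sorted(samples_b)
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Toy data (assumed): per-step rewards that a candidate reward function
# assigns to three demonstrations, and a target optimality profile.
induced = [discounted_return(t, gamma=0.5) for t in ([1, 1, 1], [0, 1, 0], [0, 0, 0])]
profile = [2.0, 0.5, 0.0]
loss = wasserstein_1d(induced, profile)
```

In a learning setup, `loss` would be minimized with respect to the parameters of the reward model. The sketch only evaluates it for fixed numbers.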
-
Semantic Segmentation of Histopathological Slides for the Classification of Cutaneous Lymphoma and Eczema
Authors:
Jérémy Scheurer,
Claudio Ferrari,
Luis Berenguer Todo Bom,
Michaela Beer,
Werner Kempf,
Luis Haug
Abstract:
Mycosis fungoides (MF) is a rare, potentially life-threatening skin disease which, in its early stages, strongly resembles eczema, a very common and benign skin condition, both clinically and histologically. To increase the survival rate, the appropriate treatment must be provided early on. To this end, one crucial step for specialists is the evaluation of histopathological slides (glass slides), or Whole Slide Images (WSI), of the patients' skin tissue. We introduce a deep-learning-aided diagnostic tool that adds two-fold value to the decision process of pathologists. First, our algorithm accurately segments WSI into regions that are relevant for a diagnosis, achieving a Mean-IoU of 69% and a Matthews Correlation score of 83% on a novel dataset. We also show that our model is competitive with the state of the art on a reference dataset. Second, using the segmentation map and the original image, we are able to predict whether a patient has MF or eczema. We created two models that can be applied at different stages of the diagnostic pipeline, potentially eliminating life-threatening mistakes. The classification outcome is considerably more interpretable than when only the WSI is used as input, since it is also based on the segmentation map. Our segmentation model, which we call EU-Net, extends a classical U-Net with an EfficientNet-B7 encoder that was pre-trained on the ImageNet dataset.
Submitted 10 September, 2020;
originally announced September 2020.
-
Understanding the Power and Limitations of Teaching with Imperfect Knowledge
Authors:
Rati Devidze,
Farnam Mansouri,
Luis Haug,
Yuxin Chen,
Adish Singla
Abstract:
Machine teaching studies the interaction between a teacher and a student/learner where the teacher selects training examples for the learner to learn a specific task. The typical assumption is that the teacher has perfect knowledge of the task: this knowledge comprises knowing the desired learning target, having the exact task representation used by the learner, and knowing the parameters capturing the learning dynamics of the learner. Inspired by real-world applications of machine teaching in education, we consider the setting where the teacher's knowledge is limited and noisy, and the key research question we study is the following: When does a teacher succeed or fail in effectively teaching a learner using its imperfect knowledge? We answer this question by showing connections to how imperfect knowledge affects the teacher's solution of the corresponding machine teaching problem when constructing optimal teaching sets. Our results have important implications for designing robust teaching algorithms for real-world applications.
Submitted 21 March, 2020;
originally announced March 2020.
-
Learner-aware Teaching: Inverse Reinforcement Learning with Preferences and Constraints
Authors:
Sebastian Tschiatschek,
Ahana Ghosh,
Luis Haug,
Rati Devidze,
Adish Singla
Abstract:
Inverse reinforcement learning (IRL) enables an agent to learn complex behavior by observing demonstrations from a (near-)optimal policy. The typical assumption is that the learner's goal is to match the teacher's demonstrated behavior. In this paper, we consider the setting where the learner has its own preferences that it additionally takes into consideration. These preferences can, for example, capture behavioral biases, mismatched worldviews, or physical constraints. We study two teaching approaches: learner-agnostic teaching, where the teacher provides demonstrations from an optimal policy ignoring the learner's preferences, and learner-aware teaching, where the teacher accounts for the learner's preferences. We design learner-aware teaching algorithms and show that significant performance improvements can be achieved over learner-agnostic teaching.
Submitted 29 October, 2019; v1 submitted 2 June, 2019;
originally announced June 2019.
-
Teaching Inverse Reinforcement Learners via Features and Demonstrations
Authors:
Luis Haug,
Sebastian Tschiatschek,
Adish Singla
Abstract:
Learning near-optimal behaviour from an expert's demonstrations typically relies on the assumption that the learner knows the features that the true reward function depends on. In this paper, we study the problem of learning from demonstrations in the setting where this is not the case, i.e., where there is a mismatch between the worldviews of the learner and the expert. We introduce a natural quantity, the teaching risk, which measures the potential suboptimality of policies that look optimal to the learner in this setting. We show that bounds on the teaching risk guarantee that the learner is able to find a near-optimal policy using standard algorithms based on inverse reinforcement learning. Based on these findings, we suggest a teaching scheme in which the expert can decrease the teaching risk by updating the learner's worldview, and thus ultimately enable her to find a near-optimal policy.
Submitted 27 March, 2019; v1 submitted 21 October, 2018;
originally announced October 2018.
-
Lagrangian antisurgery
Authors:
Luis Haug
Abstract:
We describe an operation which modifies a Lagrangian submanifold $L$ in a symplectic manifold $(M, ω)$ so as to produce a new immersed Lagrangian submanifold $L'$, which as a smooth manifold is obtained by surgery along a framed sphere in $L$. Intuitively, this can be described as collapsing an isotropic disc with boundary on $L$ to a point. The inverse operation generalizes classical Lagrangian surgery. We also describe corresponding immersed Lagrangian cobordisms between $L$ and $L'$. After removal of their singular locus, we obtain examples of embedded Lagrangian cobordisms with precisely two ends. As an application, we use this construction to produce interesting examples of Lagrangian cobordisms between Clifford and Chekanov tori.
Submitted 20 January, 2021; v1 submitted 16 November, 2015;
originally announced November 2015.
-
The Lagrangian cobordism group of $T^2$
Authors:
Luis Haug
Abstract:
We compute the Lagrangian cobordism group of the standard symplectic 2-torus and prove that it is isomorphic to the Grothendieck group of its derived Fukaya category. The proofs use homological mirror symmetry for the 2-torus.
Submitted 25 November, 2014; v1 submitted 30 October, 2013;
originally announced October 2013.
-
On the Quantum Homology of Real Lagrangians in Fano Toric Manifolds
Authors:
Luis Haug
Abstract:
We study the Lagrangian quantum homology of real parts of Fano toric manifolds of minimal Chern number at least 2, using coefficients in a ring of Laurent polynomials over Z/2Z. We show that these Lagrangians are wide, in the sense that their quantum homology is isomorphic as a module to their classical homology tensored with this ring. Moreover, we show that the quantum homology is isomorphic as a ring to the quantum homology of the ambient symplectic manifold.
Submitted 30 October, 2013; v1 submitted 29 August, 2011;
originally announced August 2011.