Showing 1–5 of 5 results for author: Wüst, A

  1. arXiv:2406.16748  [pdf, other]

    cs.LG cs.CL

    OCALM: Object-Centric Assessment with Language Models

    Authors: Timo Kaufmann, Jannis Blüml, Antonia Wüst, Quentin Delfosse, Kristian Kersting, Eyke Hüllermeier

    Abstract: Properly defining a reward signal to efficiently train a reinforcement learning (RL) agent is a challenging task. Designing balanced objective functions from which a desired behavior can emerge requires expert knowledge, especially for complex environments. Learning rewards from human feedback or using large language models (LLMs) to directly provide rewards are promising alternatives, allowing no…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Accepted at the RLBRew Workshop at RLC 2024

  2. arXiv:2406.09949  [pdf, other]

    cs.AI cs.LG cs.SC

    Neural Concept Binder

    Authors: Wolfgang Stammer, Antonia Wüst, David Steinmann, Kristian Kersting

    Abstract: The challenge in object-based visual reasoning lies in generating descriptive yet distinct concept representations. Moreover, doing this in an unsupervised fashion requires human users to understand a model's learned concepts and potentially revise false concepts. In addressing this challenge, we introduce the Neural Concept Binder, a new framework for deriving discrete concept representations res…

    Submitted 14 June, 2024; originally announced June 2024.

  3. arXiv:2402.12921  [pdf, other]

    cs.LG

    Right on Time: Revising Time Series Models by Constraining their Explanations

    Authors: Maurice Kraus, David Steinmann, Antonia Wüst, Andre Kokozinski, Kristian Kersting

    Abstract: The reliability of deep time series models is often compromised by their tendency to rely on confounding factors, which may lead to incorrect outputs. Our newly recorded, naturally confounded dataset named P2S from a real mechanical production line emphasizes this. To avoid "Clever-Hans" moments in time series, i.e., to mitigate confounders, we introduce the method Right on Time (RioT). RioT enabl…

    Submitted 19 June, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

  4. arXiv:2402.08280  [pdf, other]

    cs.AI cs.CV cs.LG

    Pix2Code: Learning to Compose Neural Visual Concepts as Programs

    Authors: Antonia Wüst, Wolfgang Stammer, Quentin Delfosse, Devendra Singh Dhami, Kristian Kersting

    Abstract: The challenge in learning abstract concepts from images in an unsupervised fashion lies in the required integration of visual perception and generalizable relational reasoning. Moreover, the unsupervised nature of this task makes it necessary for human users to be able to understand a model's learnt concepts and potentially revise false behaviours. To tackle both the generalizability and interpret…

    Submitted 13 February, 2024; originally announced February 2024.

  5. arXiv:2110.01406  [pdf]

    cs.LG cs.DC cs.PF cs.SE

    MedPerf: Open Benchmarking Platform for Medical Artificial Intelligence using Federated Evaluation

    Authors: Alexandros Karargyris, Renato Umeton, Micah J. Sheller, Alejandro Aristizabal, Johnu George, Srini Bala, Daniel J. Beutel, Victor Bittorf, Akshay Chaudhari, Alexander Chowdhury, Cody Coleman, Bala Desinghu, Gregory Diamos, Debo Dutta, Diane Feddema, Grigori Fursin, Junyi Guo, Xinyuan Huang, David Kanter, Satyananda Kashyap, Nicholas Lane, Indranil Mallick, Pietro Mascagni, Virendra Mehta, Vivek Natarajan, et al. (17 additional authors not shown)

    Abstract: Medical AI has tremendous potential to advance healthcare by supporting the evidence-based practice of medicine, personalizing patient treatment, reducing costs, and improving provider and patient experience. We argue that unlocking this potential requires a systematic way to measure the performance of medical AI models on large-scale heterogeneous data. To meet this need, we are building MedPerf,…

    Submitted 28 December, 2021; v1 submitted 29 September, 2021; originally announced October 2021.