Showing 1–9 of 9 results for author: Hofmarcher, M

  1. arXiv:2405.20309  [pdf, other]

    cs.LG cs.AI cs.CL

    Large Language Models Can Self-Improve At Web Agent Tasks

    Authors: Ajay Patel, Markus Hofmarcher, Claudiu Leoveanu-Condrei, Marius-Constantin Dinu, Chris Callison-Burch, Sepp Hochreiter

    Abstract: Training models to act as agents that can effectively navigate and perform actions in a complex environment, such as a web browser, has typically been challenging due to lack of training data. Large language models (LLMs) have recently demonstrated some capability to navigate novel environments as agents in a zero-shot or few-shot fashion, purely guided by natural language instructions as prompts.…

    Submitted 30 May, 2024; originally announced May 2024.

  2. arXiv:2307.05591  [pdf, other]

    cs.CV cs.CL cs.LG

    Linear Alignment of Vision-language Models for Image Captioning

    Authors: Fabian Paischer, Markus Hofmarcher, Sepp Hochreiter, Thomas Adler

    Abstract: Recently, vision-language models like CLIP have advanced the state of the art in a variety of multi-modal tasks including image captioning and caption evaluation. Many approaches adapt CLIP-style models to a downstream task by training a mapping network between CLIP and a language model. This is costly as it usually involves calculating gradients for large models. We propose a more efficient train…

    Submitted 6 February, 2024; v1 submitted 10 July, 2023; originally announced July 2023.

    Comments: 8 pages (+ references and appendix)

  3. arXiv:2306.14884  [pdf, other]

    cs.LG cs.AI

    Learning to Modulate pre-trained Models in RL

    Authors: Thomas Schmied, Markus Hofmarcher, Fabian Paischer, Razvan Pascanu, Sepp Hochreiter

    Abstract: Reinforcement Learning (RL) has been successful in various domains like robotics, game playing, and simulation. While RL agents have shown impressive capabilities in their specific tasks, they insufficiently adapt to new tasks. In supervised learning, this adaptation problem is addressed by large-scale pre-training followed by fine-tuning to new down-stream tasks. Recently, pre-training on multipl…

    Submitted 27 October, 2023; v1 submitted 26 June, 2023; originally announced June 2023.

    Comments: 10 pages (+ references and appendix), Code: https://github.com/ml-jku/L2M

  4. arXiv:2306.09312  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Semantic HELM: A Human-Readable Memory for Reinforcement Learning

    Authors: Fabian Paischer, Thomas Adler, Markus Hofmarcher, Sepp Hochreiter

    Abstract: Reinforcement learning agents deployed in the real world often have to cope with partially observable environments. Therefore, most agents employ memory mechanisms to approximate the state of the environment. Recently, there have been impressive success stories in mastering partially observable environments, mostly in the realm of computer games like Dota 2, StarCraft II, or MineCraft. However, ex…

    Submitted 27 October, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: To appear at NeurIPS 2023, 10 pages (+ references and appendix), Code: https://github.com/ml-jku/helm

  5. arXiv:2111.04714  [pdf, other]

    cs.LG cs.AI

    A Dataset Perspective on Offline Reinforcement Learning

    Authors: Kajetan Schweighofer, Andreas Radler, Marius-Constantin Dinu, Markus Hofmarcher, Vihang Patil, Angela Bitto-Nemling, Hamid Eghbal-zadeh, Sepp Hochreiter

    Abstract: The application of Reinforcement Learning (RL) in real world environments can be expensive or risky due to sub-optimal policies during training. In Offline RL, this problem is avoided since interactions with an environment are prohibited. Policies are learned from a given dataset, which solely determines their performance. Despite this fact, how dataset characteristics influence Offline RL algorit…

    Submitted 12 July, 2022; v1 submitted 8 November, 2021; originally announced November 2021.

    Comments: Code: https://github.com/ml-jku/OfflineRL

  6. arXiv:2009.14108  [pdf, other]

    cs.LG cs.AI stat.ML

    Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution

    Authors: Vihang P. Patil, Markus Hofmarcher, Marius-Constantin Dinu, Matthias Dorfer, Patrick M. Blies, Johannes Brandstetter, Jose A. Arjona-Medina, Sepp Hochreiter

    Abstract: Reinforcement learning algorithms require many samples when solving complex hierarchical tasks with sparse and delayed rewards. For such complex tasks, the recently proposed RUDDER uses reward redistribution to leverage steps in the Q-function that are associated with accomplishing sub-tasks. However, often only few episodes with high rewards are available as demonstrations since current explorati…

    Submitted 28 June, 2022; v1 submitted 29 September, 2020; originally announced September 2020.

    Comments: GitHub: https://github.com/ml-jku/align-rudder, YouTube: https://youtu.be/HO-_8ZUl-UY

  7. arXiv:2004.00979  [pdf, other]

    q-bio.BM cs.LG q-bio.QM stat.ML

    Large-scale ligand-based virtual screening for SARS-CoV-2 inhibitors using deep neural networks

    Authors: Markus Hofmarcher, Andreas Mayr, Elisabeth Rumetshofer, Peter Ruch, Philipp Renz, Johannes Schimunek, Philipp Seidl, Andreu Vall, Michael Widrich, Sepp Hochreiter, Günter Klambauer

    Abstract: Due to the current severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic, there is an urgent need for novel therapies and drugs. We conducted a large-scale virtual screening for small molecules that are potential CoV-2 inhibitors. To this end, we utilized "ChemAI", a deep neural network trained on more than 220M data points across 3.6M molecules from three public drug-discovery dat…

    Submitted 17 August, 2020; v1 submitted 25 March, 2020; originally announced April 2020.

    Comments: Additional results added. Various corrections to formulations and typos.

  8. arXiv:1911.06616  [pdf, other]

    eess.IV cs.LG stat.ML

    Detecting cutaneous basal cell carcinomas in ultra-high resolution and weakly labelled histopathological images

    Authors: Susanne Kimeswenger, Elisabeth Rumetshofer, Markus Hofmarcher, Philipp Tschandl, Harald Kittler, Sepp Hochreiter, Wolfram Hötzenecker, Günter Klambauer

    Abstract: Diagnosing basal cell carcinomas (BCC), one of the most common cutaneous malignancies in humans, is a task regularly performed by pathologists and dermato-pathologists. Improving histological diagnosis by providing diagnosis suggestions, i.e. computer-assisted diagnoses, is actively researched to improve safety, quality and efficiency. Increasingly, machine learning methods are applied due to their…

    Submitted 2 December, 2019; v1 submitted 14 November, 2019; originally announced November 2019.

    Comments: Machine Learning for Health (ML4H) at NeurIPS 2019

  9. arXiv:1910.04093  [pdf, other]

    cs.CV cs.LG

    Patch Refinement -- Localized 3D Object Detection

    Authors: Johannes Lehner, Andreas Mitterecker, Thomas Adler, Markus Hofmarcher, Bernhard Nessler, Sepp Hochreiter

    Abstract: We introduce Patch Refinement, a two-stage model for accurate 3D object detection and localization from point cloud data. Patch Refinement is composed of two independently trained Voxelnet-based networks, a Region Proposal Network (RPN) and a Local Refinement Network (LRN). We decompose the detection task into a preliminary Bird's Eye View (BEV) detection step and a local 3D detection step. Based o…

    Submitted 9 October, 2019; originally announced October 2019.

    Comments: Machine Learning for Autonomous Driving Workshop at the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada