Showing 1–6 of 6 results for author: Liévin, V

  1. ThoughtSource: A central hub for large language model reasoning data

    Authors: Simon Ott, Konstantin Hebenstreit, Valentin Liévin, Christoffer Egeberg Hother, Milad Moradi, Maximilian Mayrhauser, Robert Praas, Ole Winther, Matthias Samwald

    Abstract: Large language models (LLMs) such as GPT-4 have recently demonstrated impressive results across a wide range of tasks. LLMs are still limited, however, in that they frequently fail at complex reasoning, their reasoning processes are opaque, they are prone to 'hallucinate' facts, and there are concerns about their underlying biases. Letting models verbalize reasoning steps as natural language, a te…

    Submitted 27 July, 2023; v1 submitted 27 January, 2023; originally announced January 2023.

    Comments: Revision: added datasets, formatting

    Journal ref: Scientific Data 10, 528 (2023)

  2. arXiv:2210.06345  [pdf, other]

    cs.CL cs.IR cs.LG

    Variational Open-Domain Question Answering

    Authors: Valentin Liévin, Andreas Geert Motzfeldt, Ida Riis Jensen, Ole Winther

    Abstract: Retrieval-augmented models have proven to be effective in natural language processing tasks, yet there remains a lack of research on their optimization using variational inference. We introduce the Variational Open-Domain (VOD) framework for end-to-end training and evaluation of retrieval-augmented models, focusing on open-domain question answering and language modelling. The VOD objective, a self…

    Submitted 31 May, 2023; v1 submitted 23 September, 2022; originally announced October 2022.

    Comments: 28 pages, 5 figures. Accepted at ICML 2023

    ACM Class: I.2.7; H.3.3; I.2.1

  3. arXiv:2207.08143  [pdf, other]

    cs.CL cs.AI cs.LG

    Can large language models reason about medical questions?

    Authors: Valentin Liévin, Christoffer Egeberg Hother, Andreas Geert Motzfeldt, Ole Winther

    Abstract: Although large language models (LLMs) often produce impressive outputs, it remains unclear how they perform in real-world scenarios requiring strong reasoning skills and expert domain knowledge. We set out to investigate whether closed- and open-source models (GPT-3.5, Llama 2, etc.) can be applied to answer and reason about difficult real-world-based questions. We focus on three popular medical be…

    Submitted 24 December, 2023; v1 submitted 17 July, 2022; originally announced July 2022.

    Comments: 37 pages, 23 figures. v1: results using InstructGPT, v2.0: added the Codex experiments, v2.1: added the missing test MedMCQA results for Codex 5-shot CoT and using k=100 samples, v3.0: added results for open source models -- ready for publication (final version)

    ACM Class: I.2.1; I.2.7

  4. arXiv:2203.09445  [pdf, other]

    cs.CV cs.LG eess.IV

    Image Super-Resolution With Deep Variational Autoencoders

    Authors: Darius Chira, Ilian Haralampiev, Ole Winther, Andrea Dittadi, Valentin Liévin

    Abstract: Image super-resolution (SR) techniques are used to generate a high-resolution image from a low-resolution image. Until now, deep generative models such as autoregressive models and Generative Adversarial Networks (GANs) have proven to be effective at modelling high-resolution images. VAE-based models have often been criticised for their feeble generative performance, but with new advancements such…

    Submitted 26 October, 2022; v1 submitted 17 March, 2022; originally announced March 2022.

    Comments: ECCV 2022 Workshop on Advances in Image Manipulation

  5. arXiv:2008.01998  [pdf, other]

    stat.ML cs.LG

    Optimal Variance Control of the Score Function Gradient Estimator for Importance Weighted Bounds

    Authors: Valentin Liévin, Andrea Dittadi, Anders Christensen, Ole Winther

    Abstract: This paper introduces novel results for the score function gradient estimator of the importance weighted variational bound (IWAE). We prove that in the limit of large $K$ (number of importance samples) one can choose the control variate such that the Signal-to-Noise ratio (SNR) of the estimator grows as $\sqrt{K}$. This is in contrast to the standard pathwise gradient estimator where the SNR decre…

    Submitted 8 December, 2020; v1 submitted 5 August, 2020; originally announced August 2020.

  6. arXiv:1902.02102  [pdf, other]

    stat.ML cs.CV cs.LG

    BIVA: A Very Deep Hierarchy of Latent Variables for Generative Modeling

    Authors: Lars Maaløe, Marco Fraccaro, Valentin Liévin, Ole Winther

    Abstract: With the introduction of the variational autoencoder (VAE), probabilistic latent variable models have received renewed attention as powerful generative models. However, their performance in terms of test likelihood and quality of generated samples has been surpassed by autoregressive models without stochastic units. Furthermore, flow-based models have recently been shown to be an attractive altern…

    Submitted 6 November, 2019; v1 submitted 6 February, 2019; originally announced February 2019.