Search | arXiv e-print repository

The Two Sides of the Coin: Hallucination Generation and Detection with LLMs as Evaluators for LLMs

Authors: Anh Thu Maria Bui, Saskia Felizitas Brech, Natalie Hußfeldt, Tobias Jennert, Melanie Ullrich, Timo Breuer, Narjes Nikzad Khasmakhi, Philipp Schaer

Abstract: Hallucination detection in Large Language Models (LLMs) is crucial for ensuring their reliability. This work presents our participation in the CLEF ELOQUENT HalluciGen shared task, where the goal is to develop evaluators for both generating and detecting hallucinated content. We explored the capabilities of four LLMs: Llama 3, Gemma, GPT-3.5 Turbo, and GPT-4, for this purpose. We also employed ensemble majority voting to incorporate all four models for the detection task. The results provide valuable insights into the strengths and weaknesses of these LLMs in handling hallucination generation and detection tasks.

Submitted 12 July, 2024; originally announced July 2024.

Comments: Paper accepted at ELOQUENT@CLEF'24

arXiv:2406.07108 [pdf, ps, other]

On the power of adaption and randomization

Authors: David Krieg, Erich Novak, Mario Ullrich

Abstract: We present bounds between different widths of convex subsets of Banach spaces, including Gelfand and Bernstein widths. Using this, and some relations between widths and minimal errors, we obtain bounds on the maximal gain of adaptive and randomized algorithms over non-adaptive, deterministic ones for approximating linear operators on convex sets. Our results also apply to the approximation of embeddings into the space of bounded functions based on function evaluations, i.e., to sampling recovery in the uniform norm. We conclude with a list of open problems.

Submitted 11 June, 2024; originally announced June 2024.

arXiv:2402.15294 [pdf, other]

A Survey of Music Generation in the Context of Interaction

Authors: Ismael Agchar, Ilja Baumann, Franziska Braun, Paula Andrea Perez-Toro, Korbinian Riedhammer, Sebastian Trump, Martin Ullrich

Abstract: In recent years, machine learning, and in particular generative adversarial neural networks (GANs) and attention-based neural networks (transformers), have been successfully used to compose and generate music, both melodies and polyphonic pieces. Current research focuses foremost on style replication (e.g., generating a Bach-style chorale) or style transfer (e.g., classical to jazz) based on large amounts of recorded or transcribed music, which in turn also allows for fairly straightforward "performance" evaluation. However, most of these models are not suitable for human-machine co-creation through live interaction, nor is it clear how such models and resulting creations would be evaluated. This article presents a thorough review of music representation, feature analysis, heuristic algorithms, statistical and parametric modelling, and human and automatic evaluation measures, along with a discussion of which approaches and models seem most suitable for live interaction.

Submitted 23 February, 2024; originally announced February 2024.

arXiv:2310.12740 [pdf, ps, other]

doi 10.30970/ana.2023.1.88

On the power of iid information for linear approximation

Authors: Mathias Sonnleitner, Mario Ullrich

Abstract: This survey is concerned with the power of random information for approximation in the (deterministic) worst-case setting, with special emphasis on information consisting of functionals selected independently and identically distributed (iid) at random on a class of admissible information functionals. We present a general result based on a weighted least squares method and derive consequences for special cases. Improvements are available if the information is "Gaussian" or if we consider iid function values for Sobolev spaces. We include open questions to guide future research on the power of random information in the context of information-based complexity.

Submitted 8 January, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

Comments: 63 pages

MSC Class: 65-02; 41A25; 47B06; 68Q25; 94A20

Journal ref: JANA 1 (2023) 88-126

arXiv:2305.07539 [pdf, ps, other]

Sampling recovery in $L_2$ and other norms

Authors: David Krieg, Kateryna Pozharska, Mario Ullrich, Tino Ullrich

Abstract: We study the recovery of functions in various norms, including $L_p$ with $1\le p\le\infty$, based on function evaluations. We obtain worst case error bounds for general classes of functions in terms of the best $L_2$-approximation from a given nested sequence of subspaces and the Christoffel function of these subspaces. In the case $p=\infty$, our results imply that linear sampling algorithms are optimal up to a constant factor for many reproducing kernel Hilbert spaces.

Submitted 1 November, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

Comments: The title has changed slightly. Some results from earlier versions are shifted to another publication

MSC Class: 68Q25; 41A50; 46B09; 41A63; 47B06

arXiv:2205.04141 [pdf, ps, other]

doi 10.1007/s10444-023-10021-7

Exponential tractability of $L_2$-approximation with function values

Authors: David Krieg, Pawel Siedlecki, Mario Ullrich, Henryk Woźniakowski

Abstract: We study the complexity of high-dimensional approximation in the $L_2$-norm when different classes of information are available; we compare the power of function evaluations with the power of arbitrary continuous linear measurements. Here, we discuss the situation when the number of linear measurements required to achieve an error $\varepsilon \in (0,1)$ in dimension $d\in\mathbb{N}$ depends only poly-logarithmically on $\varepsilon^{-1}$. This corresponds to an exponential order of convergence of the approximation error, which often happens in applications. However, it does not mean that the high-dimensional approximation problem is easy; the main difficulty usually lies within the dependence on the dimension $d$. We determine to what extent the required amount of information changes if we allow only function evaluations instead of arbitrary linear information. It turns out that in this case we lose very little, and we can even restrict to linear algorithms. In particular, several notions of tractability hold simultaneously for both types of available information.

Submitted 22 March, 2023; v1 submitted 9 May, 2022; originally announced May 2022.

MSC Class: 65Y20; 41A25; 41A65; 41A63

Journal ref: Adv Comput Math 49, 18 (2023)

arXiv:2006.15135 [pdf, ps, other]

Generating induction principles and subterm relations for inductive types using MetaCoq

Authors: Bohdan Liesnikov, Marcel Ullrich, Yannick Forster

Abstract: We implement three Coq plugins regarding inductive types in MetaCoq. The first plugin is a simple syntax transformation generating alternative constructors for inductive types by abstracting over concrete indices in the types of the constructors. The second plugin re-implements Coq's $\texttt{Scheme Induction}$ command in MetaCoq, and extends it to nested inductive types, e.g. types like rose trees which use $\texttt{list}$ in their definition, similar to the Elpi-plugin by Tassi. The third plugin implements the $\texttt{Derive Subterm}$ command provided by the Equations package in MetaCoq.

Submitted 25 June, 2020; originally announced June 2020.

Comments: accepted for presentation at the Coq Workshop 2020

arXiv:1901.06702 [pdf, ps, other]

Deterministic constructions of high-dimensional sets with small dispersion

Authors: Mario Ullrich, Jan Vybíral

Abstract: The dispersion of a point set $P\subset[0,1]^d$ is the volume of the largest box with sides parallel to the coordinate axes, which does not intersect $P$. Here, we show a construction of low-dispersion point sets, which can be deduced from solutions of certain $k$-restriction problems, which are well-known in coding theory. It was observed only recently that, for any $\varepsilon>0$, certain randomized constructions provide point sets with dispersion smaller than $\varepsilon$ and number of elements growing only logarithmically in $d$. Based on deep results from coding theory, we present explicit, deterministic algorithms to construct such point sets in time that is only polynomial in $d$. Note, however, that the running time will be super-exponential in $\varepsilon^{-1}$.

Submitted 20 January, 2019; originally announced January 2019.

arXiv:1710.08694 [pdf, ps, other]

doi 10.1016/j.dam.2018.08.032

A note on the dispersion of admissible lattices

Authors: Mario Ullrich

Abstract: In this note we show that the volume of axis-parallel boxes in $\mathbb{R}^d$ which do not intersect an admissible lattice $\mathbb{L}\subset\mathbb{R}^d$ is uniformly bounded. In particular, this implies that the dispersion of the dilated lattices $N^{-1/d}\mathbb{L}$ restricted to the unit cube is of the (optimal) order $N^{-1}$ as $N$ goes to infinity. This result was obtained independently by V.N. Temlyakov (arXiv:1709.08158).

Submitted 24 October, 2017; originally announced October 2017.

Comments: 4 pages

Journal ref: Discrete Applied Mathematics 257 (2019), 385-387

arXiv:1608.08687 [pdf, ps, other]

doi 10.1007/s10231-017-0670-3

Lattice based integration algorithms: Kronecker sequences and rank-1 lattices

Authors: Josef Dick, Friedrich Pillichshammer, Kosuke Suzuki, Mario Ullrich, Takehito Yoshiki

Abstract: We prove upper bounds on the order of convergence of lattice based algorithms for numerical integration in function spaces of dominating mixed smoothness on the unit cube with homogeneous boundary condition. More precisely, we study worst-case integration errors for Besov spaces of dominating mixed smoothness $\mathring{\mathbf{B}}^s_{p,\theta}$, which also comprise the concept of Sobolev spaces of dominating mixed smoothness $\mathring{\mathbf{H}}^s_{p}$ as special cases. The considered algorithms are quasi-Monte Carlo rules with underlying nodes from $T_N(\mathbb{Z}^d) \cap [0,1)^d$, where $T_N$ is a real invertible generator matrix of size $d$. For such rules the worst-case error can be bounded in terms of the Zaremba index of the lattice $\mathbb{X}_N=T_N(\mathbb{Z}^d)$. We apply this result to Kronecker lattices and to rank-1 lattice point sets, which both lead to optimal error bounds up to $\log N$-factors for arbitrary smoothness $s$. The advantage of Kronecker lattices and classical lattice point sets is that the run-time of algorithms generating these point sets is very short.

Submitted 30 August, 2016; originally announced August 2016.

Comments: 19 pages

MSC Class: 65D30; 65D32; 11K31

Journal ref: Annali di Matematica (2018) 197: 109

arXiv:1510.04617 [pdf, other]

doi 10.1016/j.matcom.2015.12.005

A lower bound for the dispersion on the torus

Authors: Mario Ullrich

Abstract: We consider the volume of the largest axis-parallel box in the $d$-dimensional torus that contains no point of a given point set $\mathcal{P}_n$ with $n$ elements. We prove that, for all natural numbers $d, n$ and every point set $\mathcal{P}_n$, this volume is bounded from below by $\min\{1,d/n\}$. This implies the same lower bound for the discrepancy on the torus.

Submitted 15 October, 2015; originally announced October 2015.

Comments: 6 pages

arXiv:1301.4055 [pdf, ps, other]

doi 10.1016/j.laa.2014.04.018

Structure and eigenvalues of heat-bath Markov chains

Authors: Martin Dyer, Catherine Greenhill, Mario Ullrich

Abstract: We prove that heat-bath chains (which we define in a general setting) have no negative eigenvalues. Two applications of this result are presented: one to single-site heat-bath chains for spin systems and one to a heat-bath Markov chain for sampling contingency tables. Some implications of our main result for the analysis of the mixing time of heat-bath Markov chains are discussed. We also prove an alternative characterisation of heat-bath chains, and consider possible generalisations.

Submitted 9 April, 2014; v1 submitted 17 January, 2013; originally announced January 2013.

Comments: 15 pages. Minor edits to address referee's comments

Journal ref: Linear Algebra Appl. 454 (2014), 57-71

arXiv:1202.6321 [pdf, ps, other]

Rapid mixing of Swendsen-Wang and single-bond dynamics in two dimensions

Authors: Mario Ullrich

Abstract: We prove that the spectral gap of the Swendsen-Wang dynamics for the random-cluster model on arbitrary graphs with $m$ edges is bounded above by $16 m \log m$ times the spectral gap of the single-bond (or heat-bath) dynamics. This and the corresponding lower bound imply that rapid mixing of these two dynamics is equivalent. Using the known lower bound on the spectral gap of the Swendsen-Wang dynamics for the two-dimensional square lattice $\mathbb{Z}_L^2$ of side length $L$ at high temperatures and a result for the single-bond dynamics on dual graphs, we obtain rapid mixing of both dynamics on $\mathbb{Z}_L^2$ at all non-critical temperatures. In particular this implies, as far as we know, the first proof of rapid mixing of a classical Markov chain for the Ising model on $\mathbb{Z}_L^2$ at all temperatures.

Submitted 28 February, 2012; originally announced February 2012.

Comments: 20 pages

arXiv:1201.5793 [pdf, ps, other]

doi 10.1137/120864003

Swendsen-Wang is faster than single-bond dynamics

Authors: Mario Ullrich

Abstract: We prove that the spectral gap of the Swendsen-Wang dynamics for the random-cluster model is larger than the spectral gap of a single-bond dynamics, that updates only a single edge per step. For this we give a representation of the algorithms on the joint (Potts/random-cluster) model. Furthermore we obtain upper and lower bounds on the mixing time of the single-bond dynamics on the discrete $d$-dimensional torus of side length $L$ at the Potts transition temperature for $q$ large enough that are exponential in $L^{d-1}$, complementing a result of Borgs, Chayes and Tetali.

Submitted 18 January, 2014; v1 submitted 27 January, 2012; originally announced January 2012.

Comments: 17 pages

MSC Class: Primary: 60J10; Secondary: 60K35, 68Q87

Journal ref: SIAM J. Discrete Math. 28 (2014), pp. 37-48

Showing 1–14 of 14 results for author: Ullrich, M