-
SPUQ: Perturbation-Based Uncertainty Quantification for Large Language Models
Authors:
Xiang Gao,
Jiaxin Zhang,
Lalla Mouatadid,
Kamalika Das
Abstract:
In recent years, large language models (LLMs) have become increasingly prevalent, offering remarkable text generation capabilities. However, a pressing challenge is their tendency to make confidently wrong predictions, highlighting the critical need for uncertainty quantification (UQ) in LLMs. While previous works have mainly focused on addressing aleatoric uncertainty, the full spectrum of uncertainties, including epistemic, remains inadequately explored. Motivated by this gap, we introduce a novel UQ method, sampling with perturbation for UQ (SPUQ), designed to tackle both aleatoric and epistemic uncertainties. The method entails generating a set of perturbations for LLM inputs, sampling outputs for each perturbation, and incorporating an aggregation module that generalizes the sampling uncertainty approach for text generation tasks. Through extensive experiments on various datasets, we investigated different perturbation and aggregation techniques. Our findings show a substantial improvement in model uncertainty calibration, with a reduction in Expected Calibration Error (ECE) by 50% on average. Our findings suggest that our proposed UQ method offers promising steps toward enhancing the reliability and trustworthiness of LLMs.
Submitted 4 March, 2024;
originally announced March 2024.
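The SPUQ abstract above describes three steps (perturb the input, sample an output per perturbation, aggregate) and reports results as Expected Calibration Error. The following is a minimal sketch of that shape, not the authors' implementation: the names `perturb_sample_aggregate` and `expected_calibration_error` are hypothetical, the model is any callable from prompt to answer, and aggregation is assumed to be simple exact-match agreement, whereas the paper's aggregation module is more general.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple


def perturb_sample_aggregate(
    prompt: str,
    perturbations: Sequence[Callable[[str], str]],
    model: Callable[[str], str],
) -> Tuple[str, float]:
    """Return (majority answer, agreement rate) across perturbed prompts.

    Assumption: confidence is the fraction of sampled outputs that match
    the most common output exactly.
    """
    # One sampled output per perturbed version of the input.
    outputs = [model(perturb(prompt)) for perturb in perturbations]
    answer, votes = Counter(outputs).most_common(1)[0]
    return answer, votes / len(outputs)


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Standard binned ECE: sum over bins of (bin weight) * |accuracy - mean confidence|."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Equal-width bins on [0, 1]; confidence 1.0 goes in the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece
```

A lower ECE means the reported confidences track empirical accuracy more closely, which is the quantity the abstract says is reduced.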
-
DECDM: Document Enhancement using Cycle-Consistent Diffusion Models
Authors:
Jiaxin Zhang,
Joy Rimchala,
Lalla Mouatadid,
Kamalika Das,
Sricharan Kumar
Abstract:
The performance of optical character recognition (OCR) heavily relies on document image quality, which is crucial for automatic document processing and document intelligence. However, most existing document enhancement methods require supervised data pairs, which raises concerns about data separation and privacy protection, and makes it challenging to adapt these methods to new domain pairs. To address these issues, we propose DECDM, an end-to-end document-level image translation method inspired by recent advances in diffusion models. Our method overcomes the limitations of paired training by independently training the source (noisy input) and target (clean output) models, making it possible to apply domain-specific diffusion models to other pairs. DECDM trains on one dataset at a time, eliminating the need to scan both datasets concurrently, and effectively preserving data privacy from the source or target domain. We also introduce simple data augmentation strategies to improve character-glyph conservation during translation. We compare DECDM with state-of-the-art methods on multiple synthetic and benchmark datasets for tasks such as document denoising and shadow removal, and demonstrate superior performance both quantitatively and qualitatively.
Submitted 16 November, 2023;
originally announced November 2023.
-
RE$^2$: Region-Aware Relation Extraction from Visually Rich Documents
Authors:
Pritika Ramu,
Sijia Wang,
Lalla Mouatadid,
Joy Rimchala,
Lifu Huang
Abstract:
Current research in form understanding predominantly relies on large pre-trained language models, necessitating extensive data for pre-training. However, the importance of layout structure (i.e., the spatial relationship between the entity blocks in the visually rich document) to relation extraction has been overlooked. In this paper, we propose REgion-Aware Relation Extraction (RE$^2$) that leverages region-level spatial structure among the entity blocks to improve their relation prediction. We design an edge-aware graph attention network to learn the interaction between entities while considering their spatial relationship defined by their region-level representations. We also introduce a constraint objective to regularize the model towards consistency with the inherent constraints of the relation extraction task. Extensive experiments across various datasets, languages and domains demonstrate the superiority of our proposed approach.
Submitted 3 June, 2024; v1 submitted 23 May, 2023;
originally announced May 2023.
-
A Scalable Technique for Weak-Supervised Learning with Domain Constraints
Authors:
Sudhir Agarwal,
Anu Sreepathy,
Lalla Mouatadid
Abstract:
We propose a novel scalable end-to-end pipeline that uses symbolic domain knowledge as constraints for learning a neural network for classifying unlabeled data in a weak-supervised manner. Our approach is particularly well-suited for settings where the data consists of distinct groups (classes) that lend themselves to clustering-friendly representation learning, and where the domain constraints can be reformulated for use with efficient mathematical optimization techniques by considering multiple training examples at once. We evaluate our approach on a variant of the MNIST image classification problem where a training example consists of image sequences and the sum of the numbers represented by the sequences, and show that our approach scales significantly better than previous approaches that rely on computing all constraint-satisfying combinations for each training example.
Submitted 19 October, 2023; v1 submitted 12 January, 2023;
originally announced January 2023.
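The abstract above contrasts its pipeline with earlier approaches that enumerate every constraint-satisfying combination per training example. As a rough illustration of why that enumeration scales poorly, the sketch below uses a simplified setting that is an assumption here, not the paper's exact task: each image is a single digit 0-9 and the label is the sum of the digits. The function names are hypothetical.

```python
from itertools import product
from typing import List, Tuple


def satisfying_combinations(length: int, target: int) -> List[Tuple[int, ...]]:
    """Brute force: every digit sequence of the given length whose sum is target.

    This scans all 10**length candidate sequences.
    """
    return [combo for combo in product(range(10), repeat=length) if sum(combo) == target]


def count_satisfying(length: int, target: int) -> int:
    """Count the same set with dynamic programming, without listing it."""
    # counts[s] = number of digit prefixes processed so far that sum to s.
    counts = [1] + [0] * target
    for _ in range(length):
        updated = [0] * (target + 1)
        for partial_sum, ways in enumerate(counts):
            if ways == 0:
                continue
            for digit in range(10):
                if partial_sum + digit <= target:
                    updated[partial_sum + digit] += ways
        counts = updated
    return counts[target]
```

Calling `count_satisfying` with growing `length` and a mid-range `target` shows how quickly the number of label-consistent combinations grows, which is the cost an enumeration-based method pays per example.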
-
(α, β)-Modules in Graphs
Authors:
Michel Habib,
Lalla Mouatadid,
Eric Sopena,
Mengchuan Zou
Abstract:
Modular Decomposition focuses on repeatedly identifying a module M (a collection of vertices that shares exactly the same neighbourhood outside of M) and collapsing it into a single vertex. This notion of exactitude of neighbourhood is very strict, especially when dealing with real world graphs. We study new ways to relax this exactitude condition. However, generalizing modular decomposition is far from obvious. Most of the previous proposals lose algebraic properties of modules and thus most of the nice algorithmic consequences. We introduce the notion of an (α, β)-module, a relaxation that allows a bounded number of errors in each node and maintains some of the algebraic structure. It leads to a new combinatorial decomposition with interesting properties. Among the main results in this work, we show that minimal (α, β)-modules can be computed in polynomial time, and that every graph admits an (α,β)-modular decomposition tree, thus generalizing Gallai's Theorem (which corresponds to the case for α = β = 0). Unfortunately we give evidence that computing such a decomposition tree can be difficult.
Submitted 21 January, 2021;
originally announced January 2021.
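The abstract above defines a module as a vertex set whose members all have exactly the same neighbourhood outside the set. A minimal sketch of that exact condition (the case α = β = 0) follows; it does not implement the relaxed (α, β) notion, whose precise definition is not given in the abstract. The graph is assumed to be a dictionary mapping each vertex to its set of neighbours, and `is_module` is a hypothetical name.

```python
from typing import Dict, Hashable, Iterable, Set


def is_module(adj: Dict[Hashable, Set[Hashable]], candidate: Iterable[Hashable]) -> bool:
    """True if every vertex in candidate has the same neighbours outside candidate."""
    members = set(candidate)
    # Collect each member's neighbourhood restricted to vertices outside the set.
    outside_neighbourhoods = {frozenset(adj[v] - members) for v in members}
    # A module has at most one distinct outside neighbourhood.
    return len(outside_neighbourhoods) <= 1
```

For example, in a star with centre 3 and leaves 1, 2, 4, any set of leaves is a module because each leaf sees only the centre.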
-
Maximum Induced Matching Algorithms via Vertex Ordering Characterizations
Authors:
Michel Habib,
Lalla Mouatadid
Abstract:
We study the maximum induced matching problem on a graph g. Induced matchings correspond to independent sets in L²(g), the square of the line graph of g. The problem is NP-complete on bipartite graphs. In this work, we show that for a number of graph families with forbidden vertex orderings, almost all forbidden patterns on three vertices are preserved when taking the square of the line graph. These orderings can be computed in linear time in the size of the input graph. In particular, given a graph class G characterized by a vertex ordering, and a graph g = (V, E) in G with a corresponding vertex ordering σ of V, one can produce (in linear time in the size of g) an ordering on the vertices of L²(g) that shows that L²(g) ∈ G, for a number of graph classes G, without computing the line graph or the square of the line graph of g. These results generalize and unify previous ones on showing closure under L²(·) for various graph families. Furthermore, these orderings on L²(g) can be exploited algorithmically to compute a maximum induced matching on G faster. We illustrate this latter fact in the second half of the paper where we focus on cocomparability graphs, a large graph class that includes interval, permutation, trapezoid graphs, and co-graphs, and we present the first O(mn) time algorithm to compute a maximum weighted induced matching on cocomparability graphs; an improvement from the best known O(n⁴) time algorithm for the unweighted case.
Submitted 20 September, 2017; v1 submitted 5 July, 2017;
originally announced July 2017.
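The abstract above relies on the correspondence between induced matchings and independent sets in the square of the line graph. The sketch below checks the induced-matching condition directly on the original graph: no two chosen edges share an endpoint, and no graph edge joins endpoints of two different chosen edges. It is a verifier only, not the paper's algorithm; `is_induced_matching` and the adjacency-dictionary representation are assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, Hashable, Sequence, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def is_induced_matching(adj: Dict[Hashable, Set[Hashable]], edges: Sequence[Edge]) -> bool:
    """True if edges form an induced matching in the graph given by adj."""
    # Every chosen pair must actually be an edge of the graph.
    if any(v not in adj[u] for u, v in edges):
        return False
    for (a, b), (c, d) in combinations(edges, 2):
        # Matching condition: no shared endpoint.
        if {a, b} & {c, d}:
            return False
        # Induced condition: no graph edge between the two chosen edges.
        if any(y in adj[x] for x in (a, b) for y in (c, d)):
            return False
    return True
```

Two edges that pass both checks are exactly two vertices at distance more than two in the line graph, which is the independence condition in its square.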
-
A New Graph Parameter To Measure Linearity
Authors:
Pierre Charbit,
Michel Habib,
Lalla Mouatadid,
Reza Naserasr
Abstract:
Consider a sequence of LexBFS vertex orderings σ₁, σ₂, ... where each ordering σᵢ is used to break ties for σᵢ₊₁. Since the total number of vertex orderings of a finite graph is finite, this sequence must end in a cycle of vertex orderings. The possible length of this cycle is the main subject of this work. Intuitively, we prove for graphs with a known notion of linearity (e.g., interval graphs with their interval representation on the real line), this cycle cannot be too big, no matter which vertex ordering we start with. More precisely, it was conjectured in [9] that for cocomparability graphs, the size of this cycle is always 2, independent of the starting order. Furthermore [27] asked whether for arbitrary graphs, the size of such a cycle is always bounded by the asteroidal number of the graph. In this work, while we answer this latter question negatively, we provide support for the conjecture on cocomparability graphs by proving it for the subclass of domino-free cocomparability graphs. This subclass contains cographs, proper interval, interval, and cobipartite graphs. We also provide simpler independent proofs for each of these cases which lead to stronger results on these subclasses.
Submitted 17 October, 2022; v1 submitted 7 February, 2017;
originally announced February 2017.
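The abstract above studies a sequence of LexBFS orderings in which each ordering breaks ties for the next, and asks how long the eventual cycle can be. The sketch below is a simple quadratic-time illustration rather than an efficient implementation. It assumes the common "rightmost in the previous ordering" tie-breaking rule, which the abstract does not spell out, and the names `lexbfs_plus` and `cycle_length` are hypothetical.

```python
from typing import Dict, Hashable, List, Sequence, Set


def lexbfs_plus(adj: Dict[Hashable, Set[Hashable]], previous: Sequence[Hashable]) -> List[Hashable]:
    """One LexBFS sweep, breaking ties by the rightmost vertex in previous."""
    n = len(previous)
    position = {v: i for i, v in enumerate(previous)}
    labels = {v: [] for v in previous}
    remaining = set(previous)
    order = []
    for step in range(n):
        # Largest label wins; among equal labels, the vertex latest in previous.
        v = max(remaining, key=lambda u: (labels[u], position[u]))
        order.append(v)
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                # Earlier visits contribute larger numbers, so labels stay decreasing.
                labels[w].append(n - step)
    return order


def cycle_length(adj: Dict[Hashable, Set[Hashable]], start: Sequence[Hashable]) -> int:
    """Iterate lexbfs_plus from start until an ordering repeats; return the cycle length."""
    seen = {}
    order = list(start)
    index = 0
    while tuple(order) not in seen:
        seen[tuple(order)] = index
        order = lexbfs_plus(adj, order)
        index += 1
    return index - seen[tuple(order)]
```

On a four-vertex path, which is an interval graph, the sweeps alternate between the two end-to-end orderings, so the cycle has length 2 under this tie-breaking rule.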
-
Path Graphs, Clique Trees, and Flowers
Authors:
Lalla Mouatadid,
Robert Robere
Abstract:
An asteroidal triple is a set of three independent vertices in a graph such that any two vertices in the set are connected by a path which avoids the neighbourhood of the third.
A classical result by Lekkerkerker and Boland [6] showed that interval graphs are precisely the chordal graphs that do not have asteroidal triples.
Interval graphs are chordal, as are the directed path graphs and the path graphs.
Similar to Lekkerkerker and Boland, Cameron, Hoàng, and Lévêque [4] gave a characterization of directed path graphs by a "special type" of asteroidal triple, and asked whether or not there was such a characterization for path graphs.
We give strong evidence that asteroidal triples alone are insufficient to characterize the family of path graphs, and give a new characterization of path graphs via a forbidden induced subgraph family that we call sun systems.
Key to our new characterization is the study of asteroidal sets in sun systems, which are a natural generalization of asteroidal triples.
Our characterization of path graphs by forbidding sun systems also generalizes a characterization of directed path graphs by forbidding odd suns that was given by Chaplick et al. [9].
Submitted 28 May, 2015;
originally announced May 2015.
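The definition of an asteroidal triple in the abstract above translates directly into a check: the three vertices are pairwise non-adjacent, and each pair stays connected after deleting the third vertex together with its neighbours. The sketch below is a straightforward verifier under the assumption that the graph is an adjacency dictionary and the three vertices are distinct; the function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, Hashable, Set


def connected_avoiding(adj: Dict[Hashable, Set[Hashable]], source, target, forbidden: Set) -> bool:
    """Depth-first search from source to target that never enters forbidden."""
    stack, seen = [source], {source}
    while stack:
        u = stack.pop()
        if u == target:
            return True
        for w in adj[u]:
            if w not in seen and w not in forbidden:
                seen.add(w)
                stack.append(w)
    return False


def is_asteroidal_triple(adj: Dict[Hashable, Set[Hashable]], a, b, c) -> bool:
    """True if {a, b, c} is an asteroidal triple of the graph."""
    # The three vertices must be pairwise non-adjacent.
    if any(v in adj[u] for u, v in combinations((a, b, c), 2)):
        return False
    # Each pair needs a path avoiding the third vertex and its neighbours.
    for x, y, z in ((a, b, c), (a, c, b), (b, c, a)):
        if not connected_avoiding(adj, x, y, adj[z] | {z}):
            return False
    return True
```

The smallest tree containing such a triple is the claw with each edge subdivided once, where the three leaves form the triple.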
-
Linear Time LexDFS on Cocomparability Graphs
Authors:
Ekkehard Köhler,
Lalla Mouatadid
Abstract:
Lexicographic depth first search (LexDFS) is a graph search protocol which has already proved to be a powerful tool on cocomparability graphs. Cocomparability graphs have been well studied by investigating their complements (comparability graphs) and their corresponding posets. Recently, however, LexDFS has led to a number of elegant polynomial and near linear time algorithms on cocomparability graphs when used as a preprocessing step [2, 3, 11]. The nonlinear runtime of some of these results is a consequence of the complexity of this preprocessing step. We present the first linear time algorithm to compute a LexDFS cocomparability ordering, therefore answering a problem raised in [2] and helping achieve the first linear time algorithms for the minimum path cover problem, and thus the Hamilton path problem, the maximum independent set problem and the minimum clique cover for this graph family.
Submitted 23 April, 2014;
originally announced April 2014.
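For readers unfamiliar with the search named in the abstract above, the sketch below shows generic LexDFS on an arbitrary graph using the usual labelling rule: when a vertex is visited at step i, i is prepended to the label of each unvisited neighbour, and the next vertex is one with the lexicographically largest label. This is a naive quadratic-time illustration, not the linear-time cocomparability algorithm the paper presents. The name `lexdfs` is hypothetical, and ties are assumed to be broken by smallest vertex name, so vertex names must be sortable.

```python
from typing import Dict, Hashable, List, Set


def lexdfs(adj: Dict[Hashable, Set[Hashable]], start: Hashable) -> List[Hashable]:
    """Return a LexDFS ordering of the graph beginning at start."""
    labels = {v: [] for v in adj}
    remaining = set(adj)
    order = []
    current = start
    for step in range(1, len(adj) + 1):
        order.append(current)
        remaining.remove(current)
        for w in adj[current]:
            if w in remaining:
                # Most recent visit goes first, so recent neighbours dominate.
                labels[w].insert(0, step)
        if remaining:
            # max returns the first maximum, so sorting gives smallest-name ties.
            current = max(sorted(remaining), key=lambda u: labels[u])
    return order
```

On a four-cycle a-b-d-c-a started at a, this visits d before c, whereas a breadth-first style search would visit c before d.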