-
Single Exposure Quantitative Phase Imaging with a Conventional Microscope using Diffusion Models
Authors:
Gabriel della Maggiora,
Luis Alberto Croquevielle,
Harry Horsley,
Thomas Heinis,
Artur Yakimovich
Abstract:
Phase imaging is gaining importance due to its applications in fields like biomedical imaging and material characterization. In biomedical applications, it can provide quantitative information missing in label-free microscopy modalities. One of the most prominent methods in phase quantification is the Transport-of-Intensity Equation (TIE). TIE often requires multiple acquisitions at different defo…
▽ More
Phase imaging is gaining importance due to its applications in fields like biomedical imaging and material characterization. In biomedical applications, it can provide quantitative information missing in label-free microscopy modalities. One of the most prominent methods in phase quantification is the Transport-of-Intensity Equation (TIE). TIE often requires multiple acquisitions at different defocus distances, which is not always feasible in a clinical setting. To address this issue, we propose to use chromatic aberrations to induce the required through-focus images with a single exposure, effectively generating a through-focus stack. Since the defocus distance induced by the aberrations is small, conventional TIE solvers are insufficient to address the resulting artifacts. We propose Zero-Mean Diffusion, a modified version of diffusion models designed for quantitative image prediction, and train it with synthetic data to ensure robust phase retrieval. Our contributions offer an alternative TIE approach that leverages chromatic aberrations, achieving accurate single-exposure phase measurement with white light and thus improving the efficiency of phase imaging. Moreover, we present a new class of diffusion models that are well-suited for quantitative data and have a sound theoretical basis. To validate our approach, we employ a widespread brightfield microscope equipped with a commercially available color camera. We apply our model to clinical microscopy of patients' urine, obtaining accurate phase measurements.
△ Less
Submitted 6 June, 2024;
originally announced June 2024.
-
Coding Over Coupon Collector Channels for Combinatorial Motif-Based DNA Storage
Authors:
Roman Sokolovskii,
Parv Agarwal,
Luis Alberto Croquevielle,
Zijian Zhou,
Thomas Heinis
Abstract:
Encoding information in combinations of pre-synthesised deoxyribonucleic acid (DNA) strands (referred to as motifs) is an interesting approach to DNA storage that could potentially circumvent the prohibitive costs of nucleotide-by-nucleotide DNA synthesis. Based on our analysis of an empirical data set from HelixWorks, we propose two channel models for this setup (with and without interference) an…
▽ More
Encoding information in combinations of pre-synthesised deoxyribonucleic acid (DNA) strands (referred to as motifs) is an interesting approach to DNA storage that could potentially circumvent the prohibitive costs of nucleotide-by-nucleotide DNA synthesis. Based on our analysis of an empirical data set from HelixWorks, we propose two channel models for this setup (with and without interference) and analyse their fundamental limits. We propose a coding scheme that approaches those limits by leveraging all information available at the output of the channel, in contrast to earlier schemes developed for a similar setup by Preuss et al. We highlight an important connection between channel capacity curves and the fundamental trade-off between synthesis (writing) and sequencing (reading), and offer a way to mitigate an exponential growth in decoding complexity with the size of the motif library.
△ Less
Submitted 13 June, 2024; v1 submitted 6 June, 2024;
originally announced June 2024.
-
Querying in Constant Time with Learned Indexes
Authors:
Luis Croquevielle,
Guang Yang,
Liang Lian,
Ali Hadian,
Thomas Heinis
Abstract:
Learned indexes leverage machine learning models to accelerate query answering in databases, showing impressive practical performance. However, theoretical understanding of these methods remains incomplete. Existing research suggests that learned indexes have superior asymptotic complexity compared to their non-learned counterparts, but these findings have been established under restrictive probab…
▽ More
Learned indexes leverage machine learning models to accelerate query answering in databases, showing impressive practical performance. However, theoretical understanding of these methods remains incomplete. Existing research suggests that learned indexes have superior asymptotic complexity compared to their non-learned counterparts, but these findings have been established under restrictive probabilistic assumptions. Specifically, for a sorted array with $n$ elements, it has been shown that learned indexes can find a key in $O(\log(\log n))$ expected time using at most linear space, compared with $O(\log n)$ for non-learned methods.
In this work, we prove $O(1)$ expected time can be achieved with at most linear space, thereby establishing the tightest upper bound so far for the time complexity of an asymptotically optimal learned index. Notably, we use weaker probabilistic assumptions than prior work, meaning our results generalize previous efforts. Furthermore, we introduce a new measure of statistical complexity for data. This metric exhibits an information-theoretical interpretation and can be estimated in practice. This characterization provides further theoretical understanding of learned indexes, by hel** to explain why some datasets seem to be particularly challenging for these methods.
△ Less
Submitted 13 June, 2024; v1 submitted 6 May, 2024;
originally announced May 2024.
-
Conditional Variational Diffusion Models
Authors:
Gabriel della Maggiora,
Luis Alberto Croquevielle,
Nikita Deshpande,
Harry Horsley,
Thomas Heinis,
Artur Yakimovich
Abstract:
Inverse problems aim to determine parameters from observations, a crucial task in engineering and science. Lately, generative models, especially diffusion models, have gained popularity in this area for their ability to produce realistic solutions and their good mathematical properties. Despite their success, an important drawback of diffusion models is their sensitivity to the choice of variance…
▽ More
Inverse problems aim to determine parameters from observations, a crucial task in engineering and science. Lately, generative models, especially diffusion models, have gained popularity in this area for their ability to produce realistic solutions and their good mathematical properties. Despite their success, an important drawback of diffusion models is their sensitivity to the choice of variance schedule, which controls the dynamics of the diffusion process. Fine-tuning this schedule for specific applications is crucial but time-costly and does not guarantee an optimal result. We propose a novel approach for learning the schedule as part of the training process. Our method supports probabilistic conditioning on data, provides high-quality solutions, and is flexible, proving able to adapt to different applications with minimum overhead. This approach is tested in two unrelated inverse problems: super-resolution microscopy and quantitative phase imaging, yielding comparable or superior results to previous methods and fine-tuned diffusion models. We conclude that fine-tuning the schedule by experimentation should be avoided because it can be learned during training in a stable way that yields better results.
△ Less
Submitted 26 April, 2024; v1 submitted 4 December, 2023;
originally announced December 2023.
-
When is Approximate Counting for Conjunctive Queries Tractable?
Authors:
Marcelo Arenas,
Luis Alberto Croquevielle,
Rajesh Jayaram,
Cristian Riveros
Abstract:
Conjunctive queries are one of the most common class of queries used in database systems, and the best studied in the literature. A seminal result of Grohe, Schwentick, and Segoufin (STOC 2001) demonstrates that for every class $G$ of graphs, the evaluation of all conjunctive queries whose underlying graph is in $G$ is tractable if, and only if, $G$ has bounded treewidth. In this work, we extend t…
▽ More
Conjunctive queries are one of the most common class of queries used in database systems, and the best studied in the literature. A seminal result of Grohe, Schwentick, and Segoufin (STOC 2001) demonstrates that for every class $G$ of graphs, the evaluation of all conjunctive queries whose underlying graph is in $G$ is tractable if, and only if, $G$ has bounded treewidth. In this work, we extend this characterization to the counting problem for conjunctive queries. Specifically, for every class $C$ of conjunctive queries with bounded treewidth, we introduce the first fully polynomial-time randomized approximation scheme (FPRAS) for counting answers to a query in $C$, and the first polynomial-time algorithm for sampling answers uniformly from a query in $C$. As a corollary, it follows that for every class $G$ of graphs, the counting problem for conjunctive queries whose underlying graph is in $G$ admits an FPRAS if, and only if, $G$ has bounded treewidth (unless $\text{BPP} \neq \text{P}$)}. In fact, our FPRAS is more general, and also applies to conjunctive queries with bounded hypertree width, as well as unions of such queries.
The key ingredient in our proof is the resolution of a fundamental counting problem from automata theory. Specifically, we demonstrate the first FPRAS and polynomial time sampler for the set of trees of size $n$ accepted by a tree automaton, which improves the prior quasi-polynomial time randomized approximation scheme (QPRAS) and sampling algorithm of Gore, Jerrum, Kannan, Sweedyk, and Mahaney '97. We demonstrate how this algorithm can be used to obtain an FPRAS for many hitherto open problems, such as counting solutions to constraint satisfaction problems (CSP) with bounded hypertree-width, counting the number of error threads in programs with nested call subroutines, and counting valid assignments to structured DNNF circuits.
△ Less
Submitted 20 November, 2020; v1 submitted 20 May, 2020;
originally announced May 2020.
-
$\text{#NFA}$ admits an FPRAS: Efficient Enumeration, Counting, and Uniform Generation for Logspace Classes
Authors:
Marcelo Arenas,
Luis Alberto Croquevielle,
Rajesh Jayaram,
Cristian Riveros
Abstract:
In this work, we study two simple yet general complexity classes, based on logspace Turing machines, which provide a unifying framework for efficient query evaluation in areas like information extraction and graph databases, among others. We investigate the complexity of three fundamental algorithmic problems for these classes: enumeration, counting and uniform generation of solutions, and show th…
▽ More
In this work, we study two simple yet general complexity classes, based on logspace Turing machines, which provide a unifying framework for efficient query evaluation in areas like information extraction and graph databases, among others. We investigate the complexity of three fundamental algorithmic problems for these classes: enumeration, counting and uniform generation of solutions, and show that they have several desirable properties in this respect.
Both complexity classes are defined in terms of non-deterministic logspace transducers (NL transducers). For the first class, we consider the case of unambiguous NL transducers, and we prove constant delay enumeration, and both counting and uniform generation of solutions in polynomial time. For the second class, we consider unrestricted NL transducers, and we obtain polynomial delay enumeration, approximate counting in polynomial time, and polynomial-time randomized algorithms for uniform generation. More specifically, we show that each problem in this second class admits a fully polynomial-time randomized approximation scheme (FPRAS) and a polynomial-time Las Vegas algorithm for uniform generation. Interestingly, the key idea to prove these results is to show that the fundamental problem $\text{#NFA}$ admits an FPRAS, where $\text{#NFA}$ is the problem of counting the number of strings of length $n$ (given in unary) accepted by a non-deterministic finite automaton (NFA). While this problem is known to be $\text{#P}$-complete and, more precisely, $\text{SpanL}$-complete, it was open whether this problem admits an FPRAS. In this work, we solve this open problem, and obtain as a welcome corollary that every function in $\text{SpanL}$ admits an FPRAS.
△ Less
Submitted 23 June, 2021; v1 submitted 21 June, 2019;
originally announced June 2019.