Search | arXiv e-print repository

Do stable neural networks exist for classification problems? -- A new view on stability in AI

Abstract: In deep learning (DL) the instability phenomenon is widespread and well documented, most commonly using the classical measure of stability, the Lipschitz constant. While a small Lipschitz constant is traditionally viewed as guaranteeing stability, it does not capture the instability phenomenon in DL for classification well. The reason is that a classification function -- which is the target function to be approximated -- is necessarily discontinuous, thus having an 'infinite' Lipschitz constant. As a result, the classical approach will deem every classification function unstable, yet basic classification functions a la 'is there a cat in the image?' will typically be locally very 'flat' -- and thus locally stable -- except at the decision boundary. The lack of an appropriate measure of stability hinders a rigorous theory for stability in DL, and consequently, there are no proper approximation theoretic results that can guarantee the existence of stable networks for classification functions. In this paper we introduce a novel stability measure $\mathscr{S}(f)$, for any classification function $f$, appropriate to study the stability of discontinuous functions and their approximations. We further prove two approximation theorems: First, for any $ε > 0$ and any classification function $f$ on a compact set, there is a neural network (NN) $ψ$, such that $ψ - f \neq 0$ only on a set of measure $< ε$, moreover, $\mathscr{S}(ψ) \geq \mathscr{S}(f) - ε$ (as accurate and stable as $f$ up to $ε$). Second, for any classification function $f$ and $ε > 0$, there exists a NN $ψ$ such that $ψ = f$ on the set of points that are at least $ε$ away from the decision boundary.

Submitted 15 January, 2024; originally announced January 2024.

arXiv:2312.11425 [pdf, other]

When can you trust feature selection? -- I: A condition-based analysis of LASSO and generalised hardness of approximation

Authors: Alexander Bastounis, Felipe Cucker, Anders C. Hansen

Abstract: The arrival of AI techniques in computations, with the potential for hallucinations and non-robustness, has made trustworthiness of algorithms a focal point. However, trustworthiness of the many classical approaches is not well understood. This is the case for feature selection, a classical problem in the sciences, statistics, machine learning etc. Here, the LASSO optimisation problem is standard. Despite its widespread use, it has not been established when the output of algorithms attempting to compute support sets of minimisers of LASSO in order to do feature selection can be trusted. In this paper we establish how no (randomised) algorithm that works on all inputs can determine the correct support sets (with probability $> 1/2$) of minimisers of LASSO when reading approximate input, regardless of precision and computing power. However, we define a LASSO condition number and design an efficient algorithm for computing these support sets provided the input data is well-posed (has finite condition number) in time polynomial in the dimensions and logarithm of the condition number. For ill-posed inputs the algorithm runs forever, hence, it will never produce a wrong answer. Furthermore, the algorithm computes an upper bound for the condition number when this is finite. Finally, for any algorithm defined on an open set containing a point with infinite condition number, there is an input for which the algorithm will either run forever or produce a wrong answer. Our impossibility results stem from generalised hardness of approximation -- within the Solvability Complexity Index (SCI) hierarchy framework -- that generalises the classical phenomenon of hardness of approximation.

Submitted 18 December, 2023; originally announced December 2023.

Comments: 24 pages, 1 figure

arXiv:2309.07072 [pdf, ps, other]

The Boundaries of Verifiable Accuracy, Robustness, and Generalisation in Deep Learning

Authors: Alexander Bastounis, Alexander N. Gorban, Anders C. Hansen, Desmond J. Higham, Danil Prokhorov, Oliver Sutton, Ivan Y. Tyukin, Qinghua Zhou

Abstract: In this work, we assess the theoretical limitations of determining guaranteed stability and accuracy of neural networks in classification tasks. We consider the classical distribution-agnostic framework and algorithms minimising empirical risks and potentially subjected to some weight regularisation. We show that there is a large family of tasks for which computing and verifying ideal stable and accurate neural networks in the above settings is extremely challenging, if at all possible, even when such ideal solutions exist within the given class of neural architectures.

Submitted 13 September, 2023; originally announced September 2023.

MSC Class: 68T07; 68T05

arXiv:2307.07410 [pdf, other]

Implicit regularization in AI meets generalized hardness of approximation in optimization -- Sharp results for diagonal linear networks

Authors: Johan S. Wind, Vegard Antun, Anders C. Hansen

Abstract: Understanding the implicit regularization imposed by neural network architectures and gradient based optimization methods is a key challenge in deep learning and AI. In this work we provide sharp results for the implicit regularization imposed by the gradient flow of Diagonal Linear Networks (DLNs) in the over-parameterized regression setting and, potentially surprisingly, link this to the phenomenon of phase transitions in generalized hardness of approximation (GHA). GHA generalizes the phenomenon of hardness of approximation from computer science to, among others, continuous and robust optimization. It is well-known that the $\ell^1$-norm of the gradient flow of DLNs with tiny initialization converges to the objective function of basis pursuit. We improve upon these results by showing that the gradient flow of DLNs with tiny initialization approximates minimizers of the basis pursuit optimization problem (as opposed to just the objective function), and we obtain new and sharp convergence bounds w.r.t. the initialization size. Non-sharpness of our results would imply that the GHA phenomenon would not occur for the basis pursuit optimization problem -- which is a contradiction -- thus implying sharpness. Moreover, we characterize which $\ell^1$ minimizer of the basis pursuit problem is chosen by the gradient flow whenever the minimizer is not unique. Interestingly, this depends on the depth of the DLN.

Submitted 13 July, 2023; originally announced July 2023.

MSC Class: 90C25; 68T07; 90C17 (Primary) 15A29; 94A08; 46N10 (Secondary)

arXiv:2109.06098 [pdf, ps, other]

The mathematics of adversarial attacks in AI -- Why deep learning is unstable despite the existence of stable neural networks

Authors: Alexander Bastounis, Anders C Hansen, Verner Vlačić

Abstract: The unprecedented success of deep learning (DL) makes it unchallenged when it comes to classification problems. However, it is well established that the current DL methodology produces universally unstable neural networks (NNs). The instability problem has caused an enormous research effort -- with a vast literature on so-called adversarial attacks -- yet there has been no solution to the problem. Our paper addresses why there has been no solution to the problem, as we prove the following mathematical paradox: any training procedure based on training neural networks for classification problems with a fixed architecture will yield neural networks that are either inaccurate or unstable (if accurate) -- despite the provable existence of both accurate and stable neural networks for the same classification problems. The key is that the stable and accurate neural networks must have variable dimensions depending on the input; in particular, variable dimensions are a necessary condition for stability. Our result points towards the paradox that accurate and stable neural networks exist, however, modern algorithms do not compute them. This yields the question: if the existence of neural networks with desirable properties can be proven, can one also find algorithms that compute them? There are cases in mathematics where provable existence implies computability, but will this be the case for neural networks? The contrary is true, as we demonstrate how neural networks can provably exist as approximate minimisers to standard optimisation problems with standard cost functions, however, no randomised algorithm can compute them with probability better than 1/2.

Submitted 13 September, 2021; originally announced September 2021.

Comments: 29 pages, 1 figure

arXiv:2101.08286 [pdf, other]

doi 10.1073/pnas.2107151119

Can stable and accurate neural networks be computed? -- On the barriers of deep learning and Smale's 18th problem

Authors: Matthew J. Colbrook, Vegard Antun, Anders C. Hansen

Abstract: Deep learning (DL) has had unprecedented success and is now entering scientific computing with full force. However, current DL methods typically suffer from instability, even when universal approximation properties guarantee the existence of stable neural networks (NNs). We address this paradox by demonstrating basic well-conditioned problems in scientific computing where one can prove the existence of NNs with great approximation qualities, however, there does not exist any algorithm, even randomised, that can train (or compute) such a NN. For any positive integers $K > 2$ and $L$, there are cases where simultaneously: (a) no randomised training algorithm can compute a NN correct to $K$ digits with probability greater than $1/2$, (b) there exists a deterministic training algorithm that computes a NN with $K-1$ correct digits, but any such (even randomised) algorithm needs arbitrarily many training data, (c) there exists a deterministic training algorithm that computes a NN with $K-2$ correct digits using no more than $L$ training samples. These results imply a classification theory describing conditions under which (stable) NNs with a given accuracy can be computed by an algorithm. We begin this theory by establishing sufficient conditions for the existence of algorithms that compute stable NNs in inverse problems. We introduce Fast Iterative REstarted NETworks (FIRENETs), which we both prove and numerically verify are stable. Moreover, we prove that only $\mathcal{O}(|\log(ε)|)$ layers are needed for an $ε$-accurate solution to the inverse problem.

Submitted 15 April, 2021; v1 submitted 20 January, 2021; originally announced January 2021.

Comments: 14 pages + SI Appendix

Journal ref: Proc. Natl. Acad. Sci. USA, 2022

arXiv:2001.01258 [pdf, other]

The troublesome kernel -- On hallucinations, no free lunches and the accuracy-stability trade-off in inverse problems

Authors: Nina M. Gottschling, Vegard Antun, Anders C. Hansen, Ben Adcock

Abstract: Methods inspired by Artificial Intelligence (AI) are starting to fundamentally change computational science and engineering through breakthrough performances on challenging problems. However, reliability and trustworthiness of such techniques are a major concern. In inverse problems in imaging, the focus of this paper, there is increasing empirical evidence that methods may suffer from hallucinations, i.e., false, but realistic-looking artifacts; instability, i.e., sensitivity to perturbations in the data; and unpredictable generalization, i.e., excellent performance on some images, but significant deterioration on others. This paper provides a theoretical foundation for these phenomena. We give mathematical explanations for how and when such effects arise in arbitrary reconstruction methods, with several of our results taking the form of 'no free lunch' theorems. Specifically, we show that (i) methods that overperform on a single image can wrongly transfer details from one image to another, creating a hallucination, (ii) methods that overperform on two or more images can hallucinate or be unstable, (iii) optimizing the accuracy-stability trade-off is generally difficult, (iv) hallucinations and instabilities, if they occur, are not rare events, and may be encouraged by standard training, (v) it may be impossible to construct optimal reconstruction maps for certain problems. Our results trace these effects to the kernel of the forward operator whenever it is nontrivial, but also apply to the case when the forward operator is ill-conditioned. Based on these insights, our work aims to spur research into new ways to develop robust and reliable AI-based methods for inverse problems in imaging.

Submitted 18 June, 2024; v1 submitted 5 January, 2020; originally announced January 2020.

MSC Class: 65R32; 94A08; 68T05; 65M12

arXiv:1909.01143 [pdf, ps, other]

doi 10.1007/s00041-021-09813-6

Non-uniform recovery guarantees for binary measurements and infinite-dimensional compressed sensing

Authors: Laura Thesing, Anders Christian Hansen

Abstract: Due to the many applications in Magnetic Resonance Imaging (MRI), Nuclear Magnetic Resonance (NMR), radio interferometry, helium atom scattering etc., the theory of compressed sensing with Fourier transform measurements has reached a mature level. However, for binary measurements via the Walsh transform, the theory has been almost non-existent, despite the large number of applications such as fluorescence microscopy, single pixel cameras, lensless cameras, compressive holography, laser-based failure-analysis etc. Binary measurements are a mainstay in signal and image processing and can be modelled by the Walsh transform and Walsh series that are binary cousins of the respective Fourier counterparts. We help bridge the theoretical gap by providing non-uniform recovery guarantees for infinite-dimensional compressed sensing with Walsh samples and wavelet reconstruction. The theoretical results demonstrate that compressed sensing with Walsh samples, as long as the sampling strategy is highly structured and follows the structured sparsity of the signal, is as effective as in the Fourier case. However, there is a fundamental difference in the asymptotic results when the smoothness and vanishing moments of the wavelet increase. In the Fourier case, this changes the optimal sampling patterns, whereas this is not the case in the Walsh setting.

Submitted 30 March, 2021; v1 submitted 3 September, 2019; originally announced September 2019.

MSC Class: 42C10; 42C40; 94A12; 94A20

Journal ref: Journal of Fourier Analysis and Applications 27.2 (2021): 1-44

arXiv:1908.00185 [pdf, ps, other]

doi 10.1016/j.acha.2018.08.004

On the stable sampling rate for binary measurements and wavelet reconstruction

Authors: Anders Christian Hansen, Laura Thesing

Abstract: This paper is concerned with the problem of reconstructing an infinite-dimensional signal from a limited number of linear measurements. In particular, we show that for binary measurements (modelled with Walsh functions and Hadamard matrices) and wavelet reconstruction the stable sampling rate is linear. This implies that binary measurements are as efficient as Fourier samples when using wavelets as the reconstruction space. Powerful techniques for reconstructions include generalized sampling and its compressed versions, as well as recent methods based on data assimilation. Common to these methods is that the reconstruction quality depends highly on the subspace angle between the sampling and the reconstruction space, which is dictated by the stable sampling rate. As a result of the theory provided in this paper, these methods can now easily use binary measurements and wavelet reconstruction bases.

Submitted 29 July, 2019; originally announced August 2019.

Journal ref: Applied and Computational Harmonic Analysis, 2018

arXiv:1907.12286 [pdf, other]

Linear reconstructions and the analysis of the stable sampling rate

Authors: Laura Thesing, Anders Christian Hansen

Abstract: The theory of sampling and the reconstruction of data has a wide range of applications and a rich collection of techniques. For many methods a core problem is the estimation of the number of samples needed in order to secure a stable and accurate reconstruction. This can often be controlled by the Stable Sampling Rate (SSR). In this paper we discuss the SSR and how it is crucial for two key linear methods in sampling theory: generalized sampling and the recently developed Parametrized Background Data Weak (PBDW) method. Both of these approaches rely on estimates of the SSR in order to be accurate. In many areas of signal and image processing binary samples are crucial and such samples, which can be modelled by Walsh functions, are the core of our analysis. As we show, the SSR is linear when considering binary sampling with Walsh functions and wavelet reconstruction. Moreover, for certain wavelets it is possible to determine the SSR exactly, allowing sharp estimates for the performance of the methods.

Submitted 29 July, 2019; originally announced July 2019.

Journal ref: Sampling Theory in Signal and Image Processing 17, 103-126, 2018

arXiv:1906.01478 [pdf, other]

What do AI algorithms actually learn? - On false structures in deep learning

Authors: Laura Thesing, Vegard Antun, Anders C. Hansen

Abstract: There are two big unsolved mathematical questions in artificial intelligence (AI): (1) Why is deep learning so successful in classification problems and (2) why are neural nets based on deep learning at the same time universally unstable, where the instabilities make the networks vulnerable to adversarial attacks. We present a solution to these questions that can be summed up in two words: false structures. Indeed, deep learning does not learn the original structures that humans use when recognising images (cats have whiskers, paws, fur, pointy ears, etc.), but rather different false structures that correlate with the original structure and hence yield the success. However, the false structure, unlike the original structure, is unstable. The false structure is simpler than the original structure, hence easier to learn with less data and the numerical algorithm used in the training will more easily converge to the neural network that captures the false structure. We formally define the concept of false structures and formulate the solution as a conjecture. Given that trained neural networks always are computed with approximations, this conjecture can only be established through a combination of theoretical and computational results similar to how one establishes a postulate in theoretical physics (e.g. the speed of light is constant). Establishing the conjecture fully will require a vast research program characterising the false structures. We provide the foundations for such a program establishing the existence of the false structures in practice. Finally, we discuss the far-reaching consequences the existence of the false structures has on state-of-the-art AI and Smale's 18th problem.

Submitted 4 June, 2019; originally announced June 2019.

arXiv:1905.00126 [pdf, other]

doi 10.1016/j.acha.2021.04.001

Uniform recovery in infinite-dimensional compressed sensing and applications to structured binary sampling

Authors: Ben Adcock, Vegard Antun, Anders C. Hansen

Abstract: Infinite-dimensional compressed sensing deals with the recovery of analog signals (functions) from linear measurements, often in the form of integral transforms such as the Fourier transform. This framework is well-suited to many real-world inverse problems, which are typically modelled in infinite-dimensional spaces, and where the application of finite-dimensional approaches can lead to noticeable artefacts. Another typical feature of such problems is that the signals are not only sparse in some dictionary, but possess a so-called local sparsity in levels structure. Consequently, the sampling scheme should be designed so as to exploit this additional structure. In this paper, we introduce a series of uniform recovery guarantees for infinite-dimensional compressed sensing based on sparsity in levels and so-called multilevel random subsampling. By using a weighted $\ell^1$-regularizer we derive measurement conditions that are sharp up to log factors, in the sense they agree with those of certain oracle estimators. These guarantees also apply in finite dimensions, and improve existing results for unweighted $\ell^1$-regularization. To illustrate our results, we consider the problem of binary sampling with the Walsh transform using orthogonal wavelets. Binary sampling is an important mechanism for certain imaging modalities. Through carefully estimating the local coherence between the Walsh and wavelet bases, we derive the first known recovery guarantees for this problem.

Submitted 30 April, 2019; originally announced May 2019.

MSC Class: 94A20; 42C40; 42C10; 15B52

Journal ref: Appl. Comput. Harmon. Anal. 55 (2021) 1-40

arXiv:1902.05300 [pdf, other]

doi 10.1073/pnas.1907377117

On instabilities of deep learning in image reconstruction - Does AI come at a cost?

Authors: Vegard Antun, Francesco Renna, Clarice Poon, Ben Adcock, Anders C. Hansen

Abstract: Deep learning, due to its unprecedented success in tasks such as image classification, has emerged as a new tool in image reconstruction with potential to change the field. In this paper we demonstrate a crucial phenomenon: deep learning typically yields unstable methods for image reconstruction. The instabilities usually occur in several forms: (1) tiny, almost undetectable perturbations, both in the image and sampling domain, may result in severe artefacts in the reconstruction, (2) a small structural change, for example a tumour, may not be captured in the reconstructed image and (3) (a counterintuitive type of instability) more samples may yield poorer performance. Our new stability test with algorithms and easy-to-use software detects the instability phenomena. The test is aimed at researchers to test their networks for instabilities and for government agencies, such as the Food and Drug Administration (FDA), to secure safe use of deep learning methods.

Submitted 14 February, 2019; originally announced February 2019.

Journal ref: Proc. Natl. Acad. Sci. USA, 2020

arXiv:1508.03280 [pdf, other]

Computing Spectra -- On the Solvability Complexity Index Hierarchy and Towers of Algorithms

Authors: Jonathan Ben-Artzi, Matthew J. Colbrook, Anders C. Hansen, Olavi Nevanlinna, Markus Seidel

Abstract: This paper establishes some of the fundamental barriers in the theory of computations and finally settles the long-standing computational spectral problem. That is to determine the existence of algorithms that can compute spectra $\mathrm{sp}(A)$ of classes of bounded operators $A = \{a_{ij}\}_{i,j \in \mathbb{N}} \in \mathcal{B}(l^2(\mathbb{N}))$, given the matrix elements $\{a_{ij}\}_{i,j \in \mathbb{N}}$, that are sharp in the sense that they achieve the boundary of what a digital computer can achieve. Similarly, for a Schrödinger operator $H = -Δ + V$, determine the existence of algorithms that can compute the spectrum $\mathrm{sp}(H)$ given point samples of the potential function $V$. In order to solve these problems, we establish the Solvability Complexity Index (SCI) hierarchy and provide a collection of new algorithms that allow for problems that were previously out of reach. The SCI is the smallest number of limits needed in the computation, yielding a classification hierarchy for all types of problems in computational mathematics that determines the boundaries of what computers can achieve in scientific computing. In addition, the SCI hierarchy provides classifications of computational problems that can be used in computer-assisted proofs. The SCI hierarchy captures many key computational issues in the history of mathematics including the insolvability of the quintic, Smale's problem on the existence of iterative generally convergent algorithms for polynomial root finding, the computational spectral problem, inverse problems, optimisation etc.

Submitted 15 June, 2020; v1 submitted 13 August, 2015; originally announced August 2015.

MSC Class: 47A10 (primary); 81Q10; 34L16; 46N40 (secondary)

arXiv:1411.4449 [pdf, other]

On the absence of the RIP in real-world applications of compressed sensing and the RIP in levels

Authors: Alexander Bastounis, Anders C. Hansen

Abstract: The purpose of this paper is twofold. The first is to point out that the Restricted Isometry Property (RIP) does not hold in many applications where compressed sensing is successfully used. This includes fields like Magnetic Resonance Imaging (MRI), Computerized Tomography, Electron Microscopy, Radio Interferometry and Fluorescence Microscopy. We demonstrate that for natural compressed sensing matrices involving a level-based reconstruction basis (e.g. wavelets), the number of measurements required to recover all $s$-sparse signals for reasonable $s$ is excessive. In particular, uniform recovery of all $s$-sparse signals is quite unrealistic. This realisation shows that the RIP is insufficient for explaining the success of compressed sensing in various practical applications. The second purpose of the paper is to introduce a new framework based on a generalised RIP-like definition that fits the applications where compressed sensing is used. We show that the shortcomings that make uniform recovery unreasonable no longer apply if we instead ask for structured recovery that is uniform only within each of the levels. To examine this phenomenon, a new tool, termed the 'Restricted Isometry Property in Levels', is described and analysed. Furthermore, we show that with certain conditions on the Restricted Isometry Property in Levels, a form of uniform recovery within each level is possible. Finally, we conclude the paper by providing examples that demonstrate the optimality of the results obtained.

Submitted 16 October, 2015; v1 submitted 17 November, 2014; originally announced November 2014.

arXiv:1403.6540 [pdf, other]

The quest for optimal sampling: Computationally efficient, structure-exploiting measurements for compressed sensing

Authors: Ben Adcock, Anders C. Hansen, Bogdan Roman

Abstract: An intriguing phenomenon in many instances of compressed sensing is that the reconstruction quality is governed not just by the overall sparsity of the signal, but also by its structure. This paper is about understanding this phenomenon, and demonstrating how it can be fruitfully exploited by the design of suitable sampling strategies in order to outperform more standard compressed sensing techniques based on random matrices.

Submitted 27 March, 2014; v1 submitted 25 March, 2014; originally announced March 2014.

Comments: 23 pages, 7 figures, to appear in Springer book "Compressed Sensing and Its Applications", 2014

arXiv:1402.5324 [pdf, other]

On Asymptotic Incoherence and its Implications for Compressed Sensing of Inverse Problems

Authors: Alex D. Jones, Ben Adcock, Anders C. Hansen

Abstract: Recently, it has been shown that incoherence is an unrealistic assumption for compressed sensing when applied to many inverse problems. Instead, the key property that permits efficient recovery in such problems is so-called local incoherence. Similarly, the standard notion of sparsity is also inadequate for many real world problems. In particular, in many applications, the optimal sampling strategy depends on asymptotic incoherence and the signal sparsity structure. The purpose of this paper is to study asymptotic incoherence and its implications towards the design of optimal sampling strategies and efficient sparsity bases. It is determined how fast asymptotic incoherence can decay in general for isometries. Furthermore it is shown that Fourier sampling and wavelet sparsity, whilst globally coherent, yield optimal asymptotic incoherence as a power law up to a constant factor. Sharp bounds on the asymptotic incoherence for Fourier sampling with polynomial bases are also provided. A numerical experiment is also presented to demonstrate the role of asymptotic incoherence in finding good subsampling strategies.

Submitted 7 July, 2015; v1 submitted 21 February, 2014; originally announced February 2014.

arXiv:1302.0561 [pdf, other]

Breaking the coherence barrier: A new theory for compressed sensing

Authors: Ben Adcock, Anders C. Hansen, Clarice Poon, Bogdan Roman

Abstract: This paper provides an extension of compressed sensing which bridges a substantial gap between existing theory and its current use in real-world applications. It introduces a mathematical framework that generalizes the three standard pillars of compressed sensing - namely, sparsity, incoherence and uniform random subsampling - to three new concepts: asymptotic sparsity, asymptotic incoherence and multilevel random sampling. The new theorems show that compressed sensing is also possible, and reveal several advantages, under these substantially relaxed conditions. The importance of this is threefold. First, inverse problems to which compressed sensing is currently applied are typically coherent. The new theory provides the first comprehensive mathematical explanation for a range of empirical usages of compressed sensing in real-world applications, such as medical imaging, microscopy, spectroscopy and others. Second, in showing that compressed sensing does not require incoherence, but instead that asymptotic incoherence is sufficient, the new theory offers markedly greater flexibility in the design of sensing mechanisms. Third, by using asymptotic incoherence and multi-level sampling to exploit not just sparsity, but also structure, i.e. asymptotic sparsity, the new theory shows that substantially improved reconstructions can be obtained from fewer measurements.

Submitted 23 June, 2014; v1 submitted 3 February, 2013; originally announced February 2013.

arXiv:1301.2831 [pdf, other]

Beyond consistent reconstructions: optimality and sharp bounds for generalized sampling, and application to the uniform resampling problem

Authors: Ben Adcock, Anders C. Hansen, Clarice Poon

Abstract: Generalized sampling is a recently developed linear framework for sampling and reconstruction in separable Hilbert spaces. It allows one to recover any element in any finite-dimensional subspace given finitely many of its samples with respect to an arbitrary frame. Unlike more common approaches for this problem, such as the consistent reconstruction technique of Eldar et al., it leads to completely stable numerical methods possessing both guaranteed stability and accuracy. The purpose of this paper is twofold. First, we give a complete and formal analysis of generalized sampling, the main result of which is the derivation of new, sharp bounds for the accuracy and stability of this approach. Such bounds improve those given previously, and result in a necessary and sufficient condition, the stable sampling rate, which guarantees a priori a good reconstruction. Second, we address the topic of optimality. Under some assumptions, we show that generalized sampling is an optimal, stable reconstruction. Correspondingly, whenever these assumptions hold, the stable sampling rate is a universal quantity. In the final part of the paper we illustrate our results by applying generalized sampling to the so-called uniform resampling problem.

Submitted 13 January, 2013; originally announced January 2013.

arXiv:1208.5959 [pdf, other]

On optimal wavelet reconstructions from Fourier samples: linearity and universality of the stable sampling rate

Authors: Ben Adcock, Anders C. Hansen, Clarice Poon

Abstract: In this paper we study the problem of computing wavelet coefficients of compactly supported functions from their Fourier samples. For this, we use the recently introduced framework of generalized sampling. Our first result demonstrates that using generalized sampling one obtains a stable and accurate reconstruction, provided the number of Fourier samples grows linearly in the number of wavelet coefficients recovered. For the class of Daubechies wavelets we derive the exact constant of proportionality. Our second result concerns the optimality of generalized sampling for this problem. Under some mild assumptions we show that generalized sampling cannot be outperformed in terms of approximation quality by more than a constant factor. Moreover, for the class of so-called perfect methods, any attempt to lower the sampling ratio below a certain critical threshold necessarily results in exponential ill-conditioning. Thus generalized sampling provides a nearly-optimal solution to this problem.

Submitted 12 May, 2013; v1 submitted 29 August, 2012; originally announced August 2012.

MSC Class: 94A20; 42C40; 65T60; 41A65; 46C05

arXiv:1007.1852 [pdf, ps, other]

A Generalized Sampling Theorem for Stable Reconstructions in Arbitrary Bases

Authors: Ben Adcock, Anders C. Hansen

Abstract: We introduce a generalized framework for sampling and reconstruction in separable Hilbert spaces. Specifically, we establish that it is always possible to stably reconstruct a vector in an arbitrary Riesz basis from sufficiently many of its samples in any other Riesz basis. This framework can be viewed as an extension of that of Eldar et al. However, whilst the latter imposes stringent assumptions on the reconstruction basis, and may in practice be unstable, our framework allows for recovery in any (Riesz) basis in a manner that is completely stable. Whilst the classical Shannon Sampling Theorem is a special case of our theorem, this framework allows us to exploit additional information about the approximated vector (or, in this case, function), for example sparsity or regularity, to design a reconstruction basis that is better suited. Examples are presented illustrating this procedure.

Submitted 30 November, 2010; v1 submitted 12 July, 2010; originally announced July 2010.

Showing 1–21 of 21 results for author: Hansen, A C