-
Ambitious Data Science Can Be Painless
Authors:
Hatef Monajemi,
Riccardo Murri,
Eric Jonas,
Percy Liang,
Victoria Stodden,
David L. Donoho
Abstract:
Modern data science research can involve massive computational experimentation; an ambitious PhD in computational fields may do experiments consuming several million CPU hours. Traditional computing practices, in which researchers use laptops or shared campus-resident resources, are inadequate for experiments at the massive scale and varied scope that we now see in data science. On the other hand, modern cloud computing promises seemingly unlimited computational resources that can be custom configured, and seems to offer a powerful new venue for ambitious data-driven science. When the cloud is fully exploited, the amount of work that can be completed in a fixed amount of time can expand by several orders of magnitude.
As potentially powerful as cloud-based experimentation may be in the abstract, it has not yet become a standard option for researchers in many academic disciplines. The prospect of actually conducting massive computational experiments in today's cloud systems confronts the potential user with daunting challenges. Leading considerations include: (i) the seeming complexity of today's cloud computing interface, (ii) the difficulty of executing an overwhelmingly large number of jobs, and (iii) the difficulty of monitoring and combining a massive collection of separate results. Starting a massive experiment 'bare-handed' seems therefore highly problematic and prone to rapid 'researcher burn out'.
New software stacks are emerging that render massive cloud experiments relatively painless. Such stacks simplify experimentation by systematizing experiment definition, automating distribution and management of tasks, and allowing easy harvesting of results and documentation. In this article, we discuss several painless computing stacks that abstract away the difficulties of massive experimentation, thereby allowing a proliferation of ambitious experiments for scientific discovery.
Submitted 24 January, 2019;
originally announced January 2019.
-
Neural Proximal Gradient Descent for Compressive Imaging
Authors:
Morteza Mardani,
Qingyun Sun,
Shreyas Vasanawala,
Vardan Papyan,
Hatef Monajemi,
John Pauly,
David Donoho
Abstract:
Recovering high-resolution images from limited sensory data typically leads to a serious ill-posed inverse problem, demanding inversion algorithms that effectively capture the prior information. Learning a good inverse mapping from training data faces severe challenges, including: (i) scarcity of training data; (ii) need for plausible reconstructions that are physically feasible; (iii) need for fast reconstruction, especially in real-time applications. We develop a successful system solving all these challenges, using as the basic architecture the recurrent application of the proximal gradient algorithm. We learn a proximal map that works well with real images based on residual networks. Contraction of the resulting map is analyzed, and incoherence conditions are investigated that drive the convergence of the iterates. Extensive experiments are carried out under different settings: (a) reconstructing abdominal MRI of pediatric patients from highly undersampled Fourier-space data and (b) superresolving natural face images. Our key findings include: (1) A recurrent ResNet with a single residual block unrolled from an iterative algorithm yields an effective proximal that accurately reveals MR image details. (2) Our architecture significantly outperforms conventional non-recurrent deep ResNets by 2dB SNR; it is also trained much more rapidly. (3) It outperforms state-of-the-art compressed-sensing wavelet-based methods by 4dB SNR, with 100x speedups in reconstruction time.
Submitted 1 June, 2018;
originally announced June 2018.
-
Recurrent Generative Adversarial Networks for Proximal Learning and Automated Compressive Image Recovery
Authors:
Morteza Mardani,
Hatef Monajemi,
Vardan Papyan,
Shreyas Vasanawala,
David Donoho,
John Pauly
Abstract:
Recovering images from undersampled linear measurements typically leads to an ill-posed linear inverse problem that calls for proper statistical priors. Building effective priors is, however, challenged by the low train and test overhead dictated by real-time tasks, and by the need for retrieving visually "plausible" and physically "feasible" images with minimal hallucination. To cope with these challenges, we design a cascaded network architecture that unrolls the proximal gradient iterations by permeating benefits from generative residual networks (ResNet) to modeling the proximal operator. A mixture of pixel-wise and perceptual costs is then deployed to train proximals. The overall architecture resembles back-and-forth projection onto the intersection of feasible and plausible images. Extensive computational experiments are conducted for a global task of reconstructing MR images of pediatric patients and a more local task of superresolving CelebA faces; these offer insight into designing efficient architectures. Our observations indicate that for MRI reconstruction, a recurrent ResNet with a single residual block effectively learns the proximal. This simple architecture appears to significantly outperform the alternative deep ResNet architecture by 2dB SNR, and the conventional compressed-sensing MRI by 4dB SNR with 100x faster inference. For image superresolution, our preliminary results indicate that modeling the denoising proximal demands deep ResNets.
Submitted 27 November, 2017;
originally announced November 2017.
-
Nonparametric estimation of galaxy cluster's emissivity and point source detection in astrophysics with two lasso penalties
Authors:
Jairo Diaz-Rodriguez,
Dominique Eckert,
Hatef Monajemi,
Stéphane Paltani,
Sylvain Sardy
Abstract:
Astrophysicists are interested in recovering the 3D gas emissivity of a galaxy cluster from a 2D image taken by a telescope. A blurring phenomenon and the presence of point sources make this inverse problem even harder to solve. The current state-of-the-art technique is two-step: first identify the locations of potential point sources, then mask these locations and deproject the data.
We instead model the data as a Poisson generalized linear model (involving blurring, Abel and wavelet operators) regularized by two lasso penalties to induce a sparse wavelet representation and sparse point sources. The amount of sparsity is controlled by two quantile universal thresholds. As a result, our method outperforms the existing one.
Submitted 2 March, 2017;
originally announced March 2017.
-
Sparsity/Undersampling Tradeoffs in Anisotropic Undersampling, with Applications in MR Imaging/Spectroscopy
Authors:
Hatef Monajemi,
David L. Donoho
Abstract:
We study anisotropic undersampling schemes like those used in multi-dimensional NMR spectroscopy and MR imaging, which sample exhaustively in certain time dimensions and randomly in others.
Our analysis shows that anisotropic undersampling schemes are equivalent to certain block-diagonal measurement systems. We develop novel exact formulas for the sparsity/undersampling tradeoffs in such measurement systems. Our formulas predict finite-N phase transition behavior differing substantially from the well known asymptotic phase transitions for classical Gaussian undersampling. Extensive empirical work shows that our formulas accurately describe observed finite-N behavior, while the usual formulas based on universality are substantially inaccurate.
We also vary the anisotropy, keeping the total number of samples fixed, and for each variation we determine the precise sparsity/undersampling tradeoff (phase transition). We show that, other things being equal, the ability to recover a sparse object decreases with an increasing number of exhaustively-sampled dimensions.
Submitted 16 March, 2018; v1 submitted 9 February, 2017;
originally announced February 2017.
-
Incoherence of Partial-Component Sampling in multidimensional NMR
Authors:
Hatef Monajemi,
David L. Donoho,
Jeffrey C. Hoch,
Adam D. Schuyler
Abstract:
In NMR spectroscopy, undersampling in the indirect dimensions causes reconstruction artifacts whose size can be bounded using the so-called coherence. In experiments with multiple indirect dimensions, new undersampling approaches were recently proposed: random phase detection (RPD) [Maciejewski11] and its generalization, partial component sampling (PCS) [Schuyler13]. The new approaches are fully aware of the fact that high-dimensional experiments generate hypercomplex-valued free induction decays; they randomly acquire only certain low-dimensional components of each high-dimensional hypercomplex entry. We provide a classification of various hypercomplex-aware undersampling schemes, and define a hypercomplex-aware coherence appropriate for such undersampling schemes; we then use it to quantify undersampling artifacts of RPD and various PCS schemes.
Submitted 6 February, 2017;
originally announced February 2017.
-
Threshold Selection for Total Variation Denoising
Authors:
Sylvain Sardy,
Hatef Monajemi
Abstract:
Total variation (TV) denoising is a nonparametric smoothing method that has good properties for preserving sharp edges and contours in objects with spatial structures like natural images. The estimate is sparse in the sense that TV reconstruction leads to a piecewise constant function with a small number of jumps. A threshold parameter controls the number of jumps and the quality of the estimation. In practice, this threshold is often selected by minimizing a goodness-of-fit criterion like cross-validation, which can be costly as it requires solving the high-dimensional and non-differentiable TV optimization problem many times. We propose instead a two-step adaptive procedure via a connection to large deviation of stochastic processes. We also give conditions under which TV denoising achieves exact segmentation. We then apply our procedure to denoise a collection of 1D and 2D test signals, verifying the effectiveness of our approach in practice.
Submitted 4 May, 2016;
originally announced May 2016.
-
An Improved Data Assimilation Scheme for High Dimensional Nonlinear Systems
Authors:
Hatef Monajemi,
Peter K. Kitanidis
Abstract:
Nonlinear/non-Gaussian filtering has broad applications in many areas of life sciences where the dynamics are nonlinear and/or the probability density function of the uncertain state is non-Gaussian. In such problems, the accuracy of the estimated quantities depends highly upon how accurately their posterior pdf can be approximated. In low dimensional state spaces, methods based on Sequential Importance Sampling (SIS) can suitably approximate the posterior pdf. For higher dimensional problems, however, these techniques are usually inappropriate since the required number of particles to achieve satisfactory estimates grows exponentially with the dimension of state space. On the other hand, the ensemble Kalman filter (EnKF) and its variants are more suitable for large-scale problems due to the transformation of particles in the Bayesian update step. It has been shown that the latter class of methods may lead to suboptimal solutions for strongly nonlinear problems due to the Gaussian assumption in the update step. In this paper, we introduce a new technique based on the Gaussian sum expansion which captures the non-Gaussian features more accurately while the required computational effort remains within reason for high dimensional problems. We demonstrate the performance of the method for non-Gaussian processes through several examples including the strongly nonlinear Lorenz models. Results show a remarkable improvement in the mean square error compared to EnKF, and a desirable convergence behavior as the number of particles increases.
Submitted 31 July, 2012;
originally announced August 2012.