-
A divergence-based condition to ensure quantile improvement in black-box global optimization
Authors:
Thomas Guilmeau,
Emilie Chouzenoux,
Víctor Elvira
Abstract:
Black-box global optimization aims at minimizing an objective function whose analytical form is not known. To do so, many state-of-the-art methods rely on sampling-based strategies, where sampling distributions are built in an iterative fashion, so that their mass concentrates where the objective function is low. Despite empirical success, the theoretical study of these methods remains difficult. In this work, we introduce a new framework, based on divergence-decrease conditions, to study and design black-box global optimization algorithms. Our approach allows one to establish and quantify the improvement of proposals at each iteration, in terms of expected value or quantile of the objective. We show that the information-geometric optimization approach fits within our framework, yielding a new approach for its analysis. We also establish proposal improvement results for two novel algorithms, one related to the cross-entropy approach with mixture models, and another one using heavy-tailed sampling proposal distributions.
Submitted 19 June, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
A Novel Variational Approach for Multiphoton Microscopy Image Restoration: from PSF Estimation to 3D Deconvolution
Authors:
Julien Ajdenbaum,
Emilie Chouzenoux,
Claire Lefort,
Ségolène Martin,
Jean-Christophe Pesquet
Abstract:
In multi-photon microscopy (MPM), a recent in-vivo fluorescence microscopy system, the task of image restoration can be decomposed into two interlinked inverse problems: firstly, the characterization of the Point Spread Function (PSF) and subsequently, the deconvolution (i.e., deblurring) to remove the PSF effect and reduce noise. The acquired MPM image quality is critically affected by PSF blurring and intense noise. The PSF in MPM is highly spread in 3D and is not well characterized, presenting high variability with respect to the observed objects. This makes the restoration of MPM images challenging. Common PSF estimation methods in fluorescence microscopy, including MPM, involve capturing images of sub-resolution beads, followed by quantifying the resulting ellipsoidal 3D spot. In this work, we revisit this approach, coping with its inherent limitations in terms of accuracy and practicality. We estimate the PSF from the observation of relatively large beads (approximately 1 μm in diameter). This goes through the formulation and resolution of an original non-convex minimization problem, for which we propose a proximal alternating method along with convergence guarantees. Following the PSF estimation step, we then introduce an innovative strategy to deal with the high-level multiplicative noise degrading the acquisitions. We rely on a heteroscedastic noise model for which we estimate the parameters. We then solve a constrained optimization problem to restore the image, accounting for the estimated PSF and noise, while allowing minimal hyper-parameter tuning. Theoretical guarantees are given for the restoration algorithm. These algorithmic contributions lead to an end-to-end pipeline for 3D image restoration in MPM, which we share as publicly available Python software. We demonstrate its effectiveness through several experiments on both simulated and real data.
Submitted 30 November, 2023;
originally announced November 2023.
-
Deep State-Space Model for Predicting Cryptocurrency Price
Authors:
Shalini Sharma,
Angshul Majumdar,
Emilie Chouzenoux,
Victor Elvira
Abstract:
Our work presents two fundamental contributions. On the application side, we tackle the challenging problem of predicting day-ahead crypto-currency prices. On the methodological side, a new dynamical modeling approach is proposed. Our approach keeps the probabilistic formulation of the state-space model, which provides uncertainty quantification on the estimates, and the function approximation ability of deep neural networks. We call the proposed approach the deep state-space model. The experiments are carried out on established cryptocurrencies (obtained from Yahoo Finance). The goal of the work has been to predict the price for the next day. Benchmarking has been done with both state-of-the-art and classical dynamical modeling techniques. Results show that the proposed approach yields the best overall results in terms of accuracy.
Submitted 21 November, 2023;
originally announced November 2023.
-
Adaptive importance sampling for heavy-tailed distributions via $α$-divergence minimization
Authors:
Thomas Guilmeau,
Nicola Branchini,
Emilie Chouzenoux,
Víctor Elvira
Abstract:
Adaptive importance sampling (AIS) algorithms are widely used to approximate expectations with respect to complicated target probability distributions. When the target has heavy tails, existing AIS algorithms can provide inconsistent estimators or exhibit slow convergence, as they often neglect the target's tail behaviour. To avoid this pitfall, we propose an AIS algorithm that approximates the target by Student-t proposal distributions. We adapt location and scale parameters by matching the escort moments - which are defined even for heavy-tailed distributions - of the target and the proposal. These updates minimize the $α$-divergence between the target and the proposal, thereby connecting with variational inference. We then show that the $α$-divergence can be approximated by a generalized notion of effective sample size and leverage this new perspective to adapt the tail parameter with Bayesian optimization. We demonstrate the efficacy of our approach through applications to synthetic targets and a Bayesian Student-t regression task on a real example with clinical trial data.
Submitted 25 October, 2023;
originally announced October 2023.
-
Solution of Mismatched Monotone+Lipschitz Inclusion Problems
Authors:
Emilie Chouzenoux,
Jean-Christophe Pesquet,
Fernando Roldán
Abstract:
In this article, we study the convergence of algorithms for solving monotone inclusions in the presence of adjoint mismatch. The adjoint mismatch arises when the adjoint of a linear operator is replaced by an approximation, due to computational or physical issues. This occurs in inverse problems, particularly in computed tomography. In real Hilbert spaces, monotone inclusion problems involving a maximally $ρ$-monotone operator, a cocoercive operator, and a Lipschitzian operator can be solved by the Forward-Backward-Half-Forward and the Forward-Douglas-Rachford-Forward methods. We investigate the case of a mismatched Lipschitzian operator. We propose variants of the two aforementioned methods to cope with the mismatch, and establish conditions under which the weak convergence to a solution is guaranteed for these variants. The proposed algorithms hence enable each iteration to be implemented with a possibly iteration-dependent approximation to the mismatch operator, thus allowing this operator to be modified at each iteration. Finally, we present numerical experiments on a computed tomography example in material science, showing the applicability of our theoretical findings.
Submitted 9 November, 2023; v1 submitted 10 October, 2023;
originally announced October 2023.
-
On variational inference and maximum likelihood estimation with the λ-exponential family
Authors:
Thomas Guilmeau,
Emilie Chouzenoux,
Víctor Elvira
Abstract:
The λ-exponential family has recently been proposed to generalize the exponential family. While the exponential family is well-understood and widely used, this is not the case for the λ-exponential family. However, many applications require models that are more general than the exponential family. In this work, we propose a theoretical and algorithmic framework to solve variational inference and maximum likelihood estimation problems over the λ-exponential family. We give new sufficient optimality conditions for variational inference problems. Our conditions take the form of generalized moment-matching conditions and generalize existing similar results for the exponential family. We exhibit novel characterizations of the solutions of maximum likelihood estimation problems, that recover optimality conditions in the case of the exponential family. For the resolution of both problems, we propose novel proximal-like algorithms that exploit the geometry underlying the λ-exponential family. These new theoretical and methodological insights are tested on numerical examples, showcasing their usefulness and interest, especially on heavy-tailed target distributions.
Submitted 19 June, 2024; v1 submitted 6 October, 2023;
originally announced October 2023.
-
Aggregated f-average Neural Network for Interpretable Ensembling
Authors:
Mathieu Vu,
Emilie Chouzenoux,
Jean-Christophe Pesquet,
Ismail Ben Ayed
Abstract:
Ensemble learning leverages multiple models (i.e., weak learners) on a common machine learning task to enhance prediction performance. Basic ensembling approaches average the weak learners' outputs, while more sophisticated ones stack a machine learning model in between the weak learners' outputs and the final prediction. This work fuses both aforementioned frameworks. We introduce an aggregated f-average (AFA) shallow neural network which models and combines different types of averages to perform an optimal aggregation of the weak learners' predictions. We emphasise its interpretable architecture and simple training strategy, and illustrate its good performance on the problem of few-shot class incremental learning.
Submitted 30 November, 2023; v1 submitted 9 October, 2023;
originally announced October 2023.
-
Majorization-Minimization for sparse SVMs
Authors:
Alessandro Benfenati,
Emilie Chouzenoux,
Giorgia Franchini,
Salla Latva-Aijo,
Dominik Narnhofer,
Jean-Christophe Pesquet,
Sebastian J. Scott,
Mahsa Yousefi
Abstract:
Several decades ago, Support Vector Machines (SVMs) were introduced for performing binary classification tasks, under a supervised framework. Nowadays, they often outperform other supervised methods and remain one of the most popular approaches in the machine learning arena. In this work, we investigate the training of SVMs through a smooth sparse-promoting-regularized squared hinge loss minimization. This choice paves the way to the application of quick training methods built on majorization-minimization approaches, benefiting from the Lipschitz differentiability of the loss function. Moreover, the proposed approach allows us to handle sparsity-preserving regularizers promoting the selection of the most significant features, thus enhancing the performance. Numerical tests and comparisons conducted on three different datasets demonstrate the good performance of the proposed methodology in terms of qualitative metrics (accuracy, precision, recall, and F1 score) as well as computational cost.
Submitted 31 August, 2023;
originally announced August 2023.
-
Graphs in State-Space Models for Granger Causality in Climate Science
Authors:
Víctor Elvira,
Émilie Chouzenoux,
Jordi Cerdà,
Gustau Camps-Valls
Abstract:
Granger causality (GC) is often considered not an actual form of causality. Still, it is arguably the most widely used method to assess the predictability of a time series from another one. Granger causality has been widely used in many applied disciplines, from neuroscience and econometrics to Earth sciences. We revisit GC under a graphical perspective of state-space models. For that, we use GraphEM, a recently presented expectation-maximisation algorithm for estimating the linear matrix operator in the state equation of a linear-Gaussian state-space model. Lasso regularisation is included in the M-step, which is solved using a proximal splitting Douglas-Rachford algorithm. Experiments in toy examples and challenging climate problems illustrate the benefits of the proposed model and inference technique over standard Granger causality methods.
Submitted 20 July, 2023;
originally announced July 2023.
-
A new non-convex framework to improve asymptotical knowledge on generic stochastic gradient descent
Authors:
Jean-Baptiste Fest,
Audrey Repetti,
Emilie Chouzenoux
Abstract:
Stochastic gradient optimization methods are broadly used to minimize non-convex smooth objective functions, for instance when training deep neural networks. However, theoretical guarantees on the asymptotic behaviour of these methods remain scarce. In particular, ensuring almost-sure convergence of the iterates to a stationary point is quite challenging. In this work, we introduce a new Kurdyka-Lojasiewicz theoretical framework to analyze the asymptotic behavior of stochastic gradient descent (SGD) schemes when minimizing non-convex smooth objectives. In particular, our framework provides new almost-sure convergence results on iterates generated by any SGD method satisfying mild conditional descent conditions. We illustrate the proposed framework by means of several toy simulation examples. We illustrate the role of the considered theoretical assumptions, and investigate how SGD iterates are impacted depending on whether these assumptions are fully or partially satisfied.
Submitted 13 July, 2023;
originally announced July 2023.
-
Sparse Graphical Linear Dynamical Systems
Authors:
Emilie Chouzenoux,
Victor Elvira
Abstract:
Time-series datasets are central in machine learning with applications in numerous fields of science and engineering, such as biomedicine, Earth observation, and network analysis. Extensive research exists on state-space models (SSMs), which are powerful mathematical tools that allow for probabilistic and interpretable learning on time series. Learning the model parameters in SSMs is arguably one of the most complicated tasks, and the inclusion of prior knowledge is known both to ease the interpretation and to complicate the inferential tasks. Very recent works have attempted to incorporate a graphical perspective on some of those model parameters, but they present notable limitations that this work addresses. More generally, existing graphical modeling tools are designed to incorporate either static information, focusing on statistical dependencies among independent random variables (e.g., graphical Lasso approach), or dynamic information, emphasizing causal relationships among time series samples (e.g., graphical Granger approaches). However, there are no joint approaches combining static and dynamic graphical modeling within the context of SSMs. This work proposes a novel approach to fill this gap by introducing a joint graphical modeling framework that bridges the graphical Lasso model and a causal-based graphical approach for the linear-Gaussian SSM. We present DGLASSO (Dynamic Graphical Lasso), a new inference method within this framework that implements an efficient block alternating majorization-minimization algorithm. The algorithm's convergence is established using modern tools from nonlinear analysis. Experimental validation on various synthetic data showcases the effectiveness of the proposed model and inference algorithm.
Submitted 14 June, 2024; v1 submitted 6 July, 2023;
originally announced July 2023.
-
Démélange, déconvolution et débruitage conjoints d'un modèle convolutif parcimonieux avec dérive instrumentale, par pénalisation de rapports de normes ou quasi-normes lissées (PENDANTSS)
Authors:
Paul Zheng,
Emilie Chouzenoux,
Laurent Duval
Abstract:
Denoising, detrending, deconvolution: usual restoration tasks, traditionally decoupled. Coupled formulations entail complex ill-posed inverse problems. We propose PENDANTSS for joint trend removal and blind deconvolution of sparse peak-like signals. It blends a parsimonious prior with the hypothesis that smooth trend and noise can somewhat be separated by low-pass filtering. We combine the generalized pseudo-norm ratio SOOT/SPOQ sparse penalties $\ell_p/\ell_q$ with the BEADS ternary assisted source separation algorithm. This results in a both convergent and efficient tool, with a novel Trust-Region block alternating variable metric forward-backward approach. It outperforms comparable methods, when applied to typically peaked analytical chemistry signals. Reproducible code is provided: https://github.com/paulzhengfr/PENDANTSS.
Submitted 4 July, 2023;
originally announced July 2023.
-
GraphIT: Iterative reweighted $\ell_1$ algorithm for sparse graph inference in state-space models
Authors:
Emilie Chouzenoux,
Victor Elvira
Abstract:
State-space models (SSMs) are a common tool for modeling multi-variate discrete-time signals. The linear-Gaussian (LG) SSM is widely applied as it allows for a closed-form solution at inference, if the model parameters are known. However, they are rarely available in real-world problems and must be estimated. Promoting sparsity of these parameters favours both interpretability and tractable inference. In this work, we propose GraphIT, a majorization-minimization (MM) algorithm for estimating the linear operator in the state equation of an LG-SSM under sparse prior. A versatile family of non-convex regularization potentials is proposed. The MM method relies on tools inherited from the expectation-maximization methodology and the iterated reweighted-l1 approach. In particular, we derive a suitable convex upper bound for the objective function, that we then minimize using a proximal splitting algorithm. Numerical experiments illustrate the benefits of the proposed inference technique.
Submitted 22 March, 2023;
originally announced March 2023.
-
A Kurdyka-Lojasiewicz property for stochastic optimization algorithms in a non-convex setting
Authors:
Emilie Chouzenoux,
Jean-Baptiste Fest,
Audrey Repetti
Abstract:
Stochastic differentiable approximation schemes are widely used for solving high dimensional problems. Most existing methods satisfy some desirable properties, including conditional descent inequalities, and almost sure (a.s.) convergence guarantees on the objective function, or on the involved gradient. However, for non-convex objective functions, a.s. convergence of the iterates, i.e., the stochastic process, to a critical point is usually not guaranteed, and remains an important challenge. In this article, we develop a framework to bridge the gap between descent-type inequalities and a.s. convergence of the associated stochastic process. Leveraging a novel Kurdyka-Lojasiewicz property, we show convergence guarantees of stochastic processes under mild assumptions on the objective function. We also provide examples of stochastic algorithms benefiting from the proposed framework and derive a.s. convergence guarantees on the iterates.
Submitted 23 March, 2023; v1 submitted 13 February, 2023;
originally announced February 2023.
-
PENDANTSS: PEnalized Norm-ratios Disentangling Additive Noise, Trend and Sparse Spikes
Authors:
Paul Zheng,
Emilie Chouzenoux,
Laurent Duval
Abstract:
Denoising, detrending, deconvolution: usual restoration tasks, traditionally decoupled. Coupled formulations entail complex ill-posed inverse problems. We propose PENDANTSS for joint trend removal and blind deconvolution of sparse peak-like signals. It blends a parsimonious prior with the hypothesis that smooth trend and noise can somewhat be separated by low-pass filtering. We combine the generalized quasi-norm ratio SOOT/SPOQ sparse penalties $\ell_p/\ell_q$ with the BEADS ternary assisted source separation algorithm. This results in a both convergent and efficient tool, with a novel Trust-Region block alternating variable metric forward-backward approach. It outperforms comparable methods, when applied to typically peaked analytical chemistry signals. Reproducible code is provided.
Submitted 16 February, 2023; v1 submitted 4 January, 2023;
originally announced January 2023.
-
Regularized Rényi divergence minimization through Bregman proximal gradient algorithms
Authors:
Thomas Guilmeau,
Emilie Chouzenoux,
Víctor Elvira
Abstract:
We study the variational inference problem of minimizing a regularized Rényi divergence over an exponential family, and propose a relaxed moment-matching algorithm, which includes a proximal-like step. Using the information-geometric link between Bregman divergences and the Kullback-Leibler divergence, this algorithm is shown to be equivalent to a Bregman proximal gradient algorithm. This novel perspective allows us to exploit the geometry of our approximate model while using stochastic black-box updates. We use this point of view to prove strong convergence guarantees including monotonic decrease of the objective, convergence to a stationary point or to the minimizer, and geometric convergence rates. These new theoretical insights lead to a versatile, robust, and competitive method, as illustrated by numerical experiments.
Submitted 19 June, 2024; v1 submitted 9 November, 2022;
originally announced November 2022.
-
Towards Practical Few-Shot Query Sets: Transductive Minimum Description Length Inference
Authors:
Ségolène Martin,
Malik Boudiaf,
Emilie Chouzenoux,
Jean-Christophe Pesquet,
Ismail Ben Ayed
Abstract:
Standard few-shot benchmarks are often built upon simplifying assumptions on the query sets, which may not always hold in practice. In particular, for each task at testing time, the classes effectively present in the unlabeled query set are known a priori, and correspond exactly to the set of classes represented in the labeled support set. We relax these assumptions and extend current benchmarks, so that the query-set classes of a given task are unknown, but just belong to a much larger set of possible classes. Our setting could be viewed as an instance of the challenging yet practical problem of extremely imbalanced K-way classification, K being much larger than the values typically used in standard benchmarks, and with potentially irrelevant supervision from the support set. As expected, our setting incurs drops in the performance of state-of-the-art methods. Motivated by these observations, we introduce a PrimAl Dual Minimum Description LEngth (PADDLE) formulation, which balances data-fitting accuracy and model complexity for a given few-shot task, under supervision constraints from the support set. Our constrained MDL-like objective promotes competition among a large set of possible classes, preserving only effective classes that better fit the data of a few-shot task. It is hyperparameter free, and could be applied on top of any base-class training. Furthermore, we derive a fast block coordinate descent algorithm for optimizing our objective, with convergence guarantee, and a linear computational complexity at each iteration. Comprehensive experiments over the standard few-shot datasets and the more realistic and challenging i-Nat dataset show highly competitive performance of our method, more so when the numbers of possible classes in the tasks increase. Our code is publicly available at https://github.com/SegoleneMartin/PADDLE.
Submitted 26 October, 2022;
originally announced October 2022.
-
Gradient-based Adaptive Importance Samplers
Authors:
Víctor Elvira,
Emilie Chouzenoux,
Ömer Deniz Akyildiz,
Luca Martino
Abstract:
Importance sampling (IS) is a powerful Monte Carlo methodology for the approximation of intractable integrals, very often involving a target probability density function. The performance of IS heavily depends on the appropriate selection of the proposal distributions where the samples are simulated from. In this paper, we propose an adaptive importance sampler, called GRAMIS, that iteratively improves the set of proposals. The algorithm exploits geometric information of the target to adapt the location and scale parameters of those proposals. Moreover, in order to allow for a cooperative adaptation, a repulsion term is introduced that favors a coordinated exploration of the state space. This translates into a more diverse exploration and a better approximation of the target via the mixture of proposals. Moreover, we provide a theoretical justification of the repulsion term. We show the good performance of GRAMIS in two problems where the target has a challenging shape and cannot be easily approximated by a standard uni-modal proposal.
Submitted 21 June, 2023; v1 submitted 19 October, 2022;
originally announced October 2022.
-
Graph Regularized Probabilistic Matrix Factorization for Drug-Drug Interactions Prediction
Authors:
Stuti Jain,
Emilie Chouzenoux,
Kriti Kumar,
Angshul Majumdar
Abstract:
Co-administration of two or more drugs can result in adverse drug reactions. Identifying drug-drug interactions (DDIs) is necessary, especially for drug development and for repurposing old drugs. DDI prediction can be viewed as a matrix completion task, for which matrix factorization (MF) appears as a suitable solution. This paper presents a novel Graph Regularized Probabilistic Matrix Factorization (GRPMF) method, which incorporates expert knowledge through a novel graph-based regularization strategy within an MF framework. An efficient and sound optimization algorithm is proposed to solve the resulting non-convex problem in an alternating fashion. The performance of the proposed method is evaluated on the DrugBank dataset, and comparisons are provided against state-of-the-art techniques. The results demonstrate the superior performance of GRPMF when compared to its counterparts.
Submitted 19 October, 2022;
originally announced October 2022.
-
Efficient Bayes Inference in Neural Networks through Adaptive Importance Sampling
Authors:
Yunshi Huang,
Emilie Chouzenoux,
Victor Elvira,
Jean-Christophe Pesquet
Abstract:
Bayesian neural networks (BNNs) have received increased interest in recent years. In BNNs, a complete posterior distribution of the unknown weight and bias parameters of the network is produced during the training stage. This probabilistic estimation offers several advantages with respect to point-wise estimates, in particular, the ability to provide uncertainty quantification when predicting new data. This feature, inherent to the Bayesian paradigm, is useful in countless machine learning applications. It is particularly appealing in areas where decision-making has a crucial impact, such as medical healthcare or autonomous driving. The main challenge of BNNs is the computational cost of the training procedure since Bayesian techniques often face a severe curse of dimensionality. Adaptive importance sampling (AIS) is one of the most prominent Monte Carlo methodologies benefiting from sound convergence guarantees and ease of adaptation. This work aims to show that AIS constitutes a successful approach for designing BNNs. More precisely, we propose a novel algorithm, PMCnet, that includes an efficient adaptation mechanism, exploiting geometric information on the complex (often multimodal) posterior distribution. Numerical results illustrate the excellent performance and the improved exploration capabilities of the proposed method for both shallow and deep neural networks.
Submitted 13 April, 2023; v1 submitted 3 October, 2022;
originally announced October 2022.
-
Deep Unfolding of the DBFB Algorithm with Application to ROI CT Imaging with Limited Angular Density
Authors:
Marion Savanier,
Emilie Chouzenoux,
Jean-Christophe Pesquet,
Cyril Riddell
Abstract:
This paper presents a new method for reconstructing regions of interest (ROI) from a limited number of computed tomography (CT) measurements. Classical model-based iterative reconstruction methods lead to images with predictable features. Still, they often suffer from tedious parameterization and slow convergence. On the contrary, deep learning methods are fast, and they can reach high reconstruction quality by leveraging information from large datasets, but they lack interpretability. At the crossroads of both methods, deep unfolding networks have been recently proposed. Their design includes the physics of the imaging system and the steps of an iterative optimization algorithm. Motivated by the success of these networks for various applications, we introduce an unfolding neural network called U-RDBFB designed for ROI CT reconstruction from limited data. Few-view truncated data are effectively handled thanks to a robust non-convex data fidelity term combined with a sparsity-inducing regularization function. We unfold the Dual Block coordinate Forward-Backward (DBFB) algorithm, embedded in an iterative reweighted scheme, allowing the learning of key parameters in a supervised manner. Our experiments show an improvement over several state-of-the-art methods, including a model-based iterative scheme, a multi-scale deep learning architecture, and other deep unfolding methods.
Submitted 17 May, 2023; v1 submitted 27 September, 2022;
originally announced September 2022.
-
Graphical Inference in Linear-Gaussian State-Space Models
Authors:
Víctor Elvira,
Émilie Chouzenoux
Abstract:
State-space models (SSMs) are central to describing time-varying complex systems in countless signal processing applications such as remote sensing, networks, biomedicine, and finance, to name a few. Inference and prediction in SSMs are possible when the model parameters are known, which is rarely the case. The estimation of these parameters is crucial, not only for performing statistical analysis, but also for uncovering the underlying structure of complex phenomena. In this paper, we focus on the linear-Gaussian model, arguably the most celebrated SSM, and particularly on the challenging task of estimating the transition matrix that encodes the Markovian dependencies in the evolution of the multi-variate state. We introduce a novel perspective by relating this matrix to the adjacency matrix of a directed graph, also interpreted as the causal relationship among state dimensions in the Granger-causality sense. Under this perspective, we propose a new method called GraphEM based on the well-founded expectation-maximization (EM) methodology for inferring the transition matrix jointly with the smoothing/filtering of the observed data. We propose an advanced convex optimization solver relying on a consensus-based implementation of a proximal splitting strategy for solving the M-step. This approach enables an efficient and versatile processing of various sophisticated priors on the graph structure, such as parsimony constraints, while benefiting from convergence guarantees. We demonstrate the good performance and the interpretable results of GraphEM by means of two sets of numerical examples.
Submitted 20 September, 2022;
originally announced September 2022.
-
A CNC approach for Directional Total Variation
Authors:
Gabriele Scrivanti,
Emilie Chouzenoux,
Jean-Christophe Pesquet
Abstract:
The core of many approaches for the resolution of variational inverse problems arising in signal and image processing consists of promoting the sought solution to have a sparse representation in a well-suited space. A crucial task in this context is the choice of a good sparsity prior that can ensure a good trade-off between the quality of the solution and the resulting computational cost. The recently introduced Convex-Non-Convex (CNC) strategy appears as a great compromise, as it combines the high qualitative performance of non-convex sparsity-promoting functions with the convenience of dealing with convex optimization problems. This work proposes a new variational formulation to implement the CNC approach in the context of image denoising. By suitably exploiting duality properties, our formulation makes it possible to encompass sophisticated directional total variation (DTV) priors. We additionally propose an efficient optimisation strategy for the resulting convex minimisation problem. We illustrate on numerical examples the good performance of the resulting CNC-DTV method, when compared to the standard convex total variation denoiser.
Submitted 3 September, 2022;
originally announced September 2022.
-
A Variational Approach for Joint Image Recovery and Feature Extraction Based on Spatially-Varying Generalised Gaussian Models
Authors:
Emilie Chouzenoux,
Marie-Caroline Corbineau,
Jean-Christophe Pesquet,
Gabriele Scrivanti
Abstract:
The joint problem of reconstruction / feature extraction is a challenging task in image processing. It consists in performing, in a joint manner, the restoration of an image and the extraction of its features. In this work, we firstly propose a novel nonsmooth and non-convex variational formulation of the problem. For this purpose, we introduce a versatile generalised Gaussian prior whose parameters, including its exponent, are space-variant. Secondly, we design an alternating proximal-based optimisation algorithm that efficiently exploits the structure of the proposed non-convex objective function. We also analyse the convergence of this algorithm. As shown in numerical experiments conducted on joint deblurring/segmentation tasks, the proposed method provides high-quality results.
Submitted 5 March, 2024; v1 submitted 3 September, 2022;
originally announced September 2022.
-
Optimized Population Monte Carlo
Authors:
Víctor Elvira,
Émilie Chouzenoux
Abstract:
Adaptive importance sampling (AIS) methods are increasingly used for the approximation of distributions and related intractable integrals in the context of Bayesian inference. Population Monte Carlo (PMC) algorithms are a subclass of AIS methods, widely used due to their ease of adaptation. In this paper, we propose a novel algorithm that exploits the benefits of the PMC framework and includes more efficient adaptive mechanisms, exploiting geometric information of the target distribution. In particular, the novel algorithm adapts the location and scale parameters of a set of importance densities (proposals). At each iteration, the location parameters are adapted by combining a versatile resampling strategy (i.e., using the information of previous weighted samples) with an advanced optimization-based scheme. Local second-order information of the target distribution is incorporated through a preconditioning matrix acting as a scaling metric onto a gradient direction. A damped Newton approach is adopted to ensure robustness of the scheme. The resulting metric is also used to update the scale parameters of the proposals. We discuss several key theoretical foundations for the proposed approach. Finally, we show the successful performance of the proposed method in three numerical examples, involving challenging distributions.
Submitted 14 April, 2022;
originally announced April 2022.
-
Unrolled Variational Bayesian Algorithm for Image Blind Deconvolution
Authors:
Yunshi Huang,
Emilie Chouzenoux,
Jean-Christophe Pesquet
Abstract:
In this paper, we introduce a variational Bayesian algorithm (VBA) for image blind deconvolution. Our generic framework incorporates smoothness priors on the unknown blur/image and possible affine constraints (e.g., sum to one) on the blur kernel. One of our main contributions is the integration of VBA within a neural network paradigm, following an unrolling methodology. The proposed architecture is trained in a supervised fashion, which allows us to optimally set two key hyperparameters of the VBA model and leads to further improvements in terms of resulting visual quality. Various experiments involving grayscale/color images and diverse kernel shapes are performed. The numerical examples illustrate the high performance of our approach when compared to state-of-the-art techniques based on optimization, Bayesian estimation, or deep learning.
Submitted 14 October, 2021;
originally announced October 2021.
-
Inversion of Integral Models: a Neural Network Approach
Authors:
Emilie Chouzenoux,
Cecile Della Valle,
Jean-Christophe Pesquet
Abstract:
We introduce a neural network architecture to solve inverse problems linked to a one-dimensional integral operator. This architecture is built by unfolding a forward-backward algorithm derived from the minimization of an objective function which consists of the sum of a data-fidelity function and a Tikhonov-type regularization function. The robustness of this inversion method with respect to a perturbation of the input is theoretically analyzed. Ensuring robustness is consistent with inverse problem theory since it guarantees both the continuity of the inversion method and its insensitivity to small noise. The latter is a critical property as deep neural networks have been shown to be vulnerable to adversarial perturbations. One of the main novelties of our work is to show that the proposed network is also robust to perturbations of its bias. In our architecture, the bias accounts for the observed data in the inverse problem. We apply our method to the inversion of Abel integral operators, which define a fractional integration involved in a wide range of physical processes. The neural network is numerically implemented and tested to illustrate the efficiency of the method. Lipschitz constants after training are computed to measure the robustness of the neural networks.
Submitted 31 May, 2021;
originally announced May 2021.
-
Deep Transform and Metric Learning Networks
Authors:
Wen Tang,
Emilie Chouzenoux,
Jean-Christophe Pesquet,
Hamid Krim
Abstract:
Based on its great successes in inference and denoising tasks, Dictionary Learning (DL) and its related sparse optimization formulations have garnered a lot of research interest. While most solutions have focused on single layer dictionaries, the recently improved Deep DL methods have also fallen short on a number of issues. We hence propose a novel Deep DL approach where each DL layer can be formulated and solved as a combination of one linear layer and a Recurrent Neural Network (RNN), where the RNN is flexibly regarded as a layer-associated learned metric. Our proposed work unveils new insights between Neural Networks and Deep DL, and provides a novel, efficient and competitive approach to jointly learn the deep transforms and metrics. Extensive experiments are carried out to demonstrate that the proposed method can outperform not only existing Deep DL methods, but also state-of-the-art generic Convolutional Neural Networks.
Submitted 20 April, 2021;
originally announced April 2021.
-
SuperDeConFuse: A Supervised Deep Convolutional Transform based Fusion Framework for Financial Trading Systems
Authors:
Pooja Gupta,
Angshul Majumdar,
Emilie Chouzenoux,
Giovanni Chierchia
Abstract:
This work proposes a supervised multi-channel time-series learning framework for financial stock trading. Although many deep learning models have recently been proposed in this domain, most of them treat the stock trading time-series data as 2-D image data, whereas its true nature is 1-D time-series data. Since stock trading systems produce multi-channel data, many existing techniques that treat them as 1-D time-series data do not suggest any way to effectively fuse the information carried by the multiple channels. To address both of these shortcomings, we propose an end-to-end supervised learning framework inspired by the previously established (unsupervised) convolution transform learning framework. Our approach consists of processing the data channels through separate 1-D convolution layers, then fusing the outputs with a series of fully-connected layers, and finally applying a softmax classification layer. The peculiarity of our framework, SuperDeConFuse (SDCF), is that we remove the nonlinear activation located between the multi-channel convolution layers and the fully-connected layers, as well as the one located between the latter and the output layer. We compensate for this removal by introducing a suitable regularization on the aforementioned layer outputs and filters during the training phase. Specifically, we apply a logarithm determinant regularization on the layer filters to break symmetry and force diversity in the learnt transforms, whereas we enforce the non-negativity constraint on the layer outputs to mitigate the issue of dead neurons. This results in the effective learning of a richer set of features and filters with respect to a standard convolutional neural network. Numerical experiments confirm that the proposed model yields considerably better results than state-of-the-art deep learning techniques for the real-world problem of stock trading.
Submitted 9 November, 2020;
originally announced November 2020.
-
DeConFuse : A Deep Convolutional Transform based Unsupervised Fusion Framework
Authors:
Pooja Gupta,
Jyoti Maggu,
Angshul Majumdar,
Emilie Chouzenoux,
Giovanni Chierchia
Abstract:
This work proposes an unsupervised fusion framework based on deep convolutional transform learning. The great learning ability of convolutional filters for data analysis is well acknowledged. The success of convolutive features is owed to the convolutional neural network (CNN). However, CNNs cannot perform learning tasks in an unsupervised fashion. In a recent work, we showed that this shortcoming can be addressed by adopting a convolutional transform learning (CTL) approach, where convolutional filters are learnt in an unsupervised fashion. The present paper aims at (i) proposing a deep version of CTL; (ii) proposing an unsupervised fusion formulation taking advantage of the proposed deep CTL representation; (iii) developing a mathematically sound optimization strategy for performing the learning task. We apply the proposed technique, named DeConFuse, to the problem of stock forecasting and trading. Comparison with state-of-the-art methods (based on CNN and long short-term memory network) shows the superiority of our method for performing a reliable feature extraction.
Submitted 9 November, 2020;
originally announced November 2020.
-
ConFuse: Convolutional Transform Learning Fusion Framework For Multi-Channel Data Analysis
Authors:
Pooja Gupta,
Jyoti Maggu,
Angshul Majumdar,
Emilie Chouzenoux,
Giovanni Chierchia
Abstract:
This work addresses the problem of analyzing multi-channel time series data by proposing an unsupervised fusion framework based on convolutional transform learning. Each channel is processed by a separate 1D convolutional transform; the outputs of all the channels are fused by a fully connected layer of transform learning. The training procedure takes advantage of the proximal interpretation of activation functions. We apply the developed framework to multi-channel financial data for stock forecasting and trading. We compare our proposed formulation with benchmark deep time series analysis networks. The results show that our method yields considerably better results than those compared against.
Submitted 9 November, 2020;
originally announced November 2020.
-
Deep Convolutional Transform Learning -- Extended version
Authors:
Jyoti Maggu,
Angshul Majumdar,
Emilie Chouzenoux,
Giovanni Chierchia
Abstract:
This work introduces a new unsupervised representation learning technique called Deep Convolutional Transform Learning (DCTL). By stacking convolutional transforms, our approach is able to learn a set of independent kernels at different layers. The features extracted in an unsupervised manner can then be used to perform machine learning tasks, such as classification and clustering. The learning technique relies on a well-founded alternating proximal minimization scheme with established convergence guarantees. Our experimental results show that the proposed DCTL technique outperforms its shallow version, CTL, on several benchmark datasets.
Submitted 2 October, 2020;
originally announced October 2020.
-
DeepVir -- Graphical Deep Matrix Factorization for "In Silico" Antiviral Repositioning: Application to COVID-19
Authors:
Aanchal Mongia,
Stuti Jain,
Emilie Chouzenoux,
Angshul Majumdar
Abstract:
This work formulates antiviral repositioning as a matrix completion problem where the antiviral drugs are along the rows and the viruses along the columns. The input matrix is partially filled, with ones in positions where the antiviral has been known to be effective against a virus. The curated metadata for antivirals (chemical structure and pathways) and viruses (genomic structure and symptoms) is encoded into our matrix completion framework as graph Laplacian regularization. We then frame the resulting multiple graph regularized matrix completion problem as deep matrix factorization. This is solved by using a novel optimization method called HyPALM (Hybrid Proximal Alternating Linearized Minimization). Results on our curated RNA drug virus association (DVA) dataset show that the proposed approach excels over state-of-the-art graph regularized matrix completion techniques. When applied to "in silico" prediction of antivirals for COVID-19, our approach returns antivirals that are either used for treating patients or are under trial for the same.
Submitted 22 September, 2020;
originally announced September 2020.
-
A computational approach to aid clinicians in selecting anti-viral drugs for COVID-19 trials
Authors:
Aanchal Mongia,
Sanjay Kr. Saha,
Emilie Chouzenoux,
Angshul Majumdar
Abstract:
COVID-19 has accelerated drug re-positioning for its treatment. This work builds computational models for the same. The aim is to assist clinicians with a tool for selecting prospective antiviral treatments. Since the virus is known to mutate fast, the tool is likely to help clinicians in selecting the right set of antivirals for the mutated isolate.
The main contribution of this work is a manually curated, publicly shared database comprising existing associations between viruses and their corresponding antivirals. The database gathers similarity information using the chemical structure of drugs and the genomic structure of viruses. Along with this database, we make available a set of state-of-the-art computational drug re-positioning tools based on matrix completion. The tools are first analysed on a standard set of experimental protocols for drug target interactions. The best performing ones are applied for the task of re-positioning antivirals for COVID-19. These tools select six drugs out of which four are currently under various stages of trial, namely Remdesivir (as a cure), Ribavirin (in combination with others for cure), Umifenovir (as a prophylactic and cure) and Sofosbuvir (as a cure). Another unanimous prediction is Tenofovir alafenamide, which is a novel tenofovir prodrug developed in order to improve renal safety when compared to the counterpart tenofovir disoproxil. Both are under trial, the former as a cure and the latter as a prophylactic. These results establish that the computational methods are in sync with the state-of-practice. We also demonstrate how the selected drugs change as SARS-CoV-2 mutates over time, suggesting the importance of such a tool in drug prediction.
The dataset and software are available publicly at https://github.com/aanchalMongia/DVA and the prediction tool with a user-friendly interface is available at http://dva.salsa.iiitd.edu.in.
Submitted 31 July, 2020; v1 submitted 3 July, 2020;
originally announced July 2020.
-
Deep Transform and Metric Learning Network: Wedding Deep Dictionary Learning and Neural Networks
Authors:
Wen Tang,
Emilie Chouzenoux,
Jean-Christophe Pesquet,
Hamid Krim
Abstract:
On account of its many successes in inference tasks and denoising applications, Dictionary Learning (DL) and its related sparse optimization problems have garnered a lot of research interest. While most solutions have focused on single layer dictionaries, the recently proposed, improved Deep DL (DDL) methods have also fallen short on a number of issues. We propose herein a novel DDL approach where each DL layer can be formulated as a combination of one linear layer and a Recurrent Neural Network (RNN). The RNN is shown to flexibly account for the layer-associated and learned metric. Our proposed work unveils new insights into Neural Networks and DDL and provides a new, efficient and competitive approach to jointly learn a deep transform and a metric for inference applications. Extensive experiments are carried out to demonstrate that the proposed method can outperform not only existing DDL but also state-of-the-art generic CNNs.
Submitted 20 October, 2020; v1 submitted 18 February, 2020;
originally announced February 2020.
-
Block Distributed Majorize-Minimize Memory Gradient Algorithm and its application to 3D image restoration
Authors:
Mathieu Chalvidal,
Emilie Chouzenoux
Abstract:
Modern 3D image recovery problems require powerful optimization frameworks to handle high dimensionality while providing reliable numerical solutions in a reasonable time. In this perspective, asynchronous parallel optimization algorithms have received increasing attention by overcoming memory limitation issues and communication bottlenecks. In this work, we propose a block distributed Majorize-Minimize Memory Gradient (BD3MG) optimization algorithm for solving large scale non-convex differentiable optimization problems. Assuming a distributed memory environment, the algorithm casts the efficient 3MG scheme into smaller dimension subproblems where blocks of variables are addressed in an asynchronous manner. Convergence of the sequence built by the proposed BD3MG method is established under mild assumptions. Application to the restoration of 3D images degraded by a depth-variant blur shows that our method yields significant computational time reduction compared to several synchronous and asynchronous competitors, while exhibiting great scalability potential.
Submitted 25 June, 2020; v1 submitted 6 February, 2020;
originally announced February 2020.
-
SPOQ $\ell_p$-Over-$\ell_q$ Regularization for Sparse Signal Recovery applied to Mass Spectrometry
Authors:
Afef Cherni,
Emilie Chouzenoux,
Laurent Duval,
Jean-Christophe Pesquet
Abstract:
Underdetermined or ill-posed inverse problems require additional information for sound solutions with tractable optimization algorithms. Sparsity yields consequent heuristics to that matter, with numerous applications in signal restoration, image recovery, or machine learning. Since the $\ell_0$ count measure is barely tractable, many statistical or learning approaches have invested in computable proxies, such as the $\ell_1$ norm. However, the latter does not exhibit the desirable property of scale invariance for sparse data. Extending the SOOT Euclidean/Taxicab $\ell_1$-over-$\ell_2$ norm-ratio initially introduced for blind deconvolution, we propose SPOQ, a family of smoothed (approximately) scale-invariant penalty functions. It consists of a Lipschitz-differentiable surrogate for $\ell_p$-over-$\ell_q$ quasi-norm/norm ratios with $p\in\,]0,2[$ and $q\ge 2$. This surrogate is embedded into a novel majorize-minimize trust-region approach, generalizing the variable metric forward-backward algorithm. For naturally sparse mass-spectrometry signals, we show that SPOQ significantly outperforms $\ell_0$, $\ell_1$, Cauchy, Welsch, SCAD and Celo penalties on several performance measures. Guidelines on SPOQ hyperparameter tuning are also provided, suggesting simple data-driven choices.
Submitted 22 September, 2020; v1 submitted 23 January, 2020;
originally announced January 2020.
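The scale-invariance remark in the SPOQ abstract above can be checked numerically. The following is a minimal sketch, not the authors' SPOQ penalty: it only compares the plain $\ell_1$ norm with the unsmoothed $\ell_1$-over-$\ell_2$ ratio on a toy vector. The function names and the test vector are illustrative assumptions.

```python
import numpy as np

def l1_norm(x):
    """Plain l1 norm: grows linearly with the scale of x."""
    return np.linalg.norm(x, ord=1)

def l1_over_l2(x):
    """Unsmoothed l1/l2 norm ratio (not the smoothed SPOQ surrogate)."""
    return np.linalg.norm(x, ord=1) / np.linalg.norm(x, ord=2)

# Toy sparse vector (illustrative) and a rescaled copy of it.
x = np.array([0.0, 3.0, 0.0, -4.0])
scaled = 10.0 * x

print(l1_norm(x), l1_norm(scaled))        # the l1 norm changes with scale
print(l1_over_l2(x), l1_over_l2(scaled))  # the ratio does not
```

Rescaling the vector multiplies its $\ell_1$ norm by the same factor, whereas the ratio is unchanged, which is the property the abstract refers to as scale invariance.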
-
GraphEM: EM algorithm for blind Kalman filtering under graphical sparsity constraints
Authors:
Émilie Chouzenoux,
Víctor Elvira
Abstract:
Modeling and inference with multivariate sequences is central in a number of signal processing applications such as acoustics, social network analysis, biomedicine, and finance, to name a few. The linear-Gaussian state-space model is a common way to describe a time series through the evolution of a hidden state, with the advantage of presenting a simple inference procedure due to the celebrated Kalman filter. A fundamental question when analyzing multivariate sequences is the search for relationships between their entries (or the modeled hidden states), especially when the inherent structure is a non-fully connected graph. In such a context, graphical modeling combined with parsimony constraints makes it possible to limit the proliferation of parameters and enables a compact data representation that is easier for experts to interpret. In this work, we propose a novel expectation-maximization algorithm for estimating the linear matrix operator in the state equation of a linear-Gaussian state-space model. Lasso regularization is included in the M-step, which we solve using a proximal splitting Douglas-Rachford algorithm. Numerical experiments illustrate the benefits of the proposed model and inference technique, named GraphEM, over competitors relying on Granger causality.
Submitted 9 January, 2020;
originally announced January 2020.
-
Deep Latent Factor Model for Collaborative Filtering
Authors:
Aanchal Mongia,
Neha Jhamb,
Emilie Chouzenoux,
Angshul Majumdar
Abstract:
Latent factor models have been used widely in collaborative filtering based recommender systems. In recent years, deep learning has been successful in solving a wide variety of machine learning problems. Motivated by the success of deep learning, we propose a deeper version of the latent factor model. Experiments on benchmark datasets show that our proposed technique significantly outperforms all state-of-the-art collaborative filtering techniques.
Submitted 10 December, 2019;
originally announced December 2019.
-
Transformed Subspace Clustering
Authors:
Jyoti Maggu,
Angshul Majumdar,
Emilie Chouzenoux
Abstract:
Subspace clustering assumes that the data is separable into separate subspaces. Such a simple assumption does not always hold. We assume that, even if the raw data is not separable into subspaces, one can learn a representation (transform coefficients) such that the learnt representation is separable into subspaces. To achieve the intended goal, we embed subspace clustering techniques (locally linear manifold clustering, sparse subspace clustering and low rank representation) into transform learning. The entire formulation is jointly learnt, giving rise to a new class of methods called transformed subspace clustering (TSC). In order to account for non-linearity, kernelized extensions of TSC are also proposed. To test the performance of the proposed techniques, benchmarking is performed on image clustering and document clustering datasets. Comparison with state-of-the-art clustering techniques shows that our formulation improves upon them.
Submitted 10 December, 2019;
originally announced December 2019.
-
General risk measures for robust machine learning
Authors:
Emilie Chouzenoux,
Henri Gérard,
Jean-Christophe Pesquet
Abstract:
A wide array of machine learning problems are formulated as the minimization of the expectation of a convex loss function on some parameter space. Since the probability distribution of the data of interest is usually unknown, it is often estimated from training sets, which may lead to poor out-of-sample performance. In this work, we bring new insights into this problem by using the framework which has been developed in quantitative finance for risk measures. We show that the original min-max problem can be recast as a convex minimization problem under suitable assumptions. We discuss several important examples of robust formulations, in particular by defining ambiguity sets based on $\varphi$-divergences and the Wasserstein metric. We also propose an efficient algorithm for solving the corresponding convex optimization problems involving complex convex constraints. Through simulation examples, we demonstrate that this algorithm scales well on real data sets.
Submitted 24 May, 2019; v1 submitted 26 April, 2019;
originally announced April 2019.
-
Deep Unfolding of a Proximal Interior Point Method for Image Restoration
Authors:
Carla Bertocchi,
Emilie Chouzenoux,
Marie-Caroline Corbineau,
Jean-Christophe Pesquet,
Marco Prato
Abstract:
Variational methods are widely applied to ill-posed inverse problems for they have the ability to embed prior knowledge about the solution. However, the level of performance of these methods significantly depends on a set of parameters, which can be estimated through computationally expensive and time-consuming methods. In contrast, deep learning offers very generic and efficient architectures, at the expense of explainability, since it is often used as a black-box, without any fine control over its output. Deep unfolding provides a convenient approach to combine variational-based and deep learning approaches. Starting from a variational formulation for image restoration, we develop iRestNet, a neural network architecture obtained by unfolding a proximal interior point algorithm. Hard constraints, encoding desirable properties for the restored image, are incorporated into the network thanks to a logarithmic barrier, while the barrier parameter, the stepsize, and the penalization weight are learned by the network. We derive explicit expressions for the gradient of the proximity operator for various choices of constraints, which allows training iRestNet with gradient descent and backpropagation. In addition, we provide theoretical results regarding the stability of the network for a common inverse problem example. Numerical experiments on image deblurring problems show that the proposed approach compares favorably with both state-of-the-art variational and machine learning methods in terms of image quality.
Submitted 21 January, 2020; v1 submitted 11 December, 2018;
originally announced December 2018.
-
A probabilistic incremental proximal gradient method
Authors:
Ömer Deniz Akyildiz,
Émilie Chouzenoux,
Víctor Elvira,
Joaquín Míguez
Abstract:
In this paper, we propose a probabilistic optimization method, named probabilistic incremental proximal gradient (PIPG) method, by developing a probabilistic interpretation of the incremental proximal gradient algorithm. We explicitly model the update rules of the incremental proximal gradient method and develop a systematic approach to propagate the uncertainty of the solution estimate over iterations. The PIPG algorithm takes the form of Bayesian filtering updates for a state-space model constructed by using the cost function. Our framework makes it possible to utilize well-known exact or approximate Bayesian filters, such as Kalman or extended Kalman filters, to solve large-scale regularized optimization problems.
Submitted 19 June, 2019; v1 submitted 4 December, 2018;
originally announced December 2018.
-
A Two-Stage Subspace Trust Region Approach for Deep Neural Network Training
Authors:
Viacheslav Dudar,
Giovanni Chierchia,
Emilie Chouzenoux,
Jean-Christophe Pesquet,
Vladimir Semenov
Abstract:
In this paper, we develop a novel second-order method for training feed-forward neural nets. At each iteration, we construct a quadratic approximation to the cost function in a low-dimensional subspace. We minimize this approximation inside a trust region through a two-stage procedure: first inside the embedded positive curvature subspace, followed by a gradient descent step. This approach leads to a fast objective function decay, prevents convergence to saddle points, and alleviates the need for manually tuning parameters. We show the good performance of the proposed algorithm on benchmark datasets.
Submitted 23 May, 2018;
originally announced May 2018.
-
A Proximal Approach for a Class of Matrix Optimization Problems
Authors:
A. Benfenati,
E. Chouzenoux,
J. -C. Pesquet
Abstract:
In recent years, there has been a growing interest in mathematical models leading to the minimization, in a symmetric matrix space, of a Bregman divergence coupled with a regularization term. We address problems of this type within a general framework where the regularization term is split into two parts, one being a spectral function while the other is arbitrary. A Douglas-Rachford approach is proposed to address such problems, and a list of proximity operators is provided, allowing us to consider various choices for the fit-to-data functional and for the regularization term. Numerical experiments show the validity of this approach for solving convex optimization problems encountered in the context of sparse covariance matrix estimation. Based on our theoretical results, an algorithm is also proposed for noisy graphical lasso where a precision matrix has to be estimated in the presence of noise. The nonconvexity of the resulting objective function is dealt with through a majorization-minimization approach, i.e., by building a sequence of convex surrogates and solving the inner optimization subproblems via the aforementioned Douglas-Rachford procedure. We establish conditions for the convergence of this iterative scheme and we illustrate its good numerical performance with respect to state-of-the-art approaches.
Submitted 23 January, 2018;
originally announced January 2018.
-
A Random Block-Coordinate Douglas-Rachford Splitting Method with Low Computational Complexity for Binary Logistic Regression
Authors:
Luis M. Briceno-Arias,
Giovanni Chierchia,
Emilie Chouzenoux,
Jean-Christophe Pesquet
Abstract:
In this paper, we propose a new optimization algorithm for sparse logistic regression based on a stochastic version of the Douglas-Rachford splitting method. Our algorithm sweeps the training set by randomly selecting a mini-batch of data at each iteration, and it allows us to update the variables in a block coordinate manner. Our approach leverages the proximity operator of the logistic loss, which is expressed with the generalized Lambert W function. Experiments carried out on standard datasets demonstrate the efficiency of our approach w.r.t. stochastic gradient-like methods.
Submitted 25 December, 2017;
originally announced December 2017.
-
A Fast Algorithm Based on a Sylvester-like Equation for LS Regression with GMRF Prior
Authors:
Qi Wei,
Emilie Chouzenoux,
Jean-Yves Tourneret,
Jean-Christophe Pesquet
Abstract:
This paper presents a fast approach for penalized least squares (LS) regression problems using a 2D Gaussian Markov random field (GMRF) prior. More precisely, the computation of the proximity operator of the LS criterion regularized by different GMRF potentials is formulated as solving a Sylvester-like matrix equation. By exploiting the structural properties of GMRFs, this matrix equation is solved columnwise in an analytical way. The proposed algorithm can be embedded into a wide range of proximal algorithms to solve LS regression problems including a convex penalty. Experiments carried out in the case of a constrained LS regression problem arising in a multichannel image processing application provide evidence that an alternating direction method of multipliers performs quite efficiently in this context.
Submitted 9 October, 2017; v1 submitted 18 September, 2017;
originally announced September 2017.
-
A Variational Bayesian Approach for Image Restoration. Application to Image Deblurring with Poisson-Gaussian Noise
Authors:
Yosra Marnissi,
Yuling Zheng,
Emilie Chouzenoux,
Jean-Christophe Pesquet
Abstract:
In this paper, a methodology is investigated for signal recovery in the presence of non-Gaussian noise. In contrast with regularized minimization approaches often adopted in the literature, in our algorithm the regularization parameter is reliably estimated from the observations. As the posterior density of the unknown parameters is analytically intractable, the estimation problem is derived in a variational Bayesian framework where the goal is to provide a good approximation to the posterior distribution in order to compute posterior mean estimates. Moreover, a majorization technique is employed to circumvent the difficulties raised by the intricate forms of the non-Gaussian likelihood and of the prior density. We demonstrate the potential of the proposed approach through comparisons with state-of-the-art techniques that are specifically tailored to signal recovery in the presence of mixed Poisson-Gaussian noise. Results show that the proposed approach is efficient and achieves performance comparable with other methods where the regularization parameter is manually tuned from the ground truth.
Submitted 20 January, 2017; v1 submitted 24 October, 2016;
originally announced October 2016.
-
PALMA, an improved algorithm for DOSY signal processing
Authors:
Afef Cherni,
Emilie Chouzenoux,
Marc-André Delsuc
Abstract:
NMR is a tool of choice for measuring diffusion coefficients of species in solution. The DOSY experiment, a 2D implementation of this measurement, has proven to be particularly useful for the study of complex mixtures, molecular interactions, polymers, etc. However, DOSY data analysis requires resorting to an inverse Laplace transform, in particular for polydisperse samples. This is a numerical task known to be difficult, for which we present here a novel approach. A new algorithm based on a splitting scheme and on the use of proximity operators is introduced. Used in conjunction with a Maximum Entropy and $\ell_1$ hybrid regularisation, this algorithm converges rapidly and produces results robust against experimental noise. This method has been called PALMA. It is able to reproduce faithfully monodisperse as well as polydisperse systems, and numerous simulated and experimental examples are presented. It has been implemented on the server http://palma.labo.igbmc.fr where users can have their datasets processed automatically.
Submitted 21 December, 2016; v1 submitted 25 August, 2016;
originally announced August 2016.
-
Convergence Rate Analysis of the Majorize-Minimize Subspace Algorithm -- Extended Version
Authors:
Emilie Chouzenoux,
Jean-Christophe Pesquet
Abstract:
State-of-the-art methods for solving smooth optimization problems are nonlinear conjugate gradient, low memory BFGS, and Majorize-Minimize (MM) subspace algorithms. The MM subspace algorithm, which has been introduced more recently, has shown good practical performance when compared with other methods on various optimization problems arising in signal and image processing. However, to the best of our knowledge, no general result exists concerning the theoretical convergence rate of the MM subspace algorithm. This paper aims at deriving such convergence rates both for batch and online versions of the algorithm and, in particular, discusses the influence of the choice of the subspace.
Submitted 19 July, 2016; v1 submitted 23 March, 2016;
originally announced March 2016.