Search | arXiv e-print repository

Transductive Zero-Shot and Few-Shot CLIP

Authors: Ségolène Martin, Yunshi Huang, Fereshteh Shakeri, Jean-Christophe Pesquet, Ismail Ben Ayed

Abstract: Transductive inference has been widely investigated in few-shot image classification, but completely overlooked in the recent, fast-growing literature on adapting vision-language models like CLIP. This paper addresses the transductive zero-shot and few-shot CLIP classification challenge, in which inference is performed jointly across a mini-batch of unlabeled query samples, rather than treating each instance independently. We initially construct informative vision-text probability features, leading to a classification problem on the unit simplex set. Inspired by Expectation-Maximization (EM), our optimization-based classification objective models the data probability distribution for each class using a Dirichlet law. The minimization problem is then tackled with a novel block Majorization-Minimization algorithm, which simultaneously estimates the distribution parameters and class assignments. Extensive numerical experiments on 11 datasets underscore the benefits and efficacy of our batch inference approach. On zero-shot tasks with test batches of 75 samples, our approach yields a nearly 20% improvement in ImageNet accuracy over CLIP's zero-shot performance. Additionally, we outperform state-of-the-art methods in the few-shot setting. The code is available at: https://github.com/SegoleneMartin/transductive-CLIP.

Submitted 8 April, 2024; originally announced May 2024.

Comments: 2024 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jun 2024, Seattle (USA), Washington, United States
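
The "vision-text probability features" mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the features are a softmax over scaled cosine similarities between image and text embeddings, which the abstract does not spell out. The function name, the temperature of 100, the embedding size of 512 and the 10 classes are all illustrative assumptions; only the batch size of 75 comes from the abstract.

```python
import numpy as np

def vision_text_probabilities(image_emb, text_emb, temperature=100.0):
    """Map each image embedding to a point on the unit simplex.

    image_emb: (n_images, d) array; text_emb: (n_classes, d) array.
    The temperature value is an assumption made for illustration.
    """
    # L2-normalise so that dot products are cosine similarities.
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = temperature * img @ txt.T
    # Numerically stable softmax over the class axis.
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

# Random stand-ins for real CLIP embeddings: a batch of 75 queries, 10 classes.
rng = np.random.default_rng(0)
features = vision_text_probabilities(rng.normal(size=(75, 512)),
                                     rng.normal(size=(10, 512)))
```

Each row of `features` is non-negative and sums to one, so the batch lives on the unit simplex, where a Dirichlet model per class is a natural choice.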

arXiv:2404.04983 [pdf]

doi 10.1016/j.jhepr.2024.101008

Primary liver cancer classification from routine tumour biopsy using weakly supervised deep learning

Authors: Aurélie Beaufrère, Nora Ouzir, Paul Emile Zafar, Astrid Laurent-Bellue, Miguel Albuquerque, Gwladys Lubuela, Jules Grégory, Catherine Guettier, Kévin Mondet, Jean-Christophe Pesquet, Valérie Paradis

Abstract: The diagnosis of primary liver cancers (PLCs) can be challenging, especially on biopsies and for combined hepatocellular-cholangiocarcinoma (cHCC-CCA). We automatically classified PLCs on routine-stained biopsies using a weakly supervised learning method. Weak tumour/non-tumour annotations served as labels for training a ResNet18 neural network, and the network's last convolutional layer was used to extract new tumour tile features. Without knowledge of the precise labels of the malignancies, we then applied an unsupervised clustering algorithm. Our model identified specific features of hepatocellular carcinoma (HCC) and intrahepatic cholangiocarcinoma (iCCA). Despite no specific features of cHCC-CCA being recognized, the identification of HCC and iCCA tiles within a slide could facilitate the diagnosis of primary liver cancers, particularly cHCC-CCA. Method and results: 166 PLC biopsies were divided into training, internal and external validation sets: 90, 29 and 47 samples. Two liver pathologists reviewed each whole-slide hematein eosin saffron (HES)-stained image (WSI). After annotating the tumour/non-tumour areas, 256x256 pixel tiles were extracted from the WSIs and used to train a ResNet18. The network was used to extract new tile features. An unsupervised clustering algorithm was then applied to the new tile features. In a two-cluster model, Clusters 0 and 1 contained mainly HCC and iCCA histological features. The diagnostic agreement between the pathological diagnosis and the model predictions in the internal and external validation sets was 100% (11/11) and 96% (25/26) for HCC and 78% (7/9) and 87% (13/15) for iCCA, respectively. For cHCC-CCA, we observed a highly variable proportion of tiles from each cluster (Cluster 0: 5-97%; Cluster 1: 2-94%).

Submitted 7 April, 2024; originally announced April 2024.

Comments: https://www.sciencedirect.com/science/article/pii/S2589555924000090

ACM Class: I.5; I.4; I.2

Journal ref: JHEP Reports, Volume 6, Issue 3, 2024

arXiv:2404.00390 [pdf, other]

Learning truly monotone operators with applications to nonlinear inverse problems

Authors: Younes Belkouchi, Jean-Christophe Pesquet, Audrey Repetti, Hugues Talbot

Abstract: This article introduces a novel approach to learning monotone neural networks through a newly defined penalization loss. The proposed method is particularly effective in solving classes of variational problems, specifically monotone inclusion problems, commonly encountered in image processing tasks. The Forward-Backward-Forward (FBF) algorithm is employed to address these problems, offering a solution even when the Lipschitz constant of the neural network is unknown. Notably, the FBF algorithm provides convergence guarantees under the condition that the learned operator is monotone. Building on plug-and-play methodologies, our objective is to apply these newly learned operators to solving non-linear inverse problems. To achieve this, we initially formulate the problem as a variational inclusion problem. Subsequently, we train a monotone neural network to approximate an operator that may not inherently be monotone. Leveraging the FBF algorithm, we then show simulation examples where the non-linear inverse problem is successfully solved.

Submitted 30 March, 2024; originally announced April 2024.
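
The Forward-Backward-Forward iteration named in the abstract above can be sketched on a toy inclusion 0 ∈ A(x) + B(x). This is a generic illustration of Tseng's scheme with a fixed step size and a known Lipschitz constant, not the paper's setting: a small skew-symmetric linear map stands in for the learned monotone network, A is the normal cone of a box, and every name and constant below is an assumption.

```python
import numpy as np

def forward_backward_forward(B, resolvent_A, x0, gamma, n_iter=500):
    """Tseng's forward-backward-forward iteration for 0 in A(x) + B(x)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        bx = B(x)
        y = resolvent_A(x - gamma * bx, gamma)  # forward step, then backward step
        x = y - gamma * (B(y) - bx)             # second forward (correction) step
    return x

# Toy monotone, 1-Lipschitz operator: a skew-symmetric linear map plus a shift.
M = np.array([[0.0, 1.0], [-1.0, 0.0]])
q = np.array([1.0, -2.0])

def B(x):
    return M @ x + q

# A is the normal cone of the box [-10, 10]^2; its resolvent is the projection.
def project_box(z, gamma):
    return np.clip(z, -10.0, 10.0)

# The step size must lie in (0, 1/L); here L = 1, so gamma = 0.5 is admissible.
x_sol = forward_backward_forward(B, project_box, x0=np.zeros(2), gamma=0.5)
```

On this example the zero of B lies inside the box at (-2, -1), and `x_sol` approaches it. A plain forward-backward iteration would not converge here, since a skew-symmetric operator is monotone but not cocoercive.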

arXiv:2312.07479 [pdf, ps, other]

Convex Parameter Estimation of Perturbed Multivariate Generalized Gaussian Distributions

Authors: Nora Ouzir, Frédéric Pascal, Jean-Christophe Pesquet

Abstract: The multivariate generalized Gaussian distribution (MGGD), also known as the multivariate exponential power (MEP) distribution, is widely used in signal and image processing. However, estimating MGGD parameters, which is required in practical applications, still faces specific theoretical challenges. In particular, establishing convergence properties for the standard fixed-point approach when both the distribution mean and the scatter (or the precision) matrix are unknown is still an open problem. In robust estimation, imposing classical constraints on the precision matrix, such as sparsity, has been limited by the non-convexity of the resulting cost function. This paper tackles these issues from an optimization viewpoint by proposing a convex formulation with well-established convergence properties. We embed our analysis in a noisy scenario where robustness is induced by modelling multiplicative perturbations. The resulting framework is flexible as it combines a variety of regularizations for the precision matrix, the mean and model perturbations. This paper presents proof of the desired theoretical properties, specifies the conditions preserving these properties for different regularization choices and designs a general proximal primal-dual optimization strategy. The experiments show a more accurate precision and covariance matrix estimation with similar performance for the mean vector parameter compared to Tyler's M-estimator. In a high-dimensional setting, the proposed method outperforms the classical GLASSO, one of its robust extensions, and the regularized Tyler's estimator.

Submitted 12 December, 2023; originally announced December 2023.

arXiv:2311.18386 [pdf, other]

A Novel Variational Approach for Multiphoton Microscopy Image Restoration: from PSF Estimation to 3D Deconvolution

Authors: Julien Ajdenbaum, Emilie Chouzenoux, Claire Lefort, Ségolène Martin, Jean-Christophe Pesquet

Abstract: In multi-photon microscopy (MPM), a recent in-vivo fluorescence microscopy system, the task of image restoration can be decomposed into two interlinked inverse problems: firstly, the characterization of the Point Spread Function (PSF) and subsequently, the deconvolution (i.e., deblurring) to remove the PSF effect, and reduce noise. The acquired MPM image quality is critically affected by PSF blurring and intense noise. The PSF in MPM is highly spread in 3D and is not well characterized, presenting high variability with respect to the observed objects. This makes the restoration of MPM images challenging. Common PSF estimation methods in fluorescence microscopy, including MPM, involve capturing images of sub-resolution beads, followed by quantifying the resulting ellipsoidal 3D spot. In this work, we revisit this approach, coping with its inherent limitations in terms of accuracy and practicality. We estimate the PSF from the observation of relatively large beads (approximately 1 μm in diameter). This goes through the formulation and resolution of an original non-convex minimization problem, for which we propose a proximal alternating method along with convergence guarantees. Following the PSF estimation step, we then introduce an innovative strategy to deal with the high-level multiplicative noise degrading the acquisitions. We rely on a heteroscedastic noise model for which we estimate the parameters. We then solve a constrained optimization problem to restore the image, accounting for the estimated PSF and noise, while allowing a minimal hyper-parameter tuning. Theoretical guarantees are given for the restoration algorithm. These algorithmic contributions lead to an end-to-end pipeline for 3D image restoration in MPM, that we share as a publicly available Python software. We demonstrate its effectiveness through several experiments on both simulated and real data.

Submitted 30 November, 2023; originally announced November 2023.

arXiv:2311.17740 [pdf, other]

A transductive few-shot learning approach for classification of digital histopathological slides from liver cancer

Authors: Aymen Sadraoui, Ségolène Martin, Eliott Barbot, Astrid Laurent-Bellue, Jean-Christophe Pesquet, Catherine Guettier, Ismail Ben Ayed

Abstract: This paper presents a new approach for classifying 2D histopathology patches using few-shot learning. The method is designed to tackle a significant challenge in histopathology, which is the limited availability of labeled data. By applying a sliding window technique to histopathology slides, we illustrate the practical benefits of transductive learning (i.e., making joint predictions on patches) to achieve consistent and accurate classification. Our approach involves an optimization-based strategy that actively penalizes the prediction of a large number of distinct classes within each window. We conducted experiments on histopathological data to classify tissue classes in digital slides of liver cancer, specifically hepatocellular carcinoma. The initial results show the effectiveness of our method and its potential to enhance the process of automated cancer diagnosis and treatment, all while reducing the time and effort required for expert annotation.

Submitted 11 March, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

Journal ref: ISBI 2024 - 21st IEEE International Symposium on Biomedical Imaging, May 2024, Athens, Greece

arXiv:2310.06402 [pdf, other]

Solution of Mismatched Monotone+Lipschitz Inclusion Problems

Authors: Emilie Chouzenoux, Jean-Christophe Pesquet, Fernando Roldán

Abstract: In this article, we study the convergence of algorithms for solving monotone inclusions in the presence of adjoint mismatch. The adjoint mismatch arises when the adjoint of a linear operator is replaced by an approximation, due to computational or physical issues. This occurs in inverse problems, particularly in computed tomography. In real Hilbert spaces, monotone inclusion problems involving a maximally $ρ$-monotone operator, a cocoercive operator, and a Lipschitzian operator can be solved by the Forward-Backward-Half-Forward and the Forward-Douglas-Rachford-Forward methods. We investigate the case of a mismatched Lipschitzian operator. We propose variants of the two aforementioned methods to cope with the mismatch, and establish conditions under which the weak convergence to a solution is guaranteed for these variants. The proposed algorithms hence enable each iteration to be implemented with a possibly iteration-dependent approximation to the mismatch operator, thus allowing this operator to be modified at each iteration. Finally, we present numerical experiments on a computed tomography example in material science, showing the applicability of our theoretical findings.

Submitted 9 November, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

MSC Class: 47H05; 47H10; 65K05; 90C25

arXiv:2310.05566 [pdf, ps, other]

Aggregated f-average Neural Network for Interpretable Ensembling

Authors: Mathieu Vu, Emilie Chouzenoux, Jean-Christophe Pesquet, Ismail Ben Ayed

Abstract: Ensemble learning leverages multiple models (i.e., weak learners) on a common machine learning task to enhance prediction performance. Basic ensembling approaches average the weak learners' outputs, while more sophisticated ones stack a machine learning model in between the weak learners' outputs and the final prediction. This work fuses both aforementioned frameworks. We introduce an aggregated f-average (AFA) shallow neural network which models and combines different types of averages to perform an optimal aggregation of the weak learners' predictions. We emphasise its interpretable architecture and simple training strategy, and illustrate its good performance on the problem of few-shot class incremental learning.

Submitted 30 November, 2023; v1 submitted 9 October, 2023; originally announced October 2023.

Comments: 6 pages
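
As a small illustration of "different types of averages", the sketch below assumes that an f-average is the quasi-arithmetic mean f⁻¹(Σᵢ wᵢ f(xᵢ)). The abstract above does not define the term, so this reading, the function name and the toy predictions are all assumptions rather than the paper's AFA network.

```python
import numpy as np

def f_average(values, f, f_inv, weights=None):
    """Quasi-arithmetic mean f_inv(sum_i w_i * f(x_i)) along the first axis."""
    values = np.asarray(values, dtype=float)
    if weights is None:
        # Default to uniform weights over the weak learners.
        weights = np.full(values.shape[0], 1.0 / values.shape[0])
    weights = np.asarray(weights, dtype=float)
    return f_inv(np.tensordot(weights, f(values), axes=1))

# Three hypothetical weak learners giving the probability of one class.
preds = np.array([0.2, 0.4, 0.8])
arithmetic = f_average(preds, lambda x: x, lambda y: y)
geometric = f_average(preds, np.log, np.exp)
harmonic = f_average(preds, lambda x: 1.0 / x, lambda y: 1.0 / y)
```

Changing `f` moves between the arithmetic, geometric and harmonic means with the same code path. That is one way to see why a network that learns how to mix such averages stays easy to interpret.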

arXiv:2308.16858 [pdf, other]

Majorization-Minimization for sparse SVMs

Authors: Alessandro Benfenati, Emilie Chouzenoux, Giorgia Franchini, Salla Latva-Aijo, Dominik Narnhofer, Jean-Christophe Pesquet, Sebastian J. Scott, Mahsa Yousefi

Abstract: Several decades ago, Support Vector Machines (SVMs) were introduced for performing binary classification tasks, under a supervised framework. Nowadays, they often outperform other supervised methods and remain one of the most popular approaches in the machine learning arena. In this work, we investigate the training of SVMs through a smooth sparse-promoting-regularized squared hinge loss minimization. This choice paves the way to the application of quick training methods built on majorization-minimization approaches, benefiting from the Lipschitz differentiability of the loss function. Moreover, the proposed approach allows us to handle sparsity-preserving regularizers promoting the selection of the most significant features, thus enhancing the performance. Numerical tests and comparisons conducted on three different datasets demonstrate the good performance of the proposed methodology in terms of qualitative metrics (accuracy, precision, recall, and F1 score) as well as computational cost.

Submitted 31 August, 2023; originally announced August 2023.
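
The link between Lipschitz differentiability and majorization-minimization mentioned in the abstract above can be sketched as follows. A function with an L-Lipschitz gradient is majorized by a quadratic of curvature L, and minimizing that majorant gives a gradient step of size 1/L. The hyperbolic penalty sqrt(w² + δ²) used as the smooth sparsity-promoting term, the bias-free linear classifier, the synthetic data and all parameter values are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def objective(w, X, y, lam, delta):
    # Squared hinge loss plus a smoothed l1 (hyperbolic) penalty.
    margins = np.maximum(0.0, 1.0 - y * (X @ w))
    return np.sum(margins ** 2) + lam * np.sum(np.sqrt(w ** 2 + delta ** 2))

def gradient(w, X, y, lam, delta):
    margins = np.maximum(0.0, 1.0 - y * (X @ w))
    return -2.0 * X.T @ (margins * y) + lam * w / np.sqrt(w ** 2 + delta ** 2)

def mm_sparse_svm(X, y, lam=1.0, delta=0.1, n_iter=200):
    # Lipschitz constant of the gradient: 2 * ||X||_2^2 for the squared hinge
    # term plus lam / delta for the smoothed l1 term.
    L = 2.0 * np.linalg.norm(X, 2) ** 2 + lam / delta
    w = np.zeros(X.shape[1])
    history = [objective(w, X, y, lam, delta)]
    for _ in range(n_iter):
        # Minimising the quadratic majorant at w is a gradient step of size 1/L.
        w = w - gradient(w, X, y, lam, delta) / L
        history.append(objective(w, X, y, lam, delta))
    return w, history

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
y = np.sign(X[:, 0] + 0.5 * X[:, 1])  # labels depend on two features only
w, history = mm_sparse_svm(X, y)
```

Because every step minimizes a majorant that touches the objective at the current iterate, the values in `history` never increase. Tighter majorants than this global quadratic bound generally converge faster.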

arXiv:2306.11679 [pdf, other]

A primal-dual data-driven method for computational optical imaging with a photonic lantern

Authors: Carlos Santos Garcia, Mathilde Larchevêque, Solal O'Sullivan, Martin Van Waerebeke, Robert R. Thomson, Audrey Repetti, Jean-Christophe Pesquet

Abstract: Optical fibres aim to image in-vivo biological processes. In this context, high spatial resolution and stability to fibre movements are key to enable decision-making processes (e.g., for microendoscopy). Recently, a single-pixel imaging technique based on a multicore fibre photonic lantern has been designed, named computational optical imaging using a lantern (COIL). A proximal algorithm based on a sparsity prior, dubbed SARA-COIL, has been further proposed to solve the associated inverse problem, to enable image reconstructions for high resolution COIL microendoscopy. In this work, we develop a data-driven approach for COIL. We replace the sparsity prior in the proximal algorithm by a learned denoiser, leading to a plug-and-play (PnP) algorithm. The resulting PnP method, based on a proximal primal-dual algorithm, makes it possible to solve the Morozov formulation of the inverse problem. We use recent results in learning theory to train a network with desirable Lipschitz properties, and we show that the resulting primal-dual PnP algorithm converges to a solution to a monotone inclusion problem. Our simulations highlight that the proposed data-driven approach improves the reconstruction quality over the variational SARA-COIL method on both simulated and real data.

Submitted 17 April, 2024; v1 submitted 20 June, 2023; originally announced June 2023.

arXiv:2210.15064 [pdf, ps, other]

A Variational Inequality Model for Learning Neural Networks

Authors: Patrick L. Combettes, Jean-Christophe Pesquet, Audrey Repetti

Abstract: Neural networks have become ubiquitous tools for solving signal and image processing problems, and they often outperform standard approaches. Nevertheless, training neural networks is a challenging task in many applications. The prevalent training procedure consists of minimizing highly non-convex objectives based on data sets of huge dimension. In this context, current methodologies are not guaranteed to produce global solutions. We present an alternative approach which foregoes the optimization framework and adopts a variational inequality formalism. The associated algorithm guarantees convergence of the iterates to a true solution of the variational inequality and it possesses an efficient block-iterative structure. A numerical application is presented.

Submitted 26 October, 2022; originally announced October 2022.

arXiv:2210.14545 [pdf, other]

Towards Practical Few-Shot Query Sets: Transductive Minimum Description Length Inference

Authors: Ségolène Martin, Malik Boudiaf, Emilie Chouzenoux, Jean-Christophe Pesquet, Ismail Ben Ayed

Abstract: Standard few-shot benchmarks are often built upon simplifying assumptions on the query sets, which may not always hold in practice. In particular, for each task at testing time, the classes effectively present in the unlabeled query set are known a priori, and correspond exactly to the set of classes represented in the labeled support set. We relax these assumptions and extend current benchmarks, so that the query-set classes of a given task are unknown, but just belong to a much larger set of possible classes. Our setting could be viewed as an instance of the challenging yet practical problem of extremely imbalanced K-way classification, K being much larger than the values typically used in standard benchmarks, and with potentially irrelevant supervision from the support set. Expectedly, our setting incurs drops in the performances of state-of-the-art methods. Motivated by these observations, we introduce a PrimAl Dual Minimum Description LEngth (PADDLE) formulation, which balances data-fitting accuracy and model complexity for a given few-shot task, under supervision constraints from the support set. Our constrained MDL-like objective promotes competition among a large set of possible classes, preserving only effective classes that befit better the data of a few-shot task. It is hyperparameter free, and could be applied on top of any base-class training. Furthermore, we derive a fast block coordinate descent algorithm for optimizing our objective, with convergence guarantee, and a linear computational complexity at each iteration. Comprehensive experiments over the standard few-shot datasets and the more realistic and challenging i-Nat dataset show highly competitive performances of our method, more so when the numbers of possible classes in the tasks increase. Our code is publicly available at https://github.com/SegoleneMartin/PADDLE.

Submitted 26 October, 2022; originally announced October 2022.

arXiv:2210.00993 [pdf, other]

Efficient Bayes Inference in Neural Networks through Adaptive Importance Sampling

Authors: Yunshi Huang, Emilie Chouzenoux, Victor Elvira, Jean-Christophe Pesquet

Abstract: Bayesian neural networks (BNNs) have received an increased interest in recent years. In BNNs, a complete posterior distribution of the unknown weight and bias parameters of the network is produced during the training stage. This probabilistic estimation offers several advantages with respect to point-wise estimates, in particular, the ability to provide uncertainty quantification when predicting new data. This feature, inherent to the Bayesian paradigm, is useful in countless machine learning applications. It is particularly appealing in areas where decision-making has a crucial impact, such as medical healthcare or autonomous driving. The main challenge of BNNs is the computational cost of the training procedure since Bayesian techniques often face a severe curse of dimensionality. Adaptive importance sampling (AIS) is one of the most prominent Monte Carlo methodologies benefiting from sound convergence guarantees and ease of adaptation. This work aims to show that AIS constitutes a successful approach for designing BNNs. More precisely, we propose a novel algorithm PMCnet that includes an efficient adaptation mechanism, exploiting geometric information on the complex (often multimodal) posterior distribution. Numerical results illustrate the excellent performance and the improved exploration capabilities of the proposed method for both shallow and deep neural networks.

Submitted 13 April, 2023; v1 submitted 3 October, 2022; originally announced October 2022.
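
For readers unfamiliar with adaptive importance sampling, the sketch below shows the basic sample, weight and adapt loop on a one-dimensional toy target. It is a generic moment-matching scheme, not the PMCnet algorithm of the abstract above. The Gaussian target standing in for a posterior, the fixed proposal width and all names and values are assumptions.

```python
import numpy as np

def log_target(x):
    # Unnormalised log-density of a toy 1-D "posterior": Gaussian N(3, 1).
    return -0.5 * (x - 3.0) ** 2

def adaptive_importance_sampling(n_iter=10, n_samples=2000, sigma=2.0, seed=0):
    rng = np.random.default_rng(seed)
    mu = 0.0  # initial proposal mean, deliberately far from the target mean
    for _ in range(n_iter):
        x = rng.normal(mu, sigma, size=n_samples)
        # Log importance weights: target over proposal, up to additive constants.
        log_w = log_target(x) + 0.5 * ((x - mu) / sigma) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()  # self-normalisation
        # Adaptation: move the proposal mean to the weighted sample mean.
        mu = float(np.sum(w * x))
    return mu

posterior_mean = adaptive_importance_sampling()
```

After a few iterations the proposal is centred near the target mean of 3. In a BNN the same loop would run over the weight space, where the adaptation rule matters far more because of the high dimension.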

arXiv:2209.13264 [pdf, other]

Deep Unfolding of the DBFB Algorithm with Application to ROI CT Imaging with Limited Angular Density

Authors: Marion Savanier, Emilie Chouzenoux, Jean-Christophe Pesquet, Cyril Riddell

Abstract: This paper presents a new method for reconstructing regions of interest (ROI) from a limited number of computed tomography (CT) measurements. Classical model-based iterative reconstruction methods lead to images with predictable features. Still, they often suffer from tedious parameterization and slow convergence. On the contrary, deep learning methods are fast, and they can reach high reconstruction quality by leveraging information from large datasets, but they lack interpretability. At the crossroads of both methods, deep unfolding networks have been recently proposed. Their design includes the physics of the imaging system and the steps of an iterative optimization algorithm. Motivated by the success of these networks for various applications, we introduce an unfolding neural network called U-RDBFB designed for ROI CT reconstruction from limited data. Few-view truncated data are effectively handled thanks to a robust non-convex data fidelity term combined with a sparsity-inducing regularization function. We unfold the Dual Block coordinate Forward-Backward (DBFB) algorithm, embedded in an iterative reweighted scheme, allowing the learning of key parameters in a supervised manner. Our experiments show an improvement over several state-of-the-art methods, including a model-based iterative scheme, a multi-scale deep learning architecture, and other deep unfolding methods.

Submitted 17 May, 2023; v1 submitted 27 September, 2022; originally announced September 2022.

arXiv:2209.01376 [pdf, other]

A CNC approach for Directional Total Variation

Authors: Gabriele Scrivanti, Emilie Chouzenoux, Jean-Christophe Pesquet

Abstract: The core of many approaches for the resolution of variational inverse problems arising in signal and image processing consists of promoting the sought solution to have a sparse representation in a well-suited space. A crucial task in this context is the choice of a good sparsity prior that can ensure a good trade-off between the quality of the solution and the resulting computational cost. The recently introduced Convex-Non-Convex (CNC) strategy appears as a great compromise, as it combines the high qualitative performance of non-convex sparsity-promoting functions with the convenience of dealing with convex optimization problems. This work proposes a new variational formulation to implement the CNC approach in the context of image denoising. By suitably exploiting duality properties, our formulation makes it possible to encompass sophisticated directional total variation (DTV) priors. We additionally propose an efficient optimisation strategy for the resulting convex minimisation problem. We illustrate on numerical examples the good performance of the resulting CNC-DTV method, when compared to the standard convex total variation denoiser.

Submitted 3 September, 2022; originally announced September 2022.

Comments: Accepted for EUSIPCO 2022 - 30th European Signal Processing Conference, Aug 2022, Belgrade, Serbia

arXiv:2209.01375 [pdf, other]

A Variational Approach for Joint Image Recovery and Feature Extraction Based on Spatially-Varying Generalised Gaussian Models

Authors: Emilie Chouzenoux, Marie-Caroline Corbineau, Jean-Christophe Pesquet, Gabriele Scrivanti

Abstract: The joint problem of reconstruction / feature extraction is a challenging task in image processing. It consists in performing, in a joint manner, the restoration of an image and the extraction of its features. In this work, we firstly propose a novel nonsmooth and non-convex variational formulation of the problem. For this purpose, we introduce a versatile generalised Gaussian prior whose parameters, including its exponent, are space-variant. Secondly, we design an alternating proximal-based optimisation algorithm that efficiently exploits the structure of the proposed non-convex objective function. We also analyse the convergence of this algorithm. As shown in numerical experiments conducted on joint deblurring/segmentation tasks, the proposed method provides high-quality results.

Submitted 5 March, 2024; v1 submitted 3 September, 2022; originally announced September 2022.

arXiv:2206.07179 [pdf, other]

Proximal Splitting Adversarial Attacks for Semantic Segmentation

Authors: Jérôme Rony, Jean-Christophe Pesquet, Ismail Ben Ayed

Abstract: Classification has been the focal point of research on adversarial attacks, but only a few works investigate methods suited to denser prediction tasks, such as semantic segmentation. The methods proposed in these works do not accurately solve the adversarial segmentation problem and, therefore, overestimate the size of the perturbations required to fool models. Here, we propose a white-box attack for these models based on a proximal splitting to produce adversarial perturbations with much smaller $\ell_\infty$ norms. Our attack can handle large numbers of constraints within a nonconvex minimization framework via an Augmented Lagrangian approach, coupled with adaptive constraint scaling and masking strategies. We demonstrate that our attack significantly outperforms previously proposed ones, as well as classification attacks that we adapted for segmentation, providing a first comprehensive benchmark for this dense task.

Submitted 31 March, 2023; v1 submitted 14 June, 2022; originally announced June 2022.

Comments: CVPR 2023. Code available at: https://github.com/jeromerony/alma_prox_segmentation

arXiv:2110.07202 [pdf, ps, other]

Unrolled Variational Bayesian Algorithm for Image Blind Deconvolution

Authors: Yunshi Huang, Emilie Chouzenoux, Jean-Christophe Pesquet

Abstract: In this paper, we introduce a variational Bayesian algorithm (VBA) for image blind deconvolution. Our generic framework incorporates smoothness priors on the unknown blur/image and possible affine constraints (e.g., sum to one) on the blur kernel. One of our main contributions is the integration of VBA within a neural network paradigm, following an unrolling methodology. The proposed architecture is trained in a supervised fashion, which allows us to optimally set two key hyperparameters of the VBA model and leads to further improvements in terms of resulting visual quality. Various experiments involving grayscale/color images and diverse kernel shapes are performed. The numerical examples illustrate the high performance of our approach when compared to state-of-the-art techniques based on optimization, Bayesian estimation, or deep learning.

Submitted 14 October, 2021; originally announced October 2021.

Comments: 13 pages

arXiv:2105.15044 [pdf, other]

Inversion of Integral Models: a Neural Network Approach

Authors: Emilie Chouzenoux, Cecile Della Valle, Jean-Christophe Pesquet

Abstract: We introduce a neural network architecture to solve inverse problems linked to a one-dimensional integral operator. This architecture is built by unfolding a forward-backward algorithm derived from the minimization of an objective function which consists of the sum of a data-fidelity function and a Tikhonov-type regularization function. The robustness of this inversion method with respect to a perturbation of the input is theoretically analyzed. Ensuring robustness is consistent with inverse problem theory since it guarantees both the continuity of the inversion method and its insensitivity to small noise. The latter is a critical property as deep neural networks have been shown to be vulnerable to adversarial perturbations. One of the main novelties of our work is to show that the proposed network is also robust to perturbations of its bias. In our architecture, the bias accounts for the observed data in the inverse problem. We apply our method to the inversion of Abel integral operators, which define a fractional integration involved in a wide range of physical processes. The neural network is numerically implemented and tested to illustrate the efficiency of the method. Lipschitz constants after training are computed to measure the robustness of the neural networks.

Submitted 31 May, 2021; originally announced May 2021.

Comments: 35 pages, 10 figures

arXiv:2104.10329 [pdf, ps, other]

Deep Transform and Metric Learning Networks

Authors: Wen Tang, Emilie Chouzenoux, Jean-Christophe Pesquet, Hamid Krim

Abstract: Based on its great successes in inference and denoising tasks, Dictionary Learning (DL) and its related sparse optimization formulations have garnered a lot of research interest. While most solutions have focused on single layer dictionaries, the recently improved Deep DL methods have also fallen short on a number of issues. We hence propose a novel Deep DL approach where each DL layer can be formulated and solved as a combination of one linear layer and a Recurrent Neural Network, where the RNN is flexibly regarded as a layer-associated learned metric. Our proposed work unveils new insights into the connections between Neural Networks and Deep DL, and provides a novel, efficient and competitive approach to jointly learn the deep transforms and metrics. Extensive experiments are carried out to demonstrate that the proposed method can not only outperform existing Deep DL, but also state-of-the-art generic Convolutional Neural Networks.

Submitted 20 April, 2021; originally announced April 2021.

Comments: Accepted by ICASSP 2021. arXiv admin note: substantial text overlap with arXiv:2002.07898

arXiv:2012.13247 [pdf, other]

Learning Maximally Monotone Operators for Image Recovery

Authors: Jean-Christophe Pesquet, Audrey Repetti, Matthieu Terris, Yves Wiaux

Abstract: We introduce a new paradigm for solving regularized variational problems. These are typically formulated to address ill-posed inverse problems encountered in signal and image processing. The objective function is traditionally defined by adding a regularization function to a data fit term, which is subsequently minimized by using iterative optimization algorithms. Recently, several works have proposed to replace the operator related to the regularization by a more sophisticated denoiser. These approaches, known as plug-and-play (PnP) methods, have shown excellent performance. Although it has been noticed that, under some Lipschitz properties on the denoisers, the convergence of the resulting algorithm is guaranteed, little is known about characterizing the asymptotically delivered solution. In the current article, we propose to address this limitation. More specifically, instead of employing a functional regularization, we perform an operator regularization, where a maximally monotone operator (MMO) is learned in a supervised manner. This formulation is flexible as it allows the solution to be characterized through a broad range of variational inequalities, and it includes convex regularizations as special cases. From an algorithmic standpoint, the proposed approach consists in replacing the resolvent of the MMO by a neural network (NN). We present a universal approximation theorem proving that nonexpansive NNs are suitable models for the resolvent of a wide class of MMOs. The proposed approach thus provides a sound theoretical framework for analyzing the asymptotic behavior of first-order PnP algorithms. In addition, we propose a numerical strategy to train NNs corresponding to resolvents of MMOs. We apply our approach to image restoration problems and demonstrate its validity in terms of both convergence and quality.

Submitted 21 April, 2021; v1 submitted 24 December, 2020; originally announced December 2020.

MSC Class: 47H05; 90C25; 90C59; 65K10; 49M27; 68T07; 68U10; 94A08

arXiv:2010.15427 [pdf, ps, other]

doi 10.1016/j.sigpro.2020.107835

Sparse Signal Reconstruction for Nonlinear Models via Piecewise Rational Optimization

Authors: Arthur Marmin, Marc Castella, Jean-Christophe Pesquet, Laurent Duval

Abstract: We propose a method to reconstruct sparse signals degraded by a nonlinear distortion and acquired at a limited sampling rate. Our method formulates the reconstruction problem as a nonconvex minimization of the sum of a data fitting term and a penalization term. In contrast with most previous works which settle for approximated local solutions, we seek a global solution to the obtained challenging nonconvex problem. Our global approach relies on the so-called Lasserre relaxation of polynomial optimization. We here specifically include in our approach the case of piecewise rational functions, which makes it possible to address a wide class of nonconvex exact and continuous relaxations of the $\ell_0$ penalization function. Additionally, we study the complexity of the optimization problem. It is shown how to use the structure of the problem to lighten the computational burden efficiently. Finally, numerical simulations illustrate the benefits of our method in terms of both global optimality and signal reconstruction.

Submitted 25 November, 2020; v1 submitted 29 October, 2020; originally announced October 2020.

MSC Class: 46N10

ACM Class: G.1; I.6; G.1.2; G.1.6; I.4.5

Journal ref: Signal Processing, Volume 179, February 2021, 107835

arXiv:2010.05771 [pdf, other]

Modeling Electrical Motor Dynamics using Encoder-Decoder with Recurrent Skip Connection

Authors: Sagar Verma, Nicolas Henwood, Marc Castella, Francois Malrait, Jean-Christophe Pesquet

Abstract: Electrical motors are the most important source of mechanical energy in the industrial world. Their modeling traditionally relies on a physics-based approach, which aims at taking their complex internal dynamics into account. In this paper, we explore the feasibility of modeling the dynamics of an electrical motor by following a data-driven approach, which uses only its inputs and outputs and does not make any assumption on its internal behaviour. We propose a novel encoder-decoder architecture which benefits from recurrent skip connections. We also propose a novel loss function that takes into account the complexity of electrical motor quantities and helps in avoiding model bias. We show that the proposed architecture can achieve a good learning performance on our high-frequency high-variance datasets. Two datasets are considered: the first one is generated using a simulator based on the physics of an induction motor and the second one is recorded from an industrial electrical motor. We benchmark our solution using variants of traditional neural networks like feedforward, convolutional, and recurrent networks. We evaluate various design choices of our architecture and compare it to the baselines. We show the domain adaptation capability of our model to learn dynamics just from simulated data by testing it on the raw sensor data. We finally show the effect of signal complexity on the proposed method's ability to model temporal dynamics.

Submitted 8 October, 2020; originally announced October 2020.

Comments: 8 pages, AAAI2020

arXiv:2008.02260 [pdf, ps, other]

doi 10.1109/TSP.2021.3069677

Fixed Point Strategies in Data Science

Authors: Patrick L. Combettes, Jean-Christophe Pesquet

Abstract: The goal of this paper is to promote the use of fixed point strategies in data science by showing that they provide a simplifying and unifying framework to model, analyze, and solve a great variety of problems. They are seen to constitute a natural environment to explain the behavior of advanced convex optimization methods as well as of recent nonlinear methods in data science which are formulated in terms of paradigms that go beyond minimization concepts and involve constructs such as Nash equilibria or monotone inclusions. We review the pertinent tools of fixed point theory and describe the main state-of-the-art algorithms for provably convergent fixed point construction. We also incorporate additional ingredients such as stochasticity, block-implementations, and non-Euclidean metrics, which provide further enhancements. Applications to signal and image processing, machine learning, statistics, neural networks, and inverse problems are discussed.

Submitted 23 March, 2021; v1 submitted 5 August, 2020; originally announced August 2020.

arXiv:2002.07898 [pdf, other]

Deep Transform and Metric Learning Network: Wedding Deep Dictionary Learning and Neural Networks

Authors: Wen Tang, Emilie Chouzenoux, Jean-Christophe Pesquet, Hamid Krim

Abstract: On account of its many successes in inference tasks and denoising applications, Dictionary Learning (DL) and its related sparse optimization problems have garnered a lot of research interest. While most solutions have focused on single layer dictionaries, the recently proposed improved Deep DL (DDL) methods have also fallen short on a number of issues. We propose herein a novel DDL approach where each DL layer can be formulated as a combination of one linear layer and a Recurrent Neural Network (RNN). The RNN is shown to flexibly account for the layer-associated and learned metric. Our proposed work unveils new insights into Neural Networks and DDL and provides a new, efficient and competitive approach to jointly learn a deep transform and a metric for inference applications. Extensive experiments are carried out to demonstrate that the proposed method can not only outperform existing DDL but also state-of-the-art generic CNNs.

Submitted 20 October, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

arXiv:2001.08496 [pdf, ps, other]

doi 10.1109/TSP.2020.3025731

SPOQ $\ell_p$-Over-$\ell_q$ Regularization for Sparse Signal Recovery applied to Mass Spectrometry

Authors: Afef Cherni, Emilie Chouzenoux, Laurent Duval, Jean-Christophe Pesquet

Abstract: Underdetermined or ill-posed inverse problems require additional information for sound solutions with tractable optimization algorithms. Sparsity yields consequent heuristics to that matter, with numerous applications in signal restoration, image recovery, or machine learning. Since the $\ell_0$ count measure is barely tractable, many statistical or learning approaches have invested in computable proxies, such as the $\ell_1$ norm. However, the latter does not exhibit the desirable property of scale invariance for sparse data. Extending the SOOT Euclidean/Taxicab $\ell_1$-over-$\ell_2$ norm-ratio initially introduced for blind deconvolution, we propose SPOQ, a family of smoothed (approximately) scale-invariant penalty functions. It consists of a Lipschitz-differentiable surrogate for $\ell_p$-over-$\ell_q$ quasi-norm/norm ratios with $p\in\,]0,2[$ and $q\ge 2$. This surrogate is embedded into a novel majorize-minimize trust-region approach, generalizing the variable metric forward-backward algorithm. For naturally sparse mass-spectrometry signals, we show that SPOQ significantly outperforms $\ell_0$, $\ell_1$, Cauchy, Welsch, SCAD and Celo penalties on several performance measures. Guidelines on SPOQ hyperparameters tuning are also provided, suggesting simple data-driven choices.

Submitted 22 September, 2020; v1 submitted 23 January, 2020; originally announced January 2020.

Journal ref: IEEE Transactions on Signal Processing, 2020, Volume 68, pages 6070--6084
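
The scale-invariance remark in the SPOQ abstract above can be checked on a small example. The sketch below is only an illustration of the plain $\ell_1$-over-$\ell_2$ ratio, not of the smoothed SPOQ surrogate itself; the function names and the test vector are hypothetical choices made for this example.

```python
import math

def l1_norm(x):
    # Sum of absolute values (Taxicab norm).
    return sum(abs(v) for v in x)

def l2_norm(x):
    # Euclidean norm.
    return math.sqrt(sum(v * v for v in x))

def l1_over_l2(x):
    # Plain norm ratio; assumes x is not the zero vector.
    return l1_norm(x) / l2_norm(x)

# Hypothetical sparse vector and a rescaled copy of it.
x = [0.0, 3.0, 0.0, 0.0, -4.0]
scaled = [10.0 * v for v in x]

# The l1 norm grows with the scale factor ...
print(l1_norm(x), l1_norm(scaled))
# ... whereas the l1/l2 ratio is unchanged by rescaling.
print(l1_over_l2(x), l1_over_l2(scaled))
```

Rescaling a signal changes an $\ell_1$ penalty even though its sparsity pattern is identical, which is the drawback the abstract attributes to the $\ell_1$ norm; the ratio removes that dependence.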

arXiv:1904.11707 [pdf, other]

General risk measures for robust machine learning

Authors: Emilie Chouzenoux, Henri Gérard, Jean-Christophe Pesquet

Abstract: A wide array of machine learning problems are formulated as the minimization of the expectation of a convex loss function on some parameter space. Since the probability distribution of the data of interest is usually unknown, it is often estimated from training sets, which may lead to poor out-of-sample performance. In this work, we bring new insights into this problem by using the framework which has been developed in quantitative finance for risk measures. We show that the original min-max problem can be recast as a convex minimization problem under suitable assumptions. We discuss several important examples of robust formulations, in particular by defining ambiguity sets based on $\varphi$-divergences and the Wasserstein metric. We also propose an efficient algorithm for solving the corresponding convex optimization problems involving complex convex constraints. Through simulation examples, we demonstrate that this algorithm scales well on real data sets.

Submitted 24 May, 2019; v1 submitted 26 April, 2019; originally announced April 2019.

arXiv:1903.01014 [pdf, ps, other]

Lipschitz Certificates for Layered Network Structures Driven by Averaged Activation Operators

Authors: Patrick L. Combettes, Jean-Christophe Pesquet

Abstract: Obtaining sharp Lipschitz constants for feed-forward neural networks is essential to assess their robustness in the face of perturbations of their inputs. We derive such constants in the context of a general layered network model involving compositions of nonexpansive averaged operators and affine operators. By exploiting this architecture, our analysis finely captures the interactions between the layers, yielding tighter Lipschitz constants than those resulting from the product of individual bounds for groups of layers. The proposed framework is shown to cover in particular many practical instances encountered in feed-forward neural networks. Our Lipschitz constant estimates are further improved in the case of structures employing scalar nonlinear functions, which include standard convolutional networks as special cases.

Submitted 20 June, 2020; v1 submitted 3 March, 2019; originally announced March 2019.
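
As a toy illustration of why the product of per-layer bounds mentioned in the abstract above can be loose, the sketch below builds a two-layer network with diagonal weights and a ReLU activation (which is nonexpansive). It is not the estimate derived in the paper; the weights, helper names, and sample points are assumptions chosen so that the gap is easy to see.

```python
import math

def relu(v):
    # Componentwise ReLU, a nonexpansive activation.
    return [max(t, 0.0) for t in v]

def apply_diag(d, v):
    # Multiply a vector by a diagonal matrix stored as its diagonal.
    return [a * t for a, t in zip(d, v)]

def diag_norm(d):
    # The spectral norm of a diagonal matrix is its largest absolute entry.
    return max(abs(a) for a in d)

# Hypothetical layers: each has spectral norm 1, but they act on
# different coordinates, so their composition cancels out.
w1 = [1.0, 0.0]
w2 = [0.0, 1.0]

def network(v):
    return apply_diag(w2, relu(apply_diag(w1, v)))

# Naive bound: product of the layer norms.
product_bound = diag_norm(w2) * diag_norm(w1)

# Empirical slope over a few sample pairs (a lower bound on the true constant).
points = [[3.0, -2.0], [-1.0, 4.0], [0.5, 0.5], [-2.0, -2.0]]
empirical = 0.0
for p in points:
    for q in points:
        if p != q:
            num = math.dist(network(p), network(q))
            empirical = max(empirical, num / math.dist(p, q))

print(product_bound, empirical)
```

Here the network maps every input to the zero vector, so its true Lipschitz constant is 0 while the product of layer norms is 1. Accounting for how consecutive layers interact is what allows tighter constants in cases like this.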

arXiv:1812.04276 [pdf, other]

Deep Unfolding of a Proximal Interior Point Method for Image Restoration

Authors: Carla Bertocchi, Emilie Chouzenoux, Marie-Caroline Corbineau, Jean-Christophe Pesquet, Marco Prato

Abstract: Variational methods are widely applied to ill-posed inverse problems for they have the ability to embed prior knowledge about the solution. However, the level of performance of these methods significantly depends on a set of parameters, which can be estimated through computationally expensive and time-consuming methods. In contrast, deep learning offers very generic and efficient architectures, at the expense of explainability, since it is often used as a black-box, without any fine control over its output. Deep unfolding provides a convenient approach to combine variational-based and deep learning approaches. Starting from a variational formulation for image restoration, we develop iRestNet, a neural network architecture obtained by unfolding a proximal interior point algorithm. Hard constraints, encoding desirable properties for the restored image, are incorporated into the network thanks to a logarithmic barrier, while the barrier parameter, the stepsize, and the penalization weight are learned by the network. We derive explicit expressions for the gradient of the proximity operator for various choices of constraints, which allows training iRestNet with gradient descent and backpropagation. In addition, we provide theoretical results regarding the stability of the network for a common inverse problem example. Numerical experiments on image deblurring problems show that the proposed approach compares favorably with both state-of-the-art variational and machine learning methods in terms of image quality.

Submitted 21 January, 2020; v1 submitted 11 December, 2018; originally announced December 2018.

Journal ref: Inverse Problems (2019)

arXiv:1808.07526 [pdf, ps, other]

Deep Neural Network Structures Solving Variational Inequalities

Authors: Patrick L. Combettes, Jean-Christophe Pesquet

Abstract: Motivated by structures that appear in deep neural networks, we investigate nonlinear composite models alternating proximity and affine operators defined on different spaces. We first show that a wide range of activation operators used in neural networks are actually proximity operators. We then establish conditions for the averagedness of the proposed composite constructs and investigate their asymptotic properties. It is shown that the limit of the resulting process solves a variational inequality which, in general, does not derive from a minimization problem.

Submitted 15 March, 2019; v1 submitted 22 August, 2018; originally announced August 2018.
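
The statement in the abstract above that activation operators are proximity operators can be checked numerically in the simplest case. ReLU is the proximity operator of the indicator function of $[0, +\infty[$, i.e. the projection onto the nonnegative half-line. The sketch below verifies this by brute-force minimization on a grid; the grid step, search radius, and function names are assumptions made for illustration only.

```python
def relu(x):
    return max(x, 0.0)

def prox_indicator_nonneg(x, grid_step=0.001, radius=5.0):
    # Proximity operator of the indicator of [0, +inf[ evaluated at x:
    # minimize 0.5 * (y - x)**2 over y >= 0. The constraint is enforced
    # by only searching over nonnegative grid points in [0, radius].
    n = int(radius / grid_step)
    candidates = [k * grid_step for k in range(n + 1)]
    return min(candidates, key=lambda y: 0.5 * (y - x) ** 2)

# Compare the brute-force minimizer with ReLU on a few sample inputs
# (all within the search radius).
for x in [-2.0, -0.5, 0.0, 0.7, 3.2]:
    p = prox_indicator_nonneg(x)
    print(x, p, relu(x))
```

Up to the grid resolution, the minimizer coincides with `relu(x)` for each sample input.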

arXiv:1808.00724 [pdf, ps, other]

doi 10.1109/TSP.2018.2890065

Rational Optimization for Nonlinear Reconstruction with Approximate $\ell_0$ Penalization

Authors: Marc Castella, Jean-Christophe Pesquet, Arthur Marmin

Abstract: Recovering a nonlinearly degraded signal in the presence of noise is a challenging problem. In this work, this problem is tackled by minimizing the sum of a nonconvex least-squares fit criterion and a penalty term. We assume that the nonlinearity of the model can be accounted for by a rational function. In addition, we suppose that the signal to be sought is sparse and a rational approximation of the $\ell_0$ pseudo-norm thus constitutes a suitable penalization. The resulting composite cost function belongs to the broad class of semi-algebraic functions. To find a globally optimal solution to such an optimization problem, it can be transformed into a generalized moment problem, for which a hierarchy of semidefinite programming relaxations can be built. Global optimality comes at the expense of an increased dimension and, to overcome computational limitations concerning the number of involved variables, the structure of the problem has to be carefully addressed. A situation of practical interest is when the nonlinear model consists of a convolutive transform followed by a componentwise nonlinear rational saturation. We then propose to use a sparse relaxation able to deal with up to several hundreds of optimized variables. In contrast with the naive approach consisting of linearizing the model, our experiments show that the proposed approach offers good performance.

Submitted 21 December, 2018; v1 submitted 2 August, 2018; originally announced August 2018.

Journal ref: IEEE Transactions on Signal Processing, 2019

arXiv:1805.09430 [pdf, ps, other]

A Two-Stage Subspace Trust Region Approach for Deep Neural Network Training

Authors: Viacheslav Dudar, Giovanni Chierchia, Emilie Chouzenoux, Jean-Christophe Pesquet, Vladimir Semenov

Abstract: In this paper, we develop a novel second-order method for training feed-forward neural nets. At each iteration, we construct a quadratic approximation to the cost function in a low-dimensional subspace. We minimize this approximation inside a trust region through a two-stage procedure: first inside the embedded positive curvature subspace, followed by a gradient descent step. This approach leads to a fast objective function decay, prevents convergence to saddle points, and alleviates the need for manually tuning parameters. We show the good performance of the proposed algorithm on benchmark datasets.

Submitted 23 May, 2018; originally announced May 2018.

Comments: EUSIPCO 2017

arXiv:1801.07452 [pdf, other]

doi 10.1016/j.sigpro.2019.107417

A Proximal Approach for a Class of Matrix Optimization Problems

Authors: A. Benfenati, E. Chouzenoux, J. -C. Pesquet

Abstract: In recent years, there has been a growing interest in mathematical models leading to the minimization, in a symmetric matrix space, of a Bregman divergence coupled with a regularization term. We address problems of this type within a general framework where the regularization term is split in two parts, one being a spectral function while the other is arbitrary. A Douglas-Rachford approach is proposed to address such problems and a list of proximity operators is provided allowing us to consider various choices for the fit-to-data functional and for the regularization term. Numerical experiments show the validity of this approach for solving convex optimization problems encountered in the context of sparse covariance matrix estimation. Based on our theoretical results, an algorithm is also proposed for noisy graphical lasso where a precision matrix has to be estimated in the presence of noise. The nonconvexity of the resulting objective function is dealt with using a majorization-minimization approach, i.e. by building a sequence of convex surrogates and solving the inner optimization subproblems via the aforementioned Douglas-Rachford procedure. We establish conditions for the convergence of this iterative scheme and we illustrate its good numerical performance with respect to state-of-the-art approaches.

Submitted 23 January, 2018; originally announced January 2018.

MSC Class: 15A18 15B48 62J10 65K10 90C06 90C25 90C26 90C35

arXiv:1712.09131 [pdf, other]

doi 10.1007/s10589-019-00060-6

A Random Block-Coordinate Douglas-Rachford Splitting Method with Low Computational Complexity for Binary Logistic Regression

Authors: Luis M. Briceno-Arias, Giovanni Chierchia, Emilie Chouzenoux, Jean-Christophe Pesquet

Abstract: In this paper, we propose a new optimization algorithm for sparse logistic regression based on a stochastic version of the Douglas-Rachford splitting method. Our algorithm sweeps the training set by randomly selecting a mini-batch of data at each iteration, and it allows us to update the variables in a block coordinate manner. Our approach leverages the proximity operator of the logistic loss, which is expressed with the generalized Lambert W function. Experiments carried out on standard datasets demonstrate the efficiency of our approach w.r.t. stochastic gradient-like methods.

Submitted 25 December, 2017; originally announced December 2017.
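
For readers unfamiliar with the Douglas-Rachford splitting that the abstract above builds on, the following is a minimal deterministic sketch in Python. It is not the authors' stochastic block-coordinate algorithm, and it uses neither the logistic loss nor the Lambert W function. As an assumption made purely for illustration, it minimizes 0.5*||x - a||^2 + lam*||x||_1, whose minimizer is known in closed form (soft-thresholding of a), so the iteration can be checked. The names `douglas_rachford`, `soft_threshold`, `gamma`, and `relax` are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximity operator of t * ||.||_1 (componentwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def douglas_rachford(a, lam, gamma=1.0, relax=1.0, n_iter=200):
    """Minimize f(x) + g(x) with f(x) = 0.5*||x - a||^2 and g(x) = lam*||x||_1.

    Illustrative toy problem only (an assumption, not the paper's setting).
    """
    a = np.asarray(a, dtype=float)
    x = np.zeros_like(a)
    for _ in range(n_iter):
        # prox of gamma * g at x
        y = soft_threshold(x, gamma * lam)
        # prox of gamma * f at the reflected point 2y - x (closed form for a quadratic)
        z = (2.0 * y - x + gamma * a) / (1.0 + gamma)
        # relaxed update of the auxiliary variable
        x = x + relax * (z - y)
    # the minimizer estimate is the prox of gamma * g at the final auxiliary point
    return soft_threshold(x, gamma * lam)

a = np.array([3.0, -0.5, 1.2, -4.0])
x_dr = douglas_rachford(a, lam=1.0)
x_closed = soft_threshold(a, 1.0)  # closed-form minimizer of the toy problem
```

Comparing `x_dr` with `x_closed` gives a quick sanity check of the iteration on this toy problem.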

arXiv:1709.06178 [pdf, other]

A Fast Algorithm Based on a Sylvester-like Equation for LS Regression with GMRF Prior

Authors: Qi Wei, Emilie Chouzenoux, Jean-Yves Tourneret, Jean-Christophe Pesquet

Abstract: This paper presents a fast approach for penalized least squares (LS) regression problems using a 2D Gaussian Markov random field (GMRF) prior. More precisely, the computation of the proximity operator of the LS criterion regularized by different GMRF potentials is formulated as solving a Sylvester-like matrix equation. By exploiting the structural properties of GMRFs, this matrix equation is solved columnwise in an analytical way. The proposed algorithm can be embedded into a wide range of proximal algorithms to solve LS regression problems including a convex penalty. Experiments carried out in the case of a constrained LS regression problem arising in a multichannel image processing application provide evidence that an alternating direction method of multipliers performs quite efficiently in this context.

Submitted 9 October, 2017; v1 submitted 18 September, 2017; originally announced September 2017.

arXiv:1707.09858 [pdf, other]

Spatially variant PSF modeling in confocal macroscopy

Authors: Anna Jezierska, Hugues Talbot, Jean-Christophe Pesquet, Gilbert Engler

Abstract: The point spread function (PSF) plays an essential role in image reconstruction. In the context of confocal microscopy, optical performance degrades towards the edge of the field of view owing to astigmatism, coma and vignetting. Thus, one should expect the related artifacts to be even stronger in macroscopy, where the field of view is much larger. The field aberrations in macroscopy fluorescence imaging systems were observed to be symmetrical and to increase with the distance from the center of the field of view. In this paper we propose an experiment and an optimization method for assessing the center of the field of view. The obtained results constitute a step towards reducing the number of parameters in the macroscopy PSF model.

Submitted 14 July, 2017; originally announced July 2017.

arXiv:1704.08083 [pdf, ps, other]

Stochastic Quasi-Fejér Block-Coordinate Fixed Point Iterations With Random Sweeping II: Mean-Square and Linear Convergence

Authors: Patrick L. Combettes, Jean-Christophe Pesquet

Abstract: Reference [11] investigated the almost sure weak convergence of block-coordinate fixed point algorithms and discussed their applications to nonlinear analysis and optimization. This algorithmic framework features random sweeping rules to select arbitrarily the blocks of variables that are activated over the course of the iterations and it allows for stochastic errors in the evaluation of the operators. The present paper establishes results on the mean-square and linear convergence of the iterates. Applications to monotone operator splitting and proximal optimization algorithms are presented.

Submitted 16 April, 2018; v1 submitted 26 April, 2017; originally announced April 2017.

arXiv:1702.08534 [pdf, ps, other]

doi 10.1109/TIP.2006.875178

Image Analysis Using a Dual-Tree $M$-Band Wavelet Transform

Authors: Caroline Chaux, Laurent Duval, Jean-Christophe Pesquet

Abstract: We propose a 2D generalization to the $M$-band case of the dual-tree decomposition structure (initially proposed by N. Kingsbury and further investigated by I. Selesnick) based on a Hilbert pair of wavelets. We particularly address (i) the construction of the dual basis and (ii) the resulting directional analysis. We also revisit the necessary pre-processing stage in the $M$-band case. While several reconstructions are possible because of the redundancy of the representation, we propose a new optimal signal reconstruction technique, which minimizes potential estimation errors. The effectiveness of the proposed $M$-band decomposition is demonstrated via denoising comparisons on several image types (natural, texture, seismics), with various $M$-band wavelets and thresholding strategies. Significant improvements in terms of both overall noise reduction and direction preservation are observed.

Submitted 27 February, 2017; originally announced February 2017.

Journal ref: IEEE Transactions on Image Processing, August 2006, Volume 15, Issue 8, p. 2397-2412

arXiv:1610.07519 [pdf, ps, other]

A Variational Bayesian Approach for Image Restoration. Application to Image Deblurring with Poisson-Gaussian Noise

Authors: Yosra Marnissi, Yuling Zheng, Emilie Chouzenoux, Jean-Christophe Pesquet

Abstract: In this paper, a methodology is investigated for signal recovery in the presence of non-Gaussian noise. In contrast with regularized minimization approaches often adopted in the literature, in our algorithm the regularization parameter is reliably estimated from the observations. As the posterior density of the unknown parameters is analytically intractable, the estimation problem is derived in a variational Bayesian framework where the goal is to provide a good approximation to the posterior distribution in order to compute posterior mean estimates. Moreover, a majorization technique is employed to circumvent the difficulties raised by the intricate forms of the non-Gaussian likelihood and of the prior density. We demonstrate the potential of the proposed approach through comparisons with state-of-the-art techniques that are specifically tailored to signal recovery in the presence of mixed Poisson-Gaussian noise. Results show that the proposed approach is efficient and achieves performance comparable with other methods where the regularization parameter is manually tuned from the ground truth.

Submitted 20 January, 2017; v1 submitted 24 October, 2016; originally announced October 2016.

arXiv:1606.09552 [pdf, other]

doi 10.1109/TIT.2017.2782789

Proximity Operators of Discrete Information Divergences

Authors: Mireille El Gheche, Giovanni Chierchia, Jean-Christophe Pesquet

Abstract: Information divergences allow one to assess how close two distributions are to each other. Among the large panel of available measures, special attention has been paid to convex $\varphi$-divergences, such as Kullback-Leibler, Jeffreys-Kullback, Hellinger, Chi-Square, Renyi, and $I_\alpha$ divergences. While $\varphi$-divergences have been extensively studied in convex analysis, their use in optimization problems often remains challenging. In this regard, one of the main shortcomings of existing methods is that the minimization of $\varphi$-divergences is usually performed with respect to one of their arguments, possibly within alternating optimization techniques. In this paper, we overcome this limitation by deriving new closed-form expressions for the proximity operator of such two-variable functions. This makes it possible to employ standard proximal methods for efficiently solving a wide range of convex optimization problems involving $\varphi$-divergences. In addition, we show that these proximity operators are useful to compute the epigraphical projection of several functions of practical interest. The proposed proximal tools are numerically validated in the context of optimal query execution within database management systems, where the problem of selectivity estimation plays a central role. Experiments are carried out on small to large scale scenarios.

Submitted 26 April, 2017; v1 submitted 30 June, 2016; originally announced June 2016.

arXiv:1603.07301 [pdf, ps, other]

doi 10.1109/LSP.2016.2593589

Convergence Rate Analysis of the Majorize-Minimize Subspace Algorithm -- Extended Version

Authors: Emilie Chouzenoux, Jean-Christophe Pesquet

Abstract: State-of-the-art methods for solving smooth optimization problems are nonlinear conjugate gradient, low memory BFGS, and Majorize-Minimize (MM) subspace algorithms. The MM subspace algorithm, which has been introduced more recently, has shown good practical performance when compared with other methods on various optimization problems arising in signal and image processing. However, to the best of our knowledge, no general result exists concerning the theoretical convergence rate of the MM subspace algorithm. This paper aims at deriving such convergence rates both for batch and online versions of the algorithm and, in particular, discusses the influence of the choice of the subspace.

Submitted 19 July, 2016; v1 submitted 23 March, 2016; originally announced March 2016.

Comments: 14 pages

arXiv:1602.08021 [pdf, other]

Stochastic forward-backward and primal-dual approximation algorithms with application to online image restoration

Authors: Patrick L. Combettes, Jean-Christophe Pesquet

Abstract: Stochastic approximation techniques have been used in various contexts in data science. We propose a stochastic version of the forward-backward algorithm for minimizing the sum of two convex functions, one of which is not necessarily smooth. Our framework can handle stochastic approximations of the gradient of the smooth function and allows for stochastic errors in the evaluation of the proximity operator of the nonsmooth function. The almost sure convergence of the iterates generated by the algorithm to a minimizer is established under relatively mild assumptions. We also propose a stochastic version of a popular primal-dual proximal splitting algorithm, establish its convergence, and apply it to an online image restoration problem.

Submitted 25 February, 2016; originally announced February 2016.

Comments: 5 Figures

MSC Class: 90C25; 90C15; 94A08
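
The forward-backward iteration that the abstract above randomizes can be sketched in its basic deterministic form. This is an illustration under stated assumptions, not the authors' stochastic algorithm: the smooth function is taken to be 0.5*||A x - b||^2, the nonsmooth one is lam*||x||_1, and the gradient and proximity operator are evaluated exactly. The names `forward_backward`, `soft_threshold`, and `objective` are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximity operator of t * ||.||_1 (componentwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(A, b, lam, x):
    """Value of 0.5*||A x - b||^2 + lam*||x||_1."""
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))

def forward_backward(A, b, lam, n_iter=500):
    """Deterministic forward-backward (proximal gradient) iteration.

    Assumed toy problem: minimize 0.5*||A x - b||^2 + lam*||x||_1.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    lipschitz = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the gradient
    gamma = 1.0 / lipschitz                # step size
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)                           # forward (gradient) step
        x = soft_threshold(x - gamma * grad, gamma * lam)  # backward (proximal) step
    return x

# With A equal to the identity, the minimizer is soft_threshold(b, lam).
b_id = np.array([2.0, -0.3, -1.5])
x_id = forward_backward(np.eye(3), b_id, lam=0.5)

# A random instance: the objective should not exceed its value at the start point.
rng = np.random.default_rng(0)
A_rand = rng.standard_normal((20, 5))
b_rand = rng.standard_normal(20)
x_rand = forward_backward(A_rand, b_rand, lam=0.1)
```

A stochastic variant in the spirit of the abstract would replace `grad` with a noisy or mini-batch estimate; that step is left out here because the conditions under which it converges are the subject of the paper itself.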

arXiv:1601.04026 [pdf, other]

doi 10.1093/mnras/stw1859

Scalable splitting algorithms for big-data interferometric imaging in the SKA era

Authors: Alexandru Onose, Rafael E. Carrillo, Audrey Repetti, Jason D. McEwen, Jean-Philippe Thiran, Jean-Christophe Pesquet, Yves Wiaux

Abstract: In the context of next generation radio telescopes, like the Square Kilometre Array, the efficient processing of large-scale datasets is extremely important. Convex optimisation tasks under the compressive sensing framework have recently emerged and provide both enhanced image reconstruction quality and scalability to increasingly larger data sets. We focus herein mainly on scalability and propose two new convex optimisation algorithmic structures able to solve the convex optimisation tasks arising in radio-interferometric imaging. They rely on proximal splitting and forward-backward iterations and can be seen, by analogy with the CLEAN major-minor cycle, as running sophisticated CLEAN-like iterations in parallel in multiple data, prior, and image spaces. Both methods support any convex regularisation function, in particular the well studied l1 priors promoting image sparsity in an adequate domain. Tailored for big-data, they employ parallel and distributed computations to achieve scalability, in terms of memory and computational requirements. One of them also exploits randomisation, over data blocks at each iteration, offering further flexibility. We present simulation results showing the feasibility of the proposed methods as well as their advantages compared to state-of-the-art algorithmic solvers. Our Matlab code is available online on GitHub.

Submitted 9 August, 2016; v1 submitted 15 January, 2016; originally announced January 2016.

Comments: Monthly Notices of the Royal Astronomical Society (2016)

arXiv:1507.07095 [pdf, ps, other]

Stochastic Approximations and Perturbations in Forward-Backward Splitting for Monotone Operators

Authors: Patrick L. Combettes, Jean-Christophe Pesquet

Abstract: We investigate the asymptotic behavior of a stochastic version of the forward-backward splitting algorithm for finding a zero of the sum of a maximally monotone set-valued operator and a cocoercive operator in Hilbert spaces. Our general setting features stochastic approximations of the cocoercive operator and stochastic perturbations in the evaluation of the resolvents of the set-valued operator. In addition, relaxations and not necessarily vanishing proximal parameters are allowed. Weak and strong almost sure convergence properties of the iterates are established under mild conditions on the underlying stochastic processes. Leveraging these results, we also establish the almost sure convergence of the iterates of a stochastic variant of a primal-dual proximal splitting method for composite minimization problems.

Submitted 25 July, 2015; originally announced July 2015.

MSC Class: Primary 47H05; Secondary 65K05; 90C25; 94A08

arXiv:1505.00273 [pdf, ps, other]

doi 10.1109/JSTSP.2015.2496908

A Survey of Stochastic Simulation and Optimization Methods in Signal Processing

Authors: Marcelo Pereyra, Philip Schniter, Emilie Chouzenoux, Jean-Christophe Pesquet, Jean-Yves Tourneret, Alfred Hero, Steve McLaughlin

Abstract: Modern signal processing (SP) methods rely very heavily on probability and statistics to solve challenging SP problems. SP methods are now expected to deal with ever more complex models, requiring ever more sophisticated computational inference techniques. This has driven the development of statistical SP methods based on stochastic simulation and optimization. Stochastic simulation and optimization algorithms are computationally intensive tools for performing statistical inference in models that are analytically intractable and beyond the scope of deterministic inference methods. They have been recently successfully applied to many difficult problems involving complex statistical models and sophisticated (often Bayesian) statistical inference techniques. This survey paper offers an introduction to stochastic simulation and optimization methods in signal and image processing. The paper addresses a variety of high-dimensional Markov chain Monte Carlo (MCMC) methods as well as deterministic surrogate methods, such as variational Bayes, the Bethe approach, belief and expectation propagation and approximate message passing algorithms. It also discusses a range of optimization methods that have been adopted to solve stochastic problems, as well as stochastic methods for deterministic optimization. Subsequently, areas of overlap between simulation and optimization, in particular optimization-within-MCMC and MCMC-driven optimization are discussed.

Submitted 20 November, 2015; v1 submitted 1 May, 2015; originally announced May 2015.

Comments: To appear in the IEEE Journal of Selected Topics in Signal Processing special issue on Stochastic Simulation and Optimisation in Signal Processing, March 2016

arXiv:1501.03669 [pdf, other]

A Proximal Approach for Sparse Multiclass SVM

Authors: G. Chierchia, Nelly Pustelnik, Jean-Christophe Pesquet, B. Pesquet-Popescu

Abstract: Sparsity-inducing penalties are useful tools to design multiclass support vector machines (SVMs). In this paper, we propose a convex optimization approach for efficiently and exactly solving the multiclass SVM learning problem involving a sparse regularization and the multiclass hinge loss formulated by Crammer and Singer. We provide two algorithms: the first one dealing with the hinge loss as a penalty term, and the other one addressing the case when the hinge loss is enforced through a constraint. The related convex optimization problems can be efficiently solved thanks to the flexibility offered by recent primal-dual proximal algorithms and epigraphical splitting techniques. Experiments carried out on several datasets demonstrate the interest of considering the exact expression of the hinge loss rather than a smooth approximation. The efficiency of the proposed algorithms w.r.t. several state-of-the-art methods is also assessed through comparisons of execution times.

Submitted 14 December, 2015; v1 submitted 15 January, 2015; originally announced January 2015.

arXiv:1407.5465 [pdf, ps, other]

doi 10.1109/LSP.2014.2362861

Euclid in a Taxicab: Sparse Blind Deconvolution with Smoothed l1/l2 Regularization

Authors: Audrey Repetti, Mai Quyen Pham, Laurent Duval, Emilie Chouzenoux, Jean-Christophe Pesquet

Abstract: The l1/l2 ratio regularization function has shown good performance for retrieving sparse signals in a number of recent works, in the context of blind deconvolution. Indeed, it benefits from a scale invariance property that is highly desirable in the blind context. However, the l1/l2 function raises some difficulties when solving the nonconvex and nonsmooth minimization problems resulting from the use of such a penalty term in current restoration methods. In this paper, we propose a new penalty based on a smooth approximation to the l1/l2 function. In addition, we develop a proximal-based algorithm to solve variational problems involving this function and we derive theoretical convergence results. We demonstrate the effectiveness of our method through a comparison with a recent alternating optimization strategy dealing with the exact l1/l2 term, on an application to seismic data blind deconvolution.

Submitted 8 November, 2014; v1 submitted 21 July, 2014; originally announced July 2014.

Comments: 5 pages

Journal ref: IEEE Signal Processing Letters, May 2015, Volume 22, Number 5, pages 539-543
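
The scale invariance mentioned in the abstract above is easy to check numerically. The sketch below uses the exact l1/l2 ratio, not the smoothed penalty proposed in the paper (whose precise form is not given in this listing); the function name `l1_over_l2` and the test vectors are hypothetical.

```python
import numpy as np

def l1_over_l2(x):
    """Exact l1/l2 ratio of a nonzero vector (not the paper's smoothed version)."""
    x = np.asarray(x, dtype=float)
    return np.sum(np.abs(x)) / np.linalg.norm(x)

x_sparse = np.array([0.0, 3.0, 0.0, -4.0])
ratio = l1_over_l2(x_sparse)                # (3 + 4) / 5
ratio_scaled = l1_over_l2(10.0 * x_sparse)  # unchanged by rescaling
ratio_dense = l1_over_l2(np.ones(4))        # 4 / 2, larger for a dense vector
```

Because the ratio ignores the overall amplitude, it does not favor shrinking the signal, which is the ambiguity a blind deconvolution problem would otherwise leave open between the signal and the blur kernel.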

arXiv:1406.6404 [pdf, ps, other]

A Class of Randomized Primal-Dual Algorithms for Distributed Optimization

Authors: Jean-Christophe Pesquet, Audrey Repetti

Abstract: Based on a preconditioned version of the randomized block-coordinate forward-backward algorithm recently proposed in [Combettes, Pesquet, 2014], several variants of block-coordinate primal-dual algorithms are designed in order to solve a wide array of monotone inclusion problems. These methods rely on a sweep of blocks of variables which are activated at each iteration according to a random rule, and they allow stochastic errors in the evaluation of the involved operators. Then, this framework is employed to derive block-coordinate primal-dual proximal algorithms for solving composite convex variational problems. The resulting algorithm implementations may be useful for reducing computational complexity and memory requirements. Furthermore, we show that the proposed approach can be used to develop novel asynchronous distributed primal-dual algorithms in a multi-agent context.

Submitted 25 October, 2014; v1 submitted 24 June, 2014; originally announced June 2014.

MSC Class: 47H05; 49M29; 49M27; 65K10; 90C25

arXiv:1406.5439 [pdf, ps, other]

A forward-backward view of some primal-dual optimization methods in image recovery

Authors: Patrick L. Combettes, Laurent Condat, Jean-Christophe Pesquet, Bang Cong Vu

Abstract: A wide array of image recovery problems can be abstracted into the problem of minimizing a sum of composite convex functions in a Hilbert space. To solve such problems, primal-dual proximal approaches have been developed which provide efficient solutions to large-scale optimization problems. The objective of this paper is to show that a number of existing algorithms can be derived from a general form of the forward-backward algorithm applied in a suitable product space. Our approach also allows us to develop useful extensions of existing algorithms by introducing a variable metric. An illustration to image restoration is provided.

Submitted 20 June, 2014; originally announced June 2014.

arXiv:1406.5429 [pdf, ps, other]

Playing with Duality: An Overview of Recent Primal-Dual Approaches for Solving Large-Scale Optimization Problems

Authors: Nikos Komodakis, Jean-Christophe Pesquet

Abstract: Optimization methods are at the core of many problems in signal/image processing, computer vision, and machine learning. For a long time, it has been recognized that looking at the dual of an optimization problem may drastically simplify its solution. Deriving efficient strategies which jointly bring into play the primal and the dual problems is however a more recent idea which has generated many important new contributions in recent years. These novel developments are grounded on recent advances in convex analysis, discrete optimization, parallel processing, and non-smooth optimization with emphasis on sparsity issues. In this paper, we aim at presenting the principles of primal-dual approaches, while giving an overview of numerical methods which have been proposed in different contexts. We show the benefits which can be drawn from primal-dual algorithms both for solving large-scale convex optimization problems and discrete ones, and we provide various application examples to illustrate their usefulness.

Submitted 3 December, 2014; v1 submitted 20 June, 2014; originally announced June 2014.

ACM Class: G.1.6; I.4; I.5

Showing 1–50 of 70 results for author: Pesquet, J