Search | arXiv e-print repository

Tractable MCMC for Private Learning with Pure and Gaussian Differential Privacy

Authors: Yingyu Lin, Yi-An Ma, Yu-Xiang Wang, Rachel Redberg, Zhiqi Bu

Abstract: Posterior sampling, i.e., exponential mechanism to sample from the posterior distribution, provides $\varepsilon$-pure differential privacy (DP) guarantees and does not suffer from potentially unbounded privacy breach introduced by $(\varepsilon,δ)$-approximate DP. In practice, however, one needs to apply approximate sampling methods such as Markov chain Monte Carlo (MCMC), thus re-introducing the… ▽ More Posterior sampling, i.e., exponential mechanism to sample from the posterior distribution, provides $\varepsilon$-pure differential privacy (DP) guarantees and does not suffer from potentially unbounded privacy breach introduced by $(\varepsilon,δ)$-approximate DP. In practice, however, one needs to apply approximate sampling methods such as Markov chain Monte Carlo (MCMC), thus re-introducing the unappealing $δ$-approximation error into the privacy guarantees. To bridge this gap, we propose the Approximate SAample Perturbation (abbr. ASAP) algorithm which perturbs an MCMC sample with noise proportional to its Wasserstein-infinity ($W_\infty$) distance from a reference distribution that satisfies pure DP or pure Gaussian DP (i.e., $δ=0$). We then leverage a Metropolis-Hastings algorithm to generate the sample and prove that the algorithm converges in $W_\infty$ distance. We show that by combining our new techniques with a localization step, we obtain the first nearly linear-time algorithm that achieves the optimal rates in the DP-ERM problem with strongly convex and smooth losses. △ Less

Submitted 1 May, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

arXiv:2305.01794 [pdf, other]

MISNN: Multiple Imputation via Semi-parametric Neural Networks

Authors: Zhiqi Bu, Zongyu Dai, Yiliang Zhang, Qi Long

Abstract: Multiple imputation (MI) has been widely applied to missing value problems in biomedical, social and econometric research, in order to avoid improper inference in the downstream data analysis. In the presence of high-dimensional data, imputation models that include feature selection, especially $\ell_1$ regularized regression (such as Lasso, adaptive Lasso, and Elastic Net), are common choices to… ▽ More Multiple imputation (MI) has been widely applied to missing value problems in biomedical, social and econometric research, in order to avoid improper inference in the downstream data analysis. In the presence of high-dimensional data, imputation models that include feature selection, especially $\ell_1$ regularized regression (such as Lasso, adaptive Lasso, and Elastic Net), are common choices to prevent the model from underdetermination. However, conducting MI with feature selection is difficult: existing methods are often computationally inefficient and poor in performance. We propose MISNN, a novel and efficient algorithm that incorporates feature selection for MI. Leveraging the approximation power of neural networks, MISNN is a general and flexible framework, compatible with any feature selection method, any neural network architecture, high/low-dimensional data and general missing patterns. Through empirical experiments, MISNN has demonstrated great advantages over state-of-the-art imputation methods (e.g. Bayesian Lasso and matrix completion), in terms of imputation accuracy, statistical consistency and computation speed. △ Less

Submitted 2 May, 2023; originally announced May 2023.

arXiv:2211.13297 [pdf, other]

Multiple Imputation with Neural Network Gaussian Process for High-dimensional Incomplete Data

Authors: Zongyu Dai, Zhiqi Bu, Qi Long

Abstract: Missing data are ubiquitous in real world applications and, if not adequately handled, may lead to the loss of information and biased findings in downstream analysis. Particularly, high-dimensional incomplete data with a moderate sample size, such as analysis of multi-omics data, present daunting challenges. Imputation is arguably the most popular method for handling missing data, though existing… ▽ More Missing data are ubiquitous in real world applications and, if not adequately handled, may lead to the loss of information and biased findings in downstream analysis. Particularly, high-dimensional incomplete data with a moderate sample size, such as analysis of multi-omics data, present daunting challenges. Imputation is arguably the most popular method for handling missing data, though existing imputation methods have a number of limitations. Single imputation methods such as matrix completion methods do not adequately account for imputation uncertainty and hence would yield improper statistical inference. In contrast, multiple imputation (MI) methods allow for proper inference but existing methods do not perform well in high-dimensional settings. Our work aims to address these significant methodological gaps, leveraging recent advances in neural network Gaussian process (NNGP) from a Bayesian viewpoint. We propose two NNGP-based MI methods, namely MI-NNGP, that can apply multiple imputations for missing values from a joint (posterior predictive) distribution. The MI-NNGP methods are shown to significantly outperform existing state-of-the-art methods on synthetic and real datasets, in terms of imputation error, statistical inference, robustness to missing rates, and computation costs, under three missing data mechanisms, MCAR, MAR, and MNAR. △ Less

Submitted 21 December, 2022; v1 submitted 23 November, 2022; originally announced November 2022.

arXiv:2207.00306 [pdf, other]

CEDAR: Communication Efficient Distributed Analysis for Regressions

Authors: Changgee Chang, Zhiqi Bu, Qi Long

Abstract: Electronic health records (EHRs) offer great promises for advancing precision medicine and, at the same time, present significant analytical challenges. Particularly, it is often the case that patient-level data in EHRs cannot be shared across institutions (data sources) due to government regulations and/or institutional policies. As a result, there are growing interests about distributed learning… ▽ More Electronic health records (EHRs) offer great promises for advancing precision medicine and, at the same time, present significant analytical challenges. Particularly, it is often the case that patient-level data in EHRs cannot be shared across institutions (data sources) due to government regulations and/or institutional policies. As a result, there are growing interests about distributed learning over multiple EHRs databases without sharing patient-level data. To tackle such challenges, we propose a novel communication efficient method that aggregates the local optimal estimates, by turning the problem into a missing data problem. In addition, we propose incorporating posterior samples of remote sites, which can provide partial information on the missing quantities and improve efficiency of parameter estimates while having the differential privacy property and thus reducing the risk of information leaking. The proposed approach, without sharing the raw patient level data, allows for proper statistical inference and can accommodate sparse regressions. We provide theoretical investigation for the asymptotic properties of the proposed method for statistical inference as well as differential privacy, and evaluate its performance in simulations and real data analyses in comparison with several recently developed methods. △ Less

Submitted 1 July, 2022; originally announced July 2022.

arXiv:2202.12482 [pdf, other]

Sparse Neural Additive Model: Interpretable Deep Learning with Feature Selection via Group Sparsity

Authors: Shiyun Xu, Zhiqi Bu, Pratik Chaudhari, Ian J. Barnett

Abstract: Interpretable machine learning has demonstrated impressive performance while preserving explainability. In particular, neural additive models (NAM) offer the interpretability to the black-box deep learning and achieve state-of-the-art accuracy among the large family of generalized additive models. In order to empower NAM with feature selection and improve the generalization, we propose the sparse… ▽ More Interpretable machine learning has demonstrated impressive performance while preserving explainability. In particular, neural additive models (NAM) offer the interpretability to the black-box deep learning and achieve state-of-the-art accuracy among the large family of generalized additive models. In order to empower NAM with feature selection and improve the generalization, we propose the sparse neural additive models (SNAM) that employ the group sparsity regularization (e.g. Group LASSO), where each feature is learned by a sub-network whose trainable parameters are clustered as a group. We study the theoretical properties for SNAM with novel techniques to tackle the non-parametric truth, thus extending from classical sparse linear models such as the LASSO, which only works on the parametric truth. Specifically, we show that SNAM with subgradient and proximal gradient descents provably converges to zero training loss as $t\to\infty$, and that the estimation error of SNAM vanishes asymptotically as $n\to\infty$. We also prove that SNAM, similar to LASSO, can have exact support recovery, i.e. perfect feature selection, with appropriate regularization. Moreover, we show that the SNAM can generalize well and preserve the `identifiability', recovering each feature's effect. We validate our theories via extensive experiments and further testify to the good accuracy and efficiency of SNAM. △ Less

Submitted 24 February, 2022; originally announced February 2022.

arXiv:2112.11507 [pdf, other]

Multiple Imputation via Generative Adversarial Network for High-dimensional Blockwise Missing Value Problems

Authors: Zongyu Dai, Zhiqi Bu, Qi Long

Abstract: Missing data are present in most real world problems and need careful handling to preserve the prediction accuracy and statistical consistency in the downstream analysis. As the gold standard of handling missing data, multiple imputation (MI) methods are proposed to account for the imputation uncertainty and provide proper statistical inference. In this work, we propose Multiple Imputation via G… ▽ More Missing data are present in most real world problems and need careful handling to preserve the prediction accuracy and statistical consistency in the downstream analysis. As the gold standard of handling missing data, multiple imputation (MI) methods are proposed to account for the imputation uncertainty and provide proper statistical inference. In this work, we propose Multiple Imputation via Generative Adversarial Network (MI-GAN), a deep learning-based (in specific, a GAN-based) multiple imputation method, that can work under missing at random (MAR) mechanism with theoretical support. MI-GAN leverages recent progress in conditional generative adversarial neural works and shows strong performance matching existing state-of-the-art imputation methods on high-dimensional datasets, in terms of imputation error. In particular, MI-GAN significantly outperforms other imputation methods in the sense of statistical inference and computational speed. △ Less

Submitted 21 December, 2021; originally announced December 2021.

arXiv:2107.08461 [pdf, other]

Differentially Private Bayesian Neural Networks on Accuracy, Privacy and Reliability

Authors: Qiyiwen Zhang, Zhiqi Bu, Kan Chen, Qi Long

Abstract: Bayesian neural network (BNN) allows for uncertainty quantification in prediction, offering an advantage over regular neural networks that has not been explored in the differential privacy (DP) framework. We fill this important gap by leveraging recent development in Bayesian deep learning and privacy accounting to offer a more precise analysis of the trade-off between privacy and accuracy in BNN.… ▽ More Bayesian neural network (BNN) allows for uncertainty quantification in prediction, offering an advantage over regular neural networks that has not been explored in the differential privacy (DP) framework. We fill this important gap by leveraging recent development in Bayesian deep learning and privacy accounting to offer a more precise analysis of the trade-off between privacy and accuracy in BNN. We propose three DP-BNNs that characterize the weight uncertainty for the same network architecture in distinct ways, namely DP-SGLD (via the noisy gradient method), DP-BBP (via changing the parameters of interest) and DP-MC Dropout (via the model architecture). Interestingly, we show a new equivalence between DP-SGD and DP-SGLD, implying that some non-Bayesian DP training naturally allows for uncertainty quantification. However, the hyperparameters such as learning rate and batch size, can have different or even opposite effects in DP-SGD and DP-SGLD. Extensive experiments are conducted to compare DP-BNNs, in terms of privacy guarantee, prediction accuracy, uncertainty quantification, calibration, computation speed, and generalizability to network architecture. As a result, we observe a new tradeoff between the privacy and the reliability. When compared to non-DP and non-Bayesian approaches, DP-SGLD is remarkably accurate under strong privacy guarantee, demonstrating the great potential of DP-BNN in real-world tasks. △ Less

Submitted 18 February, 2023; v1 submitted 18 July, 2021; originally announced July 2021.

arXiv:2107.01266 [pdf, other]

Asymptotic Statistical Analysis of Sparse Group LASSO via Approximate Message Passing Algorithm

Authors: Kan Chen, Zhiqi Bu, Shiyun Xu

Abstract: Sparse Group LASSO (SGL) is a regularized model for high-dimensional linear regression problems with grouped covariates. SGL applies $l_1$ and $l_2$ penalties on the individual predictors and group predictors, respectively, to guarantee sparse effects both on the inter-group and within-group levels. In this paper, we apply the approximate message passing (AMP) algorithm to efficiently solve the SG… ▽ More Sparse Group LASSO (SGL) is a regularized model for high-dimensional linear regression problems with grouped covariates. SGL applies $l_1$ and $l_2$ penalties on the individual predictors and group predictors, respectively, to guarantee sparse effects both on the inter-group and within-group levels. In this paper, we apply the approximate message passing (AMP) algorithm to efficiently solve the SGL problem under Gaussian random designs. We further use the recently developed state evolution analysis of AMP to derive an asymptotically exact characterization of SGL solution. This allows us to conduct multiple fine-grained statistical analyses of SGL, through which we investigate the effects of the group information and $γ$ (proportion of $\ell_1$ penalty). With the lens of various performance measures, we show that SGL with small $γ$ benefits significantly from the group information and can outperform other SGL (including LASSO) or regularized models which do not exploit the group information, in terms of the recovery rate of signal, false discovery rate and mean squared error. △ Less

Submitted 21 February, 2022; v1 submitted 2 July, 2021; originally announced July 2021.

Journal ref: Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, Cham, 2021

arXiv:2106.11767 [pdf, other]

Privacy Amplification via Iteration for Shuffled and Online PNSGD

Authors: Matteo Sordello, Zhiqi Bu, **shuo Dong

Abstract: In this paper, we consider the framework of privacy amplification via iteration, which is originally proposed by Feldman et al. and subsequently simplified by Asoodeh et al. in their analysis via the contraction coefficient. This line of work focuses on the study of the privacy guarantees obtained by the projected noisy stochastic gradient descent (PNSGD) algorithm with hidden intermediate updates… ▽ More In this paper, we consider the framework of privacy amplification via iteration, which is originally proposed by Feldman et al. and subsequently simplified by Asoodeh et al. in their analysis via the contraction coefficient. This line of work focuses on the study of the privacy guarantees obtained by the projected noisy stochastic gradient descent (PNSGD) algorithm with hidden intermediate updates. A limitation in the existing literature is that only the early stopped PNSGD has been studied, while no result has been proved on the more widely-used PNSGD applied on a shuffled dataset. Moreover, no scheme has been yet proposed regarding how to decrease the injected noise when new data are received in an online fashion. In this work, we first prove a privacy guarantee for shuffled PNSGD, which is investigated asymptotically when the noise is fixed for each sample size $n$ but reduced at a predetermined rate when $n$ increases, in order to achieve the convergence of privacy loss. We then analyze the online setting and provide a faster decaying scheme for the magnitude of the injected noise that also guarantees the convergence of privacy loss. △ Less

Submitted 20 June, 2021; originally announced June 2021.

arXiv:2106.07830 [pdf, other]

On the Convergence and Calibration of Deep Learning with Differential Privacy

Authors: Zhiqi Bu, Hua Wang, Zongyu Dai, Qi Long

Abstract: Differentially private (DP) training preserves the data privacy usually at the cost of slower convergence (and thus lower accuracy), as well as more severe mis-calibration than its non-private counterpart. To analyze the convergence of DP training, we formulate a continuous time analysis through the lens of neural tangent kernel (NTK), which characterizes the per-sample gradient clip** and the n… ▽ More Differentially private (DP) training preserves the data privacy usually at the cost of slower convergence (and thus lower accuracy), as well as more severe mis-calibration than its non-private counterpart. To analyze the convergence of DP training, we formulate a continuous time analysis through the lens of neural tangent kernel (NTK), which characterizes the per-sample gradient clip** and the noise addition in DP training, for arbitrary network architectures and loss functions. Interestingly, we show that the noise addition only affects the privacy risk but not the convergence or calibration, whereas the per-sample gradient clip** (under both flat and layerwise clip** styles) only affects the convergence and calibration. Furthermore, we observe that while DP models trained with small clip** norm usually achieve the best accurate, but are poorly calibrated and thus unreliable. In sharp contrast, DP models trained with large clip** norm enjoy the same privacy guarantee and similar accuracy, but are significantly more \textit{calibrated}. Our code can be found at \url{https://github.com/woodyx218/opacus_global_clip**}. △ Less

Submitted 19 June, 2023; v1 submitted 14 June, 2021; originally announced June 2021.

arXiv:2105.13302 [pdf, other]

Characterizing the SLOPE Trade-off: A Variational Perspective and the Donoho-Tanner Limit

Authors: Zhiqi Bu, Jason Klusowski, Cynthia Rush, Weijie J. Su

Abstract: Sorted l1 regularization has been incorporated into many methods for solving high-dimensional statistical estimation problems, including the SLOPE estimator in linear regression. In this paper, we study how this relatively new regularization technique improves variable selection by characterizing the optimal SLOPE trade-off between the false discovery proportion (FDP) and true positive proportion… ▽ More Sorted l1 regularization has been incorporated into many methods for solving high-dimensional statistical estimation problems, including the SLOPE estimator in linear regression. In this paper, we study how this relatively new regularization technique improves variable selection by characterizing the optimal SLOPE trade-off between the false discovery proportion (FDP) and true positive proportion (TPP) or, equivalently, between measures of type I error and power. Assuming a regime of linear sparsity and working under Gaussian random designs, we obtain an upper bound on the optimal trade-off for SLOPE, showing its capability of breaking the Donoho-Tanner power limit. To put it into perspective, this limit is the highest possible power that the Lasso, which is perhaps the most popular l1-based method, can achieve even with arbitrarily strong effect sizes. Next, we derive a tight lower bound that delineates the fundamental limit of sorted l1 regularization in optimally trading the FDP off for the TPP. Finally, we show that on any problem instance, SLOPE with a certain regularization sequence outperforms the Lasso, in the sense of having a smaller FDP, larger TPP and smaller l2 estimation risk simultaneously. Our proofs are based on a novel technique that reduces a calculus of variations problem to a class of infinite-dimensional convex optimization problems and a very recent result from approximate message passing theory. △ Less

Submitted 5 June, 2022; v1 submitted 27 May, 2021; originally announced May 2021.

Journal ref: Annals of Statistics 2022

arXiv:2102.07211 [pdf, other]

Efficient Designs of SLOPE Penalty Sequences in Finite Dimension

Authors: Yiliang Zhang, Zhiqi Bu

Abstract: In linear regression, SLOPE is a new convex analysis method that generalizes the Lasso via the sorted L1 penalty: larger fitted coefficients are penalized more heavily. This magnitude-dependent regularization requires an input of penalty sequence $λ$, instead of a scalar penalty as in the Lasso case, thus making the design extremely expensive in computation. In this paper, we propose two efficient… ▽ More In linear regression, SLOPE is a new convex analysis method that generalizes the Lasso via the sorted L1 penalty: larger fitted coefficients are penalized more heavily. This magnitude-dependent regularization requires an input of penalty sequence $λ$, instead of a scalar penalty as in the Lasso case, thus making the design extremely expensive in computation. In this paper, we propose two efficient algorithms to design the possibly high-dimensional SLOPE penalty, in order to minimize the mean squared error. For Gaussian data matrices, we propose a first order Projected Gradient Descent (PGD) under the Approximate Message Passing regime. For general data matrices, we present a zero-th order Coordinate Descent (CD) to design a sub-class of SLOPE, referred to as the k-level SLOPE. Our CD allows a useful trade-off between the accuracy and the computation speed. We demonstrate the performance of SLOPE with our designs via extensive experiments on synthetic data and real-world datasets. △ Less

Submitted 12 December, 2021; v1 submitted 14 February, 2021; originally announced February 2021.

Comments: Accepted to AISTATS 2021

arXiv:2011.00417 [pdf, other]

DebiNet: Debiasing Linear Models with Nonlinear Overparameterized Neural Networks

Authors: Shiyun Xu, Zhiqi Bu

Abstract: Recent years have witnessed strong empirical performance of over-parameterized neural networks on various tasks and many advances in the theory, e.g. the universal approximation and provable convergence to global minimum. In this paper, we incorporate over-parameterized neural networks into semi-parametric models to bridge the gap between inference and prediction, especially in the high dimensiona… ▽ More Recent years have witnessed strong empirical performance of over-parameterized neural networks on various tasks and many advances in the theory, e.g. the universal approximation and provable convergence to global minimum. In this paper, we incorporate over-parameterized neural networks into semi-parametric models to bridge the gap between inference and prediction, especially in the high dimensional linear problem. By doing so, we can exploit a wide class of networks to approximate the nuisance functions and to estimate the parameters of interest consistently. Therefore, we may offer the best of two worlds: the universal approximation ability from neural networks and the interpretability from classic ordinary linear model, leading to both valid inference and accurate prediction. We show the theoretical foundations that make this possible and demonstrate with numerical experiments. Furthermore, we propose a framework, DebiNet, in which we plug-in arbitrary feature selection methods to our semi-parametric neural network. DebiNet can debias the regularized estimators (e.g. Lasso) and perform well, in terms of the post-selection inference and the generalization error. △ Less

Submitted 24 January, 2021; v1 submitted 1 November, 2020; originally announced November 2020.

arXiv:2010.13165 [pdf, other]

A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks

Authors: Zhiqi Bu, Shiyun Xu, Kan Chen

Abstract: When equipped with efficient optimization algorithms, the over-parameterized neural networks have demonstrated high level of performance even though the loss function is non-convex and non-smooth. While many works have been focusing on understanding the loss dynamics by training neural networks with the gradient descent (GD), in this work, we consider a broad class of optimization algorithms that… ▽ More When equipped with efficient optimization algorithms, the over-parameterized neural networks have demonstrated high level of performance even though the loss function is non-convex and non-smooth. While many works have been focusing on understanding the loss dynamics by training neural networks with the gradient descent (GD), in this work, we consider a broad class of optimization algorithms that are commonly used in practice. For example, we show from a dynamical system perspective that the Heavy Ball (HB) method can converge to global minimum on mean squared error (MSE) at a linear rate (similar to GD); however, the Nesterov accelerated gradient descent (NAG) may only converges to global minimum sublinearly. Our results rely on the connection between neural tangent kernel (NTK) and finite over-parameterized neural networks with ReLU activation, which leads to analyzing the limiting ordinary differential equations (ODE) for optimization algorithms. We show that, optimizing the non-convex loss over the weights corresponds to optimizing some strongly convex loss over the prediction error. As a consequence, we can leverage the classical convex optimization theory to understand the convergence behavior of neural networks. We believe our approach can also be extended to other optimization algorithms and network architectures. △ Less

Submitted 10 March, 2021; v1 submitted 25 October, 2020; originally announced October 2020.

Comments: Accepted to AISTATS 2021

arXiv:1911.11607 [pdf, other]

Deep Learning with Gaussian Differential Privacy

Authors: Zhiqi Bu, **shuo Dong, Qi Long, Weijie J. Su

Abstract: Deep learning models are often trained on datasets that contain sensitive information such as individuals' shop** transactions, personal contacts, and medical records. An increasingly important line of work therefore has sought to train neural networks subject to privacy constraints that are specified by differential privacy or its divergence-based relaxations. These privacy definitions, however… ▽ More Deep learning models are often trained on datasets that contain sensitive information such as individuals' shop** transactions, personal contacts, and medical records. An increasingly important line of work therefore has sought to train neural networks subject to privacy constraints that are specified by differential privacy or its divergence-based relaxations. These privacy definitions, however, have weaknesses in handling certain important primitives (composition and subsampling), thereby giving loose or complicated privacy analyses of training neural networks. In this paper, we consider a recently proposed privacy definition termed \textit{$f$-differential privacy} [18] for a refined privacy analysis of training neural networks. Leveraging the appealing properties of $f$-differential privacy in handling composition and subsampling, this paper derives analytically tractable expressions for the privacy guarantees of both stochastic gradient descent and Adam used in training deep neural networks, without the need of develo** sophisticated techniques as [3] did. Our results demonstrate that the $f$-differential privacy framework allows for a new privacy analysis that improves on the prior analysis~[3], which in turn suggests tuning certain parameters of neural networks for a better prediction accuracy without violating the privacy budget. These theoretically derived improvements are confirmed by our experiments in a range of tasks in image classification, text classification, and recommender systems. Python code to calculate the privacy cost for these experiments is publicly available in the \texttt{TensorFlow Privacy} library. △ Less

Submitted 22 July, 2020; v1 submitted 26 November, 2019; originally announced November 2019.

Comments: To appear in Harvard Data Science Review

arXiv:1907.07502 [pdf, other]

Algorithmic Analysis and Statistical Estimation of SLOPE via Approximate Message Passing

Authors: Zhiqi Bu, Jason Klusowski, Cynthia Rush, Weijie Su

Abstract: SLOPE is a relatively new convex optimization procedure for high-dimensional linear regression via the sorted l1 penalty: the larger the rank of the fitted coefficient, the larger the penalty. This non-separable penalty renders many existing techniques invalid or inconclusive in analyzing the SLOPE solution. In this paper, we develop an asymptotically exact characterization of the SLOPE solution u… ▽ More SLOPE is a relatively new convex optimization procedure for high-dimensional linear regression via the sorted l1 penalty: the larger the rank of the fitted coefficient, the larger the penalty. This non-separable penalty renders many existing techniques invalid or inconclusive in analyzing the SLOPE solution. In this paper, we develop an asymptotically exact characterization of the SLOPE solution under Gaussian random designs through solving the SLOPE problem using approximate message passing (AMP). This algorithmic approach allows us to approximate the SLOPE solution via the much more amenable AMP iterates. Explicitly, we characterize the asymptotic dynamics of the AMP iterates relying on a recently developed state evolution analysis for non-separable penalties, thereby overcoming the difficulty caused by the sorted l1 penalty. Moreover, we prove that the AMP iterates converge to the SLOPE solution in an asymptotic sense, and numerical simulations show that the convergence is surprisingly fast. Our proof rests on a novel technique that specifically leverages the SLOPE problem. In contrast to prior literature, our work not only yields an asymptotically sharp analysis but also offers an algorithmic, flexible, and constructive approach to understanding the SLOPE problem. △ Less

Submitted 17 July, 2019; originally announced July 2019.

Showing 1–16 of 16 results for author: Bu, Z