
Showing 1–41 of 41 results for author: Bu, Z

Searching in archive cs.
  1. arXiv:2407.02772  [pdf, other]

    cs.LG cs.CL cs.CV

    Automatic gradient descent with generalized Newton's method

    Authors: Zhiqi Bu, Shiyun Xu

    Abstract: We propose the generalized Newton's method (GeN) -- a Hessian-informed approach that applies to any optimizer such as SGD and Adam, and covers the Newton-Raphson method as a sub-case. Our method automatically and dynamically selects the learning rate that accelerates the convergence, without the intensive tuning of the learning rate scheduler. In practice, our method is easily implementable, since…

    Submitted 2 July, 2024; originally announced July 2024.

  2. arXiv:2406.07529  [pdf, other]

    cs.LG

    MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation

    Authors: Lu Li, Tianyu Zhang, Zhiqi Bu, Suyuchen Wang, Huan He, Jie Fu, Yonghui Wu, Jiang Bian, Yong Chen, Yoshua Bengio

    Abstract: Model merging has emerged as an effective approach to combine multiple single-task models, fine-tuned from the same pre-trained model, into a multitask model. This process typically involves computing a weighted average of the model parameters without any additional training. Existing model-merging methods focus on enhancing average task accuracy. However, interference and conflicts between the ob…

    Submitted 18 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

  3. arXiv:2403.12787  [pdf, other]

    cs.CV

    DDSB: An Unsupervised and Training-free Method for Phase Detection in Echocardiography

    Authors: Zhenyu Bu, Yang Liu, Jiayu Huo, Jingjing Peng, Kaini Wang, Guangquan Zhou, Rachel Sparks, Prokar Dasgupta, Alejandro Granados, Sebastien Ourselin

    Abstract: Accurate identification of End-Diastolic (ED) and End-Systolic (ES) frames is key for cardiac function assessment through echocardiography. However, traditional methods face several limitations: they require extensive amounts of data, extensive annotations by medical experts, significant training resources, and often lack robustness. Addressing these challenges, we proposed an unsupervised and tra…

    Submitted 19 March, 2024; originally announced March 2024.

  4. arXiv:2402.18925  [pdf, other]

    cs.CV

    PCDepth: Pattern-based Complementary Learning for Monocular Depth Estimation by Best of Both Worlds

    Authors: Haotian Liu, Sanqing Qu, Fan Lu, Zongtao Bu, Florian Roehrbein, Alois Knoll, Guang Chen

    Abstract: Event cameras can record scene dynamics with high temporal resolution, providing rich scene details for monocular depth estimation (MDE) even at low-level illumination. Therefore, existing complementary learning approaches for MDE fuse intensity information from images and scene details from event data for better scene understanding. However, most methods directly fuse two modalities at pixel leve…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: Under Review

  5. arXiv:2402.18752  [pdf, other]

    cs.LG cs.CR

    Pre-training Differentially Private Models with Limited Public Data

    Authors: Zhiqi Bu, Xinwei Zhang, Mingyi Hong, Sheng Zha, George Karypis

    Abstract: The superior performance of large foundation models relies on the use of massive amounts of high-quality data, which often contain sensitive, private and copyrighted material that requires formal protection. While differential privacy (DP) is a prominent method to gauge the degree of security provided to the models, its application is commonly limited to the model fine-tuning stage, due to the per…

    Submitted 28 February, 2024; originally announced February 2024.

  6. arXiv:2311.14632  [pdf, other]

    cs.LG cs.CR

    Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach

    Authors: Xinwei Zhang, Zhiqi Bu, Zhiwei Steven Wu, Mingyi Hong

    Abstract: Differentially Private Stochastic Gradient Descent with Gradient Clipping (DPSGD-GC) is a powerful tool for training deep learning models using sensitive data, providing both a solid theoretical privacy guarantee and high efficiency. However, using DPSGD-GC to ensure Differential Privacy (DP) comes at the cost of model performance degradation due to DP noise injection and gradient clipping. Existi…

    Submitted 17 April, 2024; v1 submitted 24 November, 2023; originally announced November 2023.

  7. arXiv:2311.11822  [pdf, other]

    cs.LG cs.CC cs.CR cs.DC

    Zero redundancy distributed learning with differential privacy

    Authors: Zhiqi Bu, Justin Chiu, Ruixuan Liu, Sheng Zha, George Karypis

    Abstract: Deep learning using large models has achieved great success in a wide range of domains. However, training these models on billions of parameters is very challenging in terms of the training speed, memory cost, and communication efficiency, especially under the privacy-preserving regime with differential privacy (DP). On the one hand, DP optimization has comparable efficiency to the standard non-p…

    Submitted 20 November, 2023; originally announced November 2023.

  8. arXiv:2310.19215  [pdf, other]

    cs.LG cs.CC cs.CR

    On the accuracy and efficiency of group-wise clipping in differentially private optimization

    Authors: Zhiqi Bu, Ruixuan Liu, Yu-Xiang Wang, Sheng Zha, George Karypis

    Abstract: Recent advances have substantially improved the accuracy, memory cost, and training speed of differentially private (DP) deep learning, especially on large vision and language models with millions to billions of parameters. In this work, we thoroughly study the per-sample gradient clipping style, a key component in DP optimization. We show that different clipping styles have the same time complexi…

    Submitted 29 October, 2023; originally announced October 2023.

  9. arXiv:2310.14661  [pdf, other]

    cs.LG stat.ML

    Tractable MCMC for Private Learning with Pure and Gaussian Differential Privacy

    Authors: Yingyu Lin, Yi-An Ma, Yu-Xiang Wang, Rachel Redberg, Zhiqi Bu

    Abstract: Posterior sampling, i.e., exponential mechanism to sample from the posterior distribution, provides $\varepsilon$-pure differential privacy (DP) guarantees and does not suffer from potentially unbounded privacy breach introduced by $(\varepsilon,\delta)$-approximate DP. In practice, however, one needs to apply approximate sampling methods such as Markov chain Monte Carlo (MCMC), thus re-introducing the…

    Submitted 1 May, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

  10. arXiv:2310.03402  [pdf, other]

    cs.CV eess.IV

    A Complementary Global and Local Knowledge Network for Ultrasound denoising with Fine-grained Refinement

    Authors: Zhenyu Bu, Kai-Ni Wang, Fuxing Zhao, Shengxiao Li, Guang-Quan Zhou

    Abstract: Ultrasound imaging serves as an effective and non-invasive diagnostic tool commonly employed in clinical examinations. However, the presence of speckle noise in ultrasound images invariably degrades image quality, impeding the performance of subsequent tasks, such as segmentation and classification. Existing methods for speckle noise reduction frequently induce excessive image smoothing or fail to…

    Submitted 5 October, 2023; originally announced October 2023.

    Comments: Submitted to ICASSP 2024

  11. arXiv:2310.01304  [pdf, other]

    cs.LG

    Coupling public and private gradient provably helps optimization

    Authors: Ruixuan Liu, Zhiqi Bu, Yu-xiang Wang, Sheng Zha, George Karypis

    Abstract: The success of large neural networks is crucially determined by the availability of data. It has been observed that training only on a small amount of public data, or privately on the abundant private data can lead to undesirable degradation of accuracy. In this work, we leverage both private and public data to improve the optimization, by coupling their gradients via a weighted linear combination…

    Submitted 2 October, 2023; originally announced October 2023.

    Comments: 12 pages

  12. arXiv:2305.01794  [pdf, other]

    stat.ME cs.LG

    MISNN: Multiple Imputation via Semi-parametric Neural Networks

    Authors: Zhiqi Bu, Zongyu Dai, Yiliang Zhang, Qi Long

    Abstract: Multiple imputation (MI) has been widely applied to missing value problems in biomedical, social and econometric research, in order to avoid improper inference in the downstream data analysis. In the presence of high-dimensional data, imputation models that include feature selection, especially $\ell_1$ regularized regression (such as Lasso, adaptive Lasso, and Elastic Net), are common choices to…

    Submitted 2 May, 2023; originally announced May 2023.

  13. arXiv:2211.13297  [pdf, other]

    cs.LG stat.ME

    Multiple Imputation with Neural Network Gaussian Process for High-dimensional Incomplete Data

    Authors: Zongyu Dai, Zhiqi Bu, Qi Long

    Abstract: Missing data are ubiquitous in real world applications and, if not adequately handled, may lead to the loss of information and biased findings in downstream analysis. Particularly, high-dimensional incomplete data with a moderate sample size, such as analysis of multi-omics data, present daunting challenges. Imputation is arguably the most popular method for handling missing data, though existing…

    Submitted 21 December, 2022; v1 submitted 23 November, 2022; originally announced November 2022.

  14. arXiv:2211.08942  [pdf, other]

    cs.LG cs.CR cs.CV

    Differentially Private Optimizers Can Learn Adversarially Robust Models

    Authors: Yuan Zhang, Zhiqi Bu

    Abstract: Machine learning models have shone in a variety of domains and attracted increasing attention from both the security and the privacy communities. One important yet worrying question is: Will training models under the differential privacy (DP) constraint have an unfavorable impact on their adversarial robustness? While previous works have postulated that privacy comes at the cost of worse robustnes…

    Submitted 21 November, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

  15. arXiv:2211.04973  [pdf, other]

    cs.LG math.OC

    Accelerating Adversarial Perturbation by 50% with Semi-backward Propagation

    Authors: Zhiqi Bu

    Abstract: Adversarial perturbation plays a significant role in the field of adversarial robustness, which solves a maximization problem over the input data. We show that the backward propagation of such optimization can accelerate $2\times$ (and thus the overall optimization including the forward propagation can accelerate $1.5\times$), without any utility drop, if we only compute the output gradient but no…

    Submitted 9 November, 2022; originally announced November 2022.

  16. arXiv:2211.02396  [pdf, other]

    cs.SI cs.AI

    Rethinking the positive role of cluster structure in complex networks for link prediction tasks

    Authors: Shanfan Zhang, Wenjiao Zhang, Zhan Bu

    Abstract: Clustering is a fundamental problem in network analysis that finds closely connected groups of nodes and separates them from other nodes in the graph, while link prediction is to predict whether two nodes in a network are likely to have a link. The definition of both naturally determines that clustering must play a positive role in obtaining accurate link prediction tasks. Yet researchers have lon…

    Submitted 27 November, 2022; v1 submitted 4 November, 2022; originally announced November 2022.

    Comments: 15 pages, 6 figures

  17. arXiv:2210.12690   

    cs.SI cs.AI

    DyCSC: Modeling the Evolutionary Process of Dynamic Networks Based on Cluster Structure

    Authors: Shanfan Zhang, Zhan Bu

    Abstract: Temporal networks are an important type of network whose topological structure changes over time. Compared with methods on static networks, temporal network embedding (TNE) methods are facing three challenges: 1) it cannot describe the temporal dependence across network snapshots; 2) the node embedding in the latent space fails to indicate changes in the network topology; and 3) it cannot avoid a…

    Submitted 13 December, 2022; v1 submitted 23 October, 2022; originally announced October 2022.

    Comments: The experimental results are incorrect, we need to update them

  18. arXiv:2210.00038  [pdf, other]

    cs.LG cs.CL cs.CR cs.CV

    Differentially Private Optimization on Large Model at Small Cost

    Authors: Zhiqi Bu, Yu-Xiang Wang, Sheng Zha, George Karypis

    Abstract: Differentially private (DP) optimization is the standard paradigm to learn large neural networks that are accurate and privacy-preserving. The computational cost for DP deep learning, however, is notoriously heavy due to the per-sample gradient clipping. Existing DP implementations are 2-1000X more costly in time and space complexity than the standard (non-private) training. In this work, we devel…

    Submitted 18 September, 2023; v1 submitted 30 September, 2022; originally announced October 2022.

  19. arXiv:2210.00036  [pdf, other]

    cs.LG cs.CL cs.CR cs.CV

    Differentially Private Bias-Term Fine-tuning of Foundation Models

    Authors: Zhiqi Bu, Yu-Xiang Wang, Sheng Zha, George Karypis

    Abstract: We study the problem of differentially private (DP) fine-tuning of large pre-trained models -- a recent privacy-preserving approach suitable for solving downstream tasks with sensitive data. Existing work has demonstrated that high accuracy is possible under strong privacy constraint, yet requires significant computational overhead or modifications to the network architecture. We propose different…

    Submitted 18 June, 2024; v1 submitted 30 September, 2022; originally announced October 2022.

    Comments: Accepted at ICML 2024

  20. arXiv:2208.05241  [pdf, other]

    eess.IV cs.CV

    CANet: Channel Extending and Axial Attention Catching Network for Multi-structure Kidney Segmentation

    Authors: Zhenyu Bu, Kai-Ni Wang, Guang-Quan Zhou

    Abstract: Renal cancer is one of the most prevalent cancers worldwide. Clinical signs of kidney cancer include hematuria and low back discomfort, which are quite distressing to the patient. Some surgery-based renal cancer treatments like laparoscopic partial nephrectomy rely on the 3D kidney parsing on computed tomography angiography (CTA) images. Many automatic segmentation techniques have been put forwar…

    Submitted 5 October, 2023; v1 submitted 10 August, 2022; originally announced August 2022.

    Comments: KiPA2022 Challenge

  21. arXiv:2207.09145  [pdf, other]

    eess.AS cs.AI cs.SD

    GAFX: A General Audio Feature eXtractor

    Authors: Zhaoyang Bu, Hanhaodi Zhang, Xiaohu Zhu

    Abstract: Most machine learning models for audio tasks are dealing with a handcrafted feature, the spectrogram. However, it is still unknown whether the spectrogram could be replaced with deep learning based features. In this paper, we answer this question by comparing the different learnable neural networks extracting features with a successful spectrogram model and proposed a General Audio Feature eXtract…

    Submitted 19 July, 2022; originally announced July 2022.

  22. arXiv:2206.07136  [pdf, other]

    cs.LG cs.CL cs.CR cs.CV

    Automatic Clipping: Differentially Private Deep Learning Made Easier and Stronger

    Authors: Zhiqi Bu, Yu-Xiang Wang, Sheng Zha, George Karypis

    Abstract: Per-example gradient clipping is a key algorithmic step that enables practical differentially private (DP) training for deep learning models. The choice of clipping threshold R, however, is vital for achieving high accuracy under DP. We propose an easy-to-use replacement, called automatic clipping, that eliminates the need to tune R for any DP optimizers, including DP-SGD, DP-Adam, DP-LAMB and many…

    Submitted 3 October, 2023; v1 submitted 14 June, 2022; originally announced June 2022.

    Comments: accepted to NeurIPS 2023

  23. arXiv:2205.10683  [pdf, other]

    cs.LG cs.CC cs.CV

    Scalable and Efficient Training of Large Convolutional Neural Networks with Differential Privacy

    Authors: Zhiqi Bu, Jialin Mao, Shiyun Xu

    Abstract: Large convolutional neural networks (CNN) can be difficult to train in the differentially private (DP) regime, since the optimization algorithms require a computationally expensive operation, known as the per-sample gradient clipping. We propose an efficient and scalable implementation of this clipping on convolutional layers, termed as the mixed ghost clipping, that significantly eases the privat…

    Submitted 29 November, 2022; v1 submitted 21 May, 2022; originally announced May 2022.

    Comments: Accepted to NeurIPS 2022

  24. arXiv:2202.12482  [pdf, other]

    stat.ML cs.LG math.ST

    Sparse Neural Additive Model: Interpretable Deep Learning with Feature Selection via Group Sparsity

    Authors: Shiyun Xu, Zhiqi Bu, Pratik Chaudhari, Ian J. Barnett

    Abstract: Interpretable machine learning has demonstrated impressive performance while preserving explainability. In particular, neural additive models (NAM) offer the interpretability to the black-box deep learning and achieve state-of-the-art accuracy among the large family of generalized additive models. In order to empower NAM with feature selection and improve the generalization, we propose the sparse…

    Submitted 24 February, 2022; originally announced February 2022.

  25. arXiv:2112.11507  [pdf, other]

    cs.LG stat.AP

    Multiple Imputation via Generative Adversarial Network for High-dimensional Blockwise Missing Value Problems

    Authors: Zongyu Dai, Zhiqi Bu, Qi Long

    Abstract: Missing data are present in most real world problems and need careful handling to preserve the prediction accuracy and statistical consistency in the downstream analysis. As the gold standard of handling missing data, multiple imputation (MI) methods are proposed to account for the imputation uncertainty and provide proper statistical inference. In this work, we propose Multiple Imputation via G…

    Submitted 21 December, 2021; originally announced December 2021.

  26. arXiv:2109.06583  [pdf, ps, other]

    cs.CV cs.IR

    A Semantic Indexing Structure for Image Retrieval

    Authors: Ying Wang, Tingzhen Liu, Zepeng Bu, Yuhui Huang, Lizhong Gao, Qiao Wang

    Abstract: In large-scale image retrieval, many indexing methods have been proposed to narrow down the searching scope of retrieval. The features extracted from images usually are of high dimensions or unfixed sizes due to the existence of key points. Most of existing index structures suffer from the dimension curse, the unfixed feature size and/or the loss of semantic similarity. In this paper a new classif…

    Submitted 14 September, 2021; originally announced September 2021.

    Comments: 12 pages, 6 figures

    MSC Class: 68T20; ACM Class: H.3.1

  27. arXiv:2107.14458  [pdf, ps, other]

    cs.ET cs.IT eess.SP

    High-Efficiency Resonant Beam Charging and Communication

    Authors: Yunfeng Bai, Qingwen Liu, Xin Wang, Yudan Gou, Bin Zhou, Zhiyong Bu

    Abstract: With the development of Internet of Things (IoT), demands of power and data for IoT devices increase drastically. In order to resolve the supply-demand contradiction, simultaneous wireless information and power transfer (SWIPT) has been envisioned as an enabling technology by providing high-power energy transfer and high-rate data delivering concurrently. In this paper, we introduce a high-efficie…

    Submitted 4 January, 2024; v1 submitted 30 July, 2021; originally announced July 2021.

  28. arXiv:2107.08461  [pdf, other]

    cs.LG cs.CR stat.ML

    Differentially Private Bayesian Neural Networks on Accuracy, Privacy and Reliability

    Authors: Qiyiwen Zhang, Zhiqi Bu, Kan Chen, Qi Long

    Abstract: Bayesian neural network (BNN) allows for uncertainty quantification in prediction, offering an advantage over regular neural networks that has not been explored in the differential privacy (DP) framework. We fill this important gap by leveraging recent development in Bayesian deep learning and privacy accounting to offer a more precise analysis of the trade-off between privacy and accuracy in BNN.…

    Submitted 18 February, 2023; v1 submitted 18 July, 2021; originally announced July 2021.

  29. arXiv:2107.05078  [pdf, other]

    cs.CV cs.HC eess.SP

    A Cloud-Edge-Terminal Collaborative System for Temperature Measurement in COVID-19 Prevention

    Authors: Zheyi Ma, Hao Li, Wen Fang, Qingwen Liu, Bin Zhou, Zhiyong Bu

    Abstract: To prevent the spread of coronavirus disease 2019 (COVID-19), preliminary temperature measurement and mask detection in public areas are conducted. However, the existing temperature measurement methods face the problems of safety and deployment. In this paper, to realize safe and accurate temperature measurement even when a person's face is partially obscured, we propose a cloud-edge-terminal coll…

    Submitted 11 July, 2021; originally announced July 2021.

    Comments: 6 pages, 8 figures, INFOCOMW ICCN 2021

  30. arXiv:2106.11767  [pdf, other]

    cs.CR cs.LG math.ST stat.ML

    Privacy Amplification via Iteration for Shuffled and Online PNSGD

    Authors: Matteo Sordello, Zhiqi Bu, Jinshuo Dong

    Abstract: In this paper, we consider the framework of privacy amplification via iteration, which is originally proposed by Feldman et al. and subsequently simplified by Asoodeh et al. in their analysis via the contraction coefficient. This line of work focuses on the study of the privacy guarantees obtained by the projected noisy stochastic gradient descent (PNSGD) algorithm with hidden intermediate updates…

    Submitted 20 June, 2021; originally announced June 2021.

  31. arXiv:2106.09680  [pdf, other]

    cs.LG cs.CR

    Accuracy, Interpretability, and Differential Privacy via Explainable Boosting

    Authors: Harsha Nori, Rich Caruana, Zhiqi Bu, Judy Hanwen Shen, Janardhan Kulkarni

    Abstract: We show that adding differential privacy to Explainable Boosting Machines (EBMs), a recent method for training interpretable ML models, yields state-of-the-art accuracy while protecting privacy. Our experiments on multiple classification and regression datasets show that DP-EBM models suffer surprisingly little accuracy loss even with strong differential privacy guarantees. In addition to high acc…

    Submitted 17 June, 2021; originally announced June 2021.

    Comments: To be published in ICML 2021. 12 pages, 6 figures

  32. arXiv:2106.07830  [pdf, other]

    cs.LG stat.ML

    On the Convergence and Calibration of Deep Learning with Differential Privacy

    Authors: Zhiqi Bu, Hua Wang, Zongyu Dai, Qi Long

    Abstract: Differentially private (DP) training preserves the data privacy usually at the cost of slower convergence (and thus lower accuracy), as well as more severe mis-calibration than its non-private counterpart. To analyze the convergence of DP training, we formulate a continuous time analysis through the lens of neural tangent kernel (NTK), which characterizes the per-sample gradient clipping and the n…

    Submitted 19 June, 2023; v1 submitted 14 June, 2021; originally announced June 2021.

  33. arXiv:2105.13302  [pdf, other]

    math.ST cs.IT cs.LG eess.SP stat.ML

    Characterizing the SLOPE Trade-off: A Variational Perspective and the Donoho-Tanner Limit

    Authors: Zhiqi Bu, Jason Klusowski, Cynthia Rush, Weijie J. Su

    Abstract: Sorted l1 regularization has been incorporated into many methods for solving high-dimensional statistical estimation problems, including the SLOPE estimator in linear regression. In this paper, we study how this relatively new regularization technique improves variable selection by characterizing the optimal SLOPE trade-off between the false discovery proportion (FDP) and true positive proportion…

    Submitted 5 June, 2022; v1 submitted 27 May, 2021; originally announced May 2021.

    Journal ref: Annals of Statistics 2022

  34. arXiv:2103.12701  [pdf, ps, other]

    cs.AI

    A*+BFHS: A Hybrid Heuristic Search Algorithm

    Authors: Zhaoxing Bu, Richard E. Korf

    Abstract: We present a new algorithm A*+BFHS for solving problems with unit-cost operators where A* and IDA* fail due to memory limitations and/or the existence of many distinct paths between the same pair of nodes. A*+BFHS is based on A* and breadth-first heuristic search (BFHS). A*+BFHS combines advantages from both algorithms, namely A*'s node ordering, BFHS's memory savings, and both algorithms' duplica…

    Submitted 16 December, 2021; v1 submitted 23 March, 2021; originally announced March 2021.

    Comments: 8 pages, 5 figures, 1 table

    ACM Class: I.2.8

  35. arXiv:2102.07211  [pdf, other]

    stat.ML cs.LG stat.ME

    Efficient Designs of SLOPE Penalty Sequences in Finite Dimension

    Authors: Yiliang Zhang, Zhiqi Bu

    Abstract: In linear regression, SLOPE is a new convex analysis method that generalizes the Lasso via the sorted L1 penalty: larger fitted coefficients are penalized more heavily. This magnitude-dependent regularization requires an input of penalty sequence $\lambda$, instead of a scalar penalty as in the Lasso case, thus making the design extremely expensive in computation. In this paper, we propose two efficient…

    Submitted 12 December, 2021; v1 submitted 14 February, 2021; originally announced February 2021.

    Comments: Accepted to AISTATS 2021

  36. arXiv:2102.03013  [pdf, other]

    cs.LG cs.CR

    Fast and Memory Efficient Differentially Private-SGD via JL Projections

    Authors: Zhiqi Bu, Sivakanth Gopi, Janardhan Kulkarni, Yin Tat Lee, Judy Hanwen Shen, Uthaipon Tantipongpipat

    Abstract: Differentially Private-SGD (DP-SGD) of Abadi et al. (2016) and its variations are the only known algorithms for private training of large scale neural networks. This algorithm requires computation of per-sample gradients norms which is extremely slow and memory intensive in practice. In this paper, we present a new framework to design differentially private optimizers called DP-SGD-JL and DP-Adam-…

    Submitted 5 February, 2021; originally announced February 2021.

  37. arXiv:2011.00417  [pdf, other]

    stat.ML cs.LG

    DebiNet: Debiasing Linear Models with Nonlinear Overparameterized Neural Networks

    Authors: Shiyun Xu, Zhiqi Bu

    Abstract: Recent years have witnessed strong empirical performance of over-parameterized neural networks on various tasks and many advances in the theory, e.g. the universal approximation and provable convergence to global minimum. In this paper, we incorporate over-parameterized neural networks into semi-parametric models to bridge the gap between inference and prediction, especially in the high dimensiona…

    Submitted 24 January, 2021; v1 submitted 1 November, 2020; originally announced November 2020.

  38. arXiv:2010.13165  [pdf, other]

    cs.LG math.DS math.OC stat.ML

    A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks

    Authors: Zhiqi Bu, Shiyun Xu, Kan Chen

    Abstract: When equipped with efficient optimization algorithms, the over-parameterized neural networks have demonstrated high level of performance even though the loss function is non-convex and non-smooth. While many works have been focusing on understanding the loss dynamics by training neural networks with the gradient descent (GD), in this work, we consider a broad class of optimization algorithms that…

    Submitted 10 March, 2021; v1 submitted 25 October, 2020; originally announced October 2020.

    Comments: Accepted to AISTATS 2021

  39. arXiv:2007.11078  [pdf, other]

    math.ST cs.IT

    The Complete Lasso Tradeoff Diagram

    Authors: Hua Wang, Yachong Yang, Zhiqi Bu, Weijie J. Su

    Abstract: A fundamental problem in the high-dimensional regression is to understand the tradeoff between type I and type II errors or, equivalently, false discovery rate (FDR) and power in variable selection. To address this important problem, we offer the first complete tradeoff diagram that distinguishes all pairs of FDR and power that can be asymptotically realized by the Lasso with some choice of its pe…

    Submitted 28 October, 2020; v1 submitted 21 July, 2020; originally announced July 2020.

    Comments: To appear in the 34th Conference on Neural Information Processing Systems (NeurIPS 2020)

  40. arXiv:1911.11607  [pdf, other]

    cs.LG cs.CR stat.ML

    Deep Learning with Gaussian Differential Privacy

    Authors: Zhiqi Bu, Jinshuo Dong, Qi Long, Weijie J. Su

    Abstract: Deep learning models are often trained on datasets that contain sensitive information such as individuals' shopping transactions, personal contacts, and medical records. An increasingly important line of work therefore has sought to train neural networks subject to privacy constraints that are specified by differential privacy or its divergence-based relaxations. These privacy definitions, however…

    Submitted 22 July, 2020; v1 submitted 26 November, 2019; originally announced November 2019.

    Comments: To appear in Harvard Data Science Review

  41. arXiv:1907.07502  [pdf, other]

    stat.ML cs.LG eess.SP math.ST

    Algorithmic Analysis and Statistical Estimation of SLOPE via Approximate Message Passing

    Authors: Zhiqi Bu, Jason Klusowski, Cynthia Rush, Weijie Su

    Abstract: SLOPE is a relatively new convex optimization procedure for high-dimensional linear regression via the sorted l1 penalty: the larger the rank of the fitted coefficient, the larger the penalty. This non-separable penalty renders many existing techniques invalid or inconclusive in analyzing the SLOPE solution. In this paper, we develop an asymptotically exact characterization of the SLOPE solution u…

    Submitted 17 July, 2019; originally announced July 2019.