Showing 1–50 of 155 results for author: Zhu, L

  1. arXiv:2406.09795  [pdf, other]

    cs.LG math.NA

    DeltaPhi: Learning Physical Trajectory Residual for PDE Solving

    Authors: Xihang Yue, Linchao Zhu, Yi Yang

    Abstract: Although neural operator networks theoretically approximate any operator mapping, their limited generalization capability prevents them from learning correct physical dynamics when potential data biases exist, particularly in the practical PDE solving scenario where the available data amount is restricted or the resolution is extremely low. To address this issue, we propose and formulate the Physica…

    Submitted 14 June, 2024; originally announced June 2024.

  2. arXiv:2406.08954  [pdf, other]

    math.OC

    S-SOS: Stochastic Sum-Of-Squares for Parametric Polynomial Optimization

    Authors: Richard L. Zhu, Mathias Oster, Yuehaw Khoo

    Abstract: Global polynomial optimization is an important tool across applied mathematics, with many applications in operations research, engineering, and physical sciences. In various settings, the polynomials depend on external parameters that may be random. We discuss a stochastic sum-of-squares (S-SOS) algorithm based on the sum-of-squares hierarchy that constructs a series of semidefinite programs to jo…

    Submitted 13 June, 2024; originally announced June 2024.

  3. arXiv:2405.13266  [pdf, other]

    math.ST

    Nonparametric estimation of FBSDEs with random terminal time

    Authors: Shaolin Ji, Chenyao Yu, Linlin Zhu

    Abstract: This paper investigates the nonparametric estimation of the functional coefficients of the FBSDEs with random terminal time, including the local constant and local linear estimators. We provide complete two-dimensional asymptotics in both the time span and the sampling interval, allowing for the precise characterization of their distribution. Moreover, the empirical likelihood (EL) method to const…

    Submitted 21 May, 2024; originally announced May 2024.

  4. arXiv:2404.17290  [pdf, ps, other]

    math.NA

    Efficient Orthogonal Decomposition with Automatic Basis Extraction for Low-Rank Matrix Approximation

    Authors: Weijie Shen, Weiwei Xu, Lei Zhu

    Abstract: Low-rank matrix approximation plays a ubiquitous role in various applications such as image processing, signal processing, and data analysis. Recently, random algorithms of low-rank matrix approximation have gained widespread adoption due to their speed, accuracy, and robustness, particularly in their improved implementation on modern computer architectures. Existing low-rank approximation algorith…

    Submitted 26 April, 2024; originally announced April 2024.

  5. arXiv:2403.12697  [pdf, other]

    math.AP

    Optimal estimate of electromagnetic field concentration between two nearly-touching inclusions in the quasi-static regime

    Authors: Youjun Deng, Hongyu Liu, Liyan Zhu

    Abstract: We investigate the electromagnetic field concentration between two nearly-touching inclusions that possess high-contrast electric permittivities in the quasi-static regime. By using layer potential techniques and asymptotic analysis in the low-frequency regime, we derive low-frequency expansions that provide integral representations for the solutions of the Maxwell equations. For the leading-order…

    Submitted 19 March, 2024; originally announced March 2024.

  6. arXiv:2403.02051  [pdf, other]

    stat.ML cs.CR cs.LG math.ST

    Differential Privacy of Noisy (S)GD under Heavy-Tailed Perturbations

    Authors: Umut Şimşekli, Mert Gürbüzbalaban, Sinan Yıldırım, Lingjiong Zhu

    Abstract: Injecting heavy-tailed noise to the iterates of stochastic gradient descent (SGD) has received increasing attention over the past few years. While various theoretical properties of the resulting algorithm have been analyzed mainly from learning theory and optimization perspectives, their privacy preservation properties have not yet been established. Aiming to bridge this gap, we provide differenti…

    Submitted 4 March, 2024; originally announced March 2024.

  7. arXiv:2402.12502  [pdf, ps, other]

    math.PR

    Euler-Maruyama schemes for stochastic differential equations driven by stable Lévy processes with i.i.d. stable components

    Authors: Thanh Dang, Lingjiong Zhu

    Abstract: We study Euler-Maruyama numerical schemes of stochastic differential equations driven by stable Lévy processes with i.i.d. stable components. We obtain a uniform-in-time approximation error in Wasserstein distance. Our approximation error has a linear dependence on the stepsize, which is expected to be tight, as can be seen from an explicit calculation for the case of an Ornstein-Uhlenbeck process…

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: 73 pages

  8. arXiv:2401.17958  [pdf, ps, other]

    stat.ML cs.LG math.PR

    Convergence Analysis for General Probability Flow ODEs of Diffusion Models in Wasserstein Distances

    Authors: Xuefeng Gao, Lingjiong Zhu

    Abstract: Score-based generative modeling with probability flow ordinary differential equations (ODEs) has achieved remarkable success in a variety of applications. While various fast ODE-based samplers have been proposed in the literature and employed in practice, the theoretical understanding of the convergence properties of the probability flow ODE is still quite limited. In this paper, we provide the f…

    Submitted 31 January, 2024; originally announced January 2024.

    Comments: 47 pages, 3 tables. arXiv admin note: text overlap with arXiv:2311.11003

  9. arXiv:2312.02421  [pdf, ps, other]

    math.AP

    Inverse conductivity problem with one measurement: Uniqueness of multi-layer structures

    Authors: Lingzheng Kong, Youjun Deng, Liyan Zhu

    Abstract: In this paper, we study the recovery of multi-layer structures in the inverse conductivity problem by using one measurement. First, we define the concept of Generalized Polarization Tensors (GPTs) for multi-layered media and show some important properties of the proposed GPTs. With the help of GPTs, we present the perturbation formula for general multi-layered media. Then we derive the perturbed ele…

    Submitted 4 December, 2023; originally announced December 2023.

    MSC Class: 31A25; 35J05; 86A20

  10. arXiv:2311.11003  [pdf, other]

    cs.LG math.PR stat.ML

    Wasserstein Convergence Guarantees for a General Class of Score-Based Generative Models

    Authors: Xuefeng Gao, Hoang M. Nguyen, Lingjiong Zhu

    Abstract: Score-based generative models (SGMs) are a recent class of deep generative models with state-of-the-art performance in many applications. In this paper, we establish convergence guarantees for a general class of SGMs in 2-Wasserstein distance, assuming accurate score estimates and smooth log-concave data distribution. We specialize our result to several concrete SGMs with specific choices of forwar…

    Submitted 18 November, 2023; originally announced November 2023.

  11. arXiv:2307.15903  [pdf, ps, other]

    math.PR

    Fluctuations and moderate deviations for the mean fields of Hawkes processes

    Authors: Fuqing Gao, Yunshi Gao, Lingjiong Zhu

    Abstract: The Hawkes process is a counting process that has self- and mutually-exciting features with many applications in various fields. In recent years, there has been much interest in the mean-field results of the Hawkes process and its extensions. It is known that the mean-field limit of a multivariate nonlinear Hawkes process is a time-inhomogeneous Poisson process. In this paper, we study the fluct…

    Submitted 29 July, 2023; originally announced July 2023.

    Comments: 38 pages

  12. arXiv:2307.07767  [pdf, ps, other]

    math.ST

    Byzantine-robust distributed one-step estimation

    Authors: Chuhan Wang, Xuehu Zhu, Lixing Zhu

    Abstract: This paper proposes a Robust One-Step Estimator (ROSE) to solve the Byzantine failure problem in distributed M-estimation when a moderate fraction of node machines experience Byzantine failures. To define ROSE, the algorithms use the robust Variance Reduced Median Of the Local (VRMOL) estimator to determine the initial parameter value for iteration, and communicate between the node machines and the…

    Submitted 15 July, 2023; originally announced July 2023.

  13. arXiv:2306.12730  [pdf, other]

    math.OC eess.SP

    Rotation Group Synchronization via Quotient Manifold

    Authors: Linglingzhi Zhu, Chong Li, Anthony Man-Cho So

    Abstract: Rotation group $\mathcal{SO}(d)$ synchronization is an important inverse problem and has attracted intense attention from numerous application fields such as graph realization, computer vision, and robotics. In this paper, we focus on the least-squares estimator of rotation group synchronization with general additive noise models, which is a nonconvex optimization problem with manifold constraints…

    Submitted 22 June, 2023; originally announced June 2023.

  14. arXiv:2306.09084  [pdf, other]

    q-fin.PR math.PR

    Asymptotics for the Laplace transform of the time integral of the geometric Brownian motion

    Authors: Dan Pirjol, Lingjiong Zhu

    Abstract: We present an asymptotic result for the Laplace transform of the time integral of the geometric Brownian motion $F(\theta,T) = \mathbb{E}[e^{-\theta X_T}]$ with $X_T = \int_0^T e^{\sigma W_s + (a - \frac12 \sigma^2)s} ds$, which is exact in the limit $\sigma^2 T \to 0$ at fixed $\sigma^2 \theta T^2$ and $aT$. This asymptotic result is applied to pricing zero coupon bonds in the Dothan model of stochastic interest rates. The asymptoti…

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: 17 pages, 2 figures, 2 tables

    Journal ref: Operations Research Letters 2023, Volume 51, 346-352

  15. arXiv:2306.04815  [pdf, other]

    cs.LG math.OC stat.ML

    Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning

    Authors: Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin

    Abstract: In this paper, we first present an explanation regarding the common occurrence of spikes in the training loss when neural networks are trained with stochastic gradient descent (SGD). We provide evidence that the spikes in the training loss of SGD are "catapults", an optimization phenomenon originally observed in GD with large learning rates in [Lewkowycz et al. 2020]. We empirically show that thes…

    Submitted 5 June, 2024; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: ICML 2024

  16. arXiv:2305.12056  [pdf, ps, other]

    stat.ML cs.LG math.OC

    Uniform-in-Time Wasserstein Stability Bounds for (Noisy) Stochastic Gradient Descent

    Authors: Lingjiong Zhu, Mert Gurbuzbalaban, Anant Raj, Umut Simsekli

    Abstract: Algorithmic stability is an important notion that has proven powerful for deriving generalization bounds for practical algorithms. The last decade has witnessed an increasing number of stability bounds for different algorithms applied on different classes of loss functions. While these bounds have illuminated various properties of optimization algorithms, the analysis of each case typically requir…

    Submitted 28 October, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: 49 pages, NeurIPS 2023

  17. arXiv:2305.01379  [pdf, ps, other]

    stat.ML cs.LG eess.SP math.OC

    LogSpecT: Feasible Graph Learning Model from Stationary Signals with Recovery Guarantees

    Authors: Shangyuan Liu, Linglingzhi Zhu, Anthony Man-Cho So

    Abstract: Graph learning from signals is a core task in Graph Signal Processing (GSP). One of the most commonly used models to learn graphs from stationary signals is SpecT. However, its practical formulation rSpecT is known to be sensitive to hyperparameter selection and, even worse, to suffer from infeasibility. In this paper, we give the first condition that guarantees the infeasibility of rSpecT and des…

    Submitted 2 May, 2023; originally announced May 2023.

  18. arXiv:2304.05602  [pdf, ps, other]

    math.RA

    Ore Extension of Group-cograded Hopf Coquasigroups

    Authors: Lingli Zhu, Bingbing **, Huili Liu, Tao Yang

    Abstract: The aim of this paper is to study the Ore extension of group-cograded Hopf coquasigroups. This paper first shows a categorical interpretation and some examples of group-cograded Hopf coquasigroups, and then gives necessary and sufficient conditions for the Ore extensions of group-cograded Hopf coquasigroups to be group-cograded Hopf coquasigroups. Finally, a certain isomorphism between Ore extensions are…

    Submitted 11 July, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: 15 pages

    MSC Class: 16T05; 16S36

  19. arXiv:2304.04434  [pdf, ps, other]

    math.NA math.AP

    Finite element and integral equation methods to conical diffraction by imperfectly conducting gratings

    Authors: Guanghui Hu, Jiayi Zhang, Linlin Zhu

    Abstract: In this paper we study the variational method and integral equation methods for a conical diffraction problem for imperfectly conducting gratings modeled by the impedance boundary value problem of the Helmholtz equation in periodic structures. We justify the strong ellipticity of the sesquilinear form corresponding to the variational formulation and prove the uniqueness of solutions at any frequen…

    Submitted 10 April, 2023; originally announced April 2023.

  20. arXiv:2304.04204  [pdf, ps, other]

    math.AP

    Well-posedness of grating diffraction problems for plane wave incidence: explicit dependence on wavenumbers and incident angles

    Authors: Linlin Zhu, Guanghui Hu

    Abstract: Suppose that a plane wave is incident onto an impenetrable grating profile of Dirichlet or impedance type or onto a penetrable grating. The grating interface is assumed to be given by a Lipschitz function in two dimensions. We derive a stability estimate for the grating diffraction problem via the variational method, with an explicit dependence of solutions on the incident wavenumber and incident angle.

    Submitted 9 April, 2023; originally announced April 2023.

  21. arXiv:2302.13983  [pdf, other]

    physics.class-ph math-ph math.AP

    Elastostatics with multi-layer metamaterial structures and an algebraic framework for polariton resonances

    Authors: Youjun Deng, Lingzheng Kong, Hongyu Liu, Liyan Zhu

    Abstract: Multi-layer structures are ubiquitous in constructing metamaterial devices to realise various frontier applications including super-resolution imaging and invisibility cloaking. In this paper, we develop a general mathematical framework for studying elastostatics within multi-layer material structures in $\mathbb{R}^d$, $d=2,3$. The multi-layer structure is formed by concentric balls and each laye…

    Submitted 27 January, 2023; originally announced February 2023.

  22. arXiv:2302.05516  [pdf, other]

    stat.ML cs.LG math.OC

    Cyclic and Randomized Stepsizes Invoke Heavier Tails in SGD than Constant Stepsize

    Authors: Mert Gürbüzbalaban, Yuanhan Hu, Umut Şimşekli, Lingjiong Zhu

    Abstract: Cyclic and randomized stepsizes are widely used in deep learning practice and can often outperform standard stepsize choices such as constant stepsize in SGD. Despite their empirical success, not much is currently known about when and why they can theoretically improve the generalization performance. We consider a general class of Markovian stepsizes for learning, which contain i.i.d. random s…

    Submitted 29 August, 2023; v1 submitted 10 February, 2023; originally announced February 2023.

    Comments: To Appear

    Journal ref: Transactions of Machine Learning Research, 2023

  23. arXiv:2301.07585  [pdf, ps, other]

    math.PR

    Large deviations for the mean-field limit of Hawkes processes

    Authors: Fuqing Gao, Lingjiong Zhu

    Abstract: Hawkes processes are a class of simple point processes whose intensity depends on the past history, and is in general non-Markovian. Limit theorems for Hawkes processes in various asymptotic regimes have been studied in the literature. In this paper, we study a multidimensional nonlinear Hawkes process in the asymptotic regime when the dimension goes to infinity, whose mean-field limit is a time-i…

    Submitted 18 January, 2023; originally announced January 2023.

    Comments: 34 pages

  24. arXiv:2301.06619  [pdf, other]

    math.OC math.ST

    Distributionally Robust Learning with Weakly Convex Losses: Convergence Rates and Finite-Sample Guarantees

    Authors: Landi Zhu, Mert Gürbüzbalaban, Andrzej Ruszczyński

    Abstract: We consider a distributionally robust stochastic optimization problem and formulate it as a stochastic two-level composition optimization problem with the use of the mean-semideviation risk measure. In this setting, we consider a single time-scale algorithm, involving two versions of the inner function value tracking: linearized tracking of a continuously differentiable loss function, and SPIDER…

    Submitted 9 June, 2023; v1 submitted 16 January, 2023; originally announced January 2023.

  25. arXiv:2301.06450  [pdf, other]

    q-fin.RM math.PR

    A delayed dual risk model

    Authors: Lingjiong Zhu

    Abstract: In this paper, we study a dual risk model with delays in the spirit of Dassios-Zhao. When a new innovation occurs, there is a delay before the innovation turns into a profit. We obtain large initial surplus asymptotics for the ruin probability and ruin time distributions. For some special cases, we get closed-form formulas. Numerical illustrations will also be provided.

    Submitted 16 January, 2023; originally announced January 2023.

    Comments: 17 pages, 2 figures, 2 tables

    Journal ref: Stochastic Models 33(1), 149-170, 2017

  26. Combinatorial Properties for a Class of Simplicial Complexes Extended from Pseudo-fractal Scale-free Web

    Authors: Zixuan Xie, Yucheng Wang, Wanyue Xu, Liwang Zhu, Wei Li, Zhongzhi Zhang

    Abstract: Simplicial complexes are a popular tool used to model higher-order interactions between elements of complex social and biological systems. In this paper, we study some combinatorial aspects of a class of simplicial complexes created by a graph product, which is an extension of the pseudo-fractal scale-free web. We determine explicitly the independence number, the domination number, and the chromat…

    Submitted 9 January, 2023; originally announced January 2023.

    Comments: accepted by Fractals

  27. arXiv:2212.12978  [pdf, other]

    math.OC cs.LG stat.ML

    Universal Gradient Descent Ascent Method for Nonconvex-Nonconcave Minimax Optimization

    Authors: Taoli Zheng, Linglingzhi Zhu, Anthony Man-Cho So, Jose Blanchet, Jiajin Li

    Abstract: Nonconvex-nonconcave minimax optimization has received intense attention over the last decade due to its broad applications in machine learning. Most existing algorithms rely on one-sided information, such as the convexity (resp. concavity) of the primal (resp. dual) functions, or other specific structures, such as the Polyak-Łojasiewicz (PŁ) and Kurdyka-Łojasiewicz (KŁ) conditions. However, verif…

    Submitted 30 October, 2023; v1 submitted 25 December, 2022; originally announced December 2022.

  28. arXiv:2212.12708  [pdf, ps, other]

    math.SP math.CA

    On classification of singular matrix difference equations of mixed order

    Authors: Li Zhu, Huaqing Sun, Bing Xie

    Abstract: This paper is concerned with singular matrix difference equations of mixed order. The existence and uniqueness of solutions of initial value problems for these equations are derived, and then their classification is obtained by a method similar to the classical Weyl method, by selecting a suitable quasi-difference. An equivalent characterization of this classification is given in terms of the number of linearly ind…

    Submitted 24 December, 2022; originally announced December 2022.

    Comments: 27 pages

    MSC Class: 34B20; 39A27

  29. arXiv:2212.00363  [pdf, ps, other]

    math.RA

    Braided crossed category over crossed group-cograded weak Hopf quasigroups

    Authors: Huili Liu, Lingli Zhu, Tao Yang

    Abstract: In this paper, we generalize the main result in Liu [10] to the weak Hopf coquasigroups case. We first define and study group-cograded weak Hopf quasigroups, which generalize both group-cograded Hopf quasigroups and weak Hopf group-coalgebras. Then we introduce the notion of p-Yetter-Drinfeld weak quasimodule over group-cograded weak Hopf quasigroups H. If the antipode of H is bijective, we show that…

    Submitted 1 December, 2022; originally announced December 2022.

    Comments: 19 pages

    MSC Class: 16T05; 17A01; 18M15

  30. arXiv:2211.10331  [pdf, ps, other]

    math.OC math.NA

    A greedy randomized average block projection method for linear feasibility problems

    Authors: Lin Zhu, Yuan Lei, Jiaxin Xie

    Abstract: The randomized projection (RP) method is a simple iterative scheme for solving linear feasibility problems and has recently gained popularity due to its speed and low memory requirement. This paper develops an accelerated variant of the standard RP method by using two ingredients: the greedy probability criterion and the average block approach, and obtains a greedy randomized average block project…

    Submitted 18 November, 2022; originally announced November 2022.

    Comments: 21 pages

  31. arXiv:2210.03356  [pdf]

    math.NA

    Two Iterative algorithms for the matrix sign function based on the adaptive filtering technology

    Authors: Feng Wu, Keqi Ye, Li Zhu, Yueling Zhao, Jiqiang Hu, Wanxie Zhong

    Abstract: In this paper, two new efficient algorithms for calculating the sign function of a large-scale sparse matrix are proposed by combining a filtering algorithm with the Newton method and the Newton-Schulz method, respectively. Through the theoretical analysis of the error diffusion in the iterative process, we designed an adaptive filtering threshold, which can ensure that the filtering has little impact on…

    Submitted 7 October, 2022; originally announced October 2022.

    Comments: 18 pages, 12 figures

    MSC Class: 65F30; 15A15

  32. arXiv:2209.15106  [pdf, other]

    cs.LG math.OC

    Restricted Strong Convexity of Deep Learning Models with Smooth Activations

    Authors: Arindam Banerjee, Pedro Cisneros-Velarde, Libin Zhu, Mikhail Belkin

    Abstract: We consider the problem of optimization of deep learning models with smooth activation functions. While there exist influential results on the problem from the "near initialization" perspective, we shed considerable new light on the problem. In particular, we make two key technical contributions for such models with $L$ layers, $m$ width, and $\sigma_0^2$ initialization variance. First, for suitable…

    Submitted 29 September, 2022; originally announced September 2022.

  33. arXiv:2209.10825  [pdf, other]

    math.OC cs.LG

    Nonsmooth Nonconvex-Nonconcave Minimax Optimization: Primal-Dual Balancing and Iteration Complexity Analysis

    Authors: Jiajin Li, Linglingzhi Zhu, Anthony Man-Cho So

    Abstract: Nonconvex-nonconcave minimax optimization has gained widespread interest over the last decade. However, most existing works focus on variants of gradient descent-ascent (GDA) algorithms, which are only applicable to smooth nonconvex-concave settings. To address this limitation, we propose a novel algorithm named smoothed proximal linear descent-ascent (smoothed PLDA), which can effectively handle…

    Submitted 26 July, 2023; v1 submitted 22 September, 2022; originally announced September 2022.

  34. arXiv:2208.07044  [pdf, other]

    stat.ME math.ST

    On minimum contrast method for multivariate spatial point processes

    Authors: Lin Zhu, Junho Yang, Mikyoung Jun, Scott Cook

    Abstract: Compared to widely used likelihood-based approaches, the minimum contrast (MC) method offers a computationally efficient method for estimation and inference of spatial point processes. These relative gains in computing time become more pronounced when analyzing complicated multivariate point process models. Despite this, there has been little exploration of the MC method for multivariate spatial p…

    Submitted 2 July, 2024; v1 submitted 15 August, 2022; originally announced August 2022.

  35. arXiv:2208.05683  [pdf]

    math.NA

    A filtering technique for the matrix power series being near-sparse

    Authors: Feng Wu, Li Zhu, Yuelin Zhao, Kailing Zhang

    Abstract: This work presents a new algorithm for matrix power series which is near-sparse, that is, there are a large number of near-zero elements in it. The proposed algorithm uses a filtering technique to improve the sparsity of the matrices involved in the calculation process of the Paterson-Stockmeyer (PS) scheme. Based on the error analysis considering the truncation error and the error introduced by…

    Submitted 11 August, 2022; originally announced August 2022.

  36. arXiv:2206.15335  [pdf, ps, other]

    cs.DC cs.DS math.ST

    Byzantine Agreement with Optimal Resilience via Statistical Fraud Detection

    Authors: Shang-En Huang, Seth Pettie, Leqi Zhu

    Abstract: Since the mid-1980s it has been known that Byzantine Agreement can be solved with probability 1 asynchronously, even against an omniscient, computationally unbounded adversary that can adaptively corrupt up to $f<n/3$ parties. Moreover, the problem is insoluble with $f\geq n/3$ corruptions. However, Bracha's 1984 protocol achieved $f<n/3$ resilience at the cost of exponential expected laten…

    Submitted 30 June, 2022; originally announced June 2022.

  37. arXiv:2206.10346  [pdf, other]

    math.NA

    A new stable and avoiding inversion iteration for computing matrix square root

    Authors: Li Zhu, Keqi Ye, Yuelin Zhao, Feng Wu, Jiqiang Hu, Wanxie Zhong

    Abstract: The objective of this research was to compute the principal matrix square root with sparse approximation. A new stable iterative scheme avoiding full matrix inversion (SIAI) is provided. An analysis of the sparsity and error of the matrices involved during the iterative process is given. Based on the bandwidth and error analysis, a more efficient algorithm combining the SIAI with the filtering t…

    Submitted 21 June, 2022; originally announced June 2022.

    Comments: 19 pages, 3 figures

  38. arXiv:2205.11787  [pdf, other]

    cs.LG math.OC stat.ML

    Quadratic models for understanding catapult dynamics of neural networks

    Authors: Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin

    Abstract: While neural networks can be approximated by linear models as their width increases, certain properties of wide neural networks cannot be captured by linear models. In this work we show that recently proposed Neural Quadratic Models can exhibit the "catapult phase" [Lewkowycz et al. 2020] that arises when training such models with large learning rates. We then empirically show that the behaviour o…

    Submitted 1 May, 2024; v1 submitted 24 May, 2022; originally announced May 2022.

    Comments: accepted in ICLR 2024; changed the title

  39. arXiv:2205.11786  [pdf, other]

    cs.LG math.OC stat.ML

    Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture

    Authors: Libin Zhu, Chaoyue Liu, Mikhail Belkin

    Abstract: In this paper we show that feedforward neural networks corresponding to arbitrary directed acyclic graphs undergo transition to linearity as their "width" approaches infinity. The width of these general networks is characterized by the minimum in-degree of their neurons, except for the input and first layers. Our results identify the mathematical structure underlying transition to linearity and ge…

    Submitted 7 June, 2023; v1 submitted 24 May, 2022; originally announced May 2022.

    Comments: NeurIPS 2022

  40. arXiv:2205.06689  [pdf, other]

    stat.ML cs.LG math.OC

    Heavy-Tail Phenomenon in Decentralized SGD

    Authors: Mert Gurbuzbalaban, Yuanhan Hu, Umut Simsekli, Kun Yuan, Lingjiong Zhu

    Abstract: Recent theoretical studies have shown that heavy tails can emerge in stochastic optimization due to 'multiplicative noise', even under surprisingly simple settings, such as linear regression with Gaussian data. While these studies have uncovered several interesting phenomena, they consider conventional stochastic optimization problems, which exclude decentralized settings that naturally arise in m…

    Submitted 16 May, 2022; v1 submitted 13 May, 2022; originally announced May 2022.

  41. arXiv:2204.13855  [pdf, ps, other]

    math.OC

    A Sampling Control Framework and Applications to Robust and Adaptive Control

    Authors: Lijun Zhu, Zhiyong Chen

    Abstract: In this paper, we propose a novel sampling control framework based on the emulation technique where the sampling error is regarded as an auxiliary input to the emulated system. Utilizing the supremum norm of sampling error, the design of periodic sampling and event-triggered control law renders the error dynamics bounded-input-bounded-state (BIBS), and when coupled with system dynamics, achieves g…

    Submitted 28 April, 2022; originally announced April 2022.

  42. arXiv:2201.12537  [pdf, ps, other]

    math.ST

    Weighted residual empirical processes, martingale transformations and model checking for regressions

    Authors: Falong Tan, Xu Guo, Lixing Zhu

    Abstract: In this paper we propose a new methodology for testing the parametric forms of the mean and variance functions based on weighted residual empirical processes and their martingale transformations in regression models. The dimensions of the parameter vectors can be divergent as the sample size goes to infinity. We then study the convergence of weighted residual empirical processes and their martinga…

    Submitted 29 January, 2022; originally announced January 2022.

  43. arXiv:2112.08046  [pdf, ps, other]

    math.RA

    Yetter-Drinfeld modules for group-cograded Hopf quasigroups

    Authors: Huili Liu, Tao Yang, Lingli Zhu

    Abstract: Let $H$ be a crossed group-cograded Hopf quasigroup. We first introduce the notion of $p$-Yetter-Drinfeld quasimodule over $H$. If the antipode of $H$ is bijective, we show that the category $\mathscr Y\mathscr D\mathscr Q(H)$ of Yetter-Drinfeld quasimodules over $H$ is a crossed category, and the subcategory $\mathscr Y\mathscr D(H)$ of Yetter-Drinfeld modules is a braided crossed category.

    Submitted 28 December, 2021; v1 submitted 15 December, 2021; originally announced December 2021.

    Comments: 19 pages. Added a mirror structure and corrected some typos. Comments are welcome

    MSC Class: 16T05; 17A01; 18M15

  44. arXiv:2112.06556  [pdf, ps, other]

    math.OC stat.ML

    Orthogonal Group Synchronization with Incomplete Measurements: Error Bounds and Linear Convergence of the Generalized Power Method

    Authors: Linglingzhi Zhu, Jinxin Wang, Anthony Man-Cho So

    Abstract: Group synchronization refers to estimating a collection of group elements from the noisy pairwise measurements. Such a nonconvex problem has received much attention from numerous scientific fields including computer vision, robotics, and cryo-electron microscopy. In this paper, we focus on the orthogonal group synchronization problem with general additive noise models under incomplete measurements…

    Submitted 13 December, 2021; originally announced December 2021.

  45. arXiv:2111.05294  [pdf, other]

    cs.CG cs.CE math.OC

    Lattice structure design optimization under localized linear buckling constraints

    Authors: Xingtong Yang, Xinzhuo Hu, Liangchao Zhu, Ming Li

    Abstract: An optimization method for the design of multi-lattice structures satisfying local buckling constraints is proposed in this paper. First, the concept of free material optimization is introduced to find an optimal elastic tensor distribution among all feasible elastic continua. By approximating the elastic tensor under the buckling-containing constraint, a matching lattice structure is embedded in…

    Submitted 9 November, 2021; originally announced November 2021.

    Comments: 12 pages, submitted to Computer-Aided Design

  46. arXiv:2110.15536  [pdf, other]

    math.ST stat.ML

    Optimal prediction for kernel-based semi-functional linear regression

    Authors: Keli Guo, Jun Fan, Lixing Zhu

    Abstract: In this paper, we establish minimax optimal rates of convergence for prediction in a semi-functional linear model that consists of a functional component and a less smooth nonparametric component. Our results reveal that the smoother functional component can be learned with the minimax rate as if the nonparametric component were known. More specifically, a double-penalized least squares method is…

    Submitted 29 October, 2021; originally announced October 2021.

  47. arXiv:2110.04493  [pdf]

    math.NA

    High-performance computation of the exponential of a large sparse matrix

    Authors: Feng Wu, Kailing Zhang, Li Zhu, Jiayao Hu

    Abstract: Computation of the large sparse matrix exponential has been an important topic in many fields, such as network and finite-element analysis. The existing scaling and squaring algorithm (SSA) is not suitable for the computation of the large sparse matrix exponential as it requires more memory and computational cost than is actually needed. By introducing two novel concepts, i.e., real bandwidth…

    Submitted 9 October, 2021; originally announced October 2021.

  48. arXiv:2107.03246  [pdf, ps, other]

    math.AP

    Global-in-time $L^p$-$L^q$ estimates for solutions of the Kramers-Fokker-Planck equation

    Authors: Xue Ping Wang, Lu Zhu

    Abstract: In this work, we prove an optimal global-in-time $L^p$-$L^q$ estimate for solutions to the Kramers-Fokker-Planck equation with short range potential in dimension three. Our result shows that the decay rate as $t \rightarrow +\infty$ is the same as the heat equation in $x$-variables and the divergence rate as $t \rightarrow 0^+$ is related to the sub-ellipticity with loss of 1/3 derivatives of the Kr…

    Submitted 7 July, 2021; originally announced July 2021.

  49. arXiv:2102.10346  [pdf, other]

    math.OC stat.ML

    Convergence Rates of Stochastic Gradient Descent under Infinite Noise Variance

    Authors: Hongjian Wang, Mert Gürbüzbalaban, Lingjiong Zhu, Umut Şimşekli, Murat A. Erdogdu

    Abstract: Recent studies have provided both empirical and theoretical evidence illustrating that heavy tails can emerge in stochastic gradient descent (SGD) in various scenarios. Such heavy tails potentially result in iterates with diverging variance, which hinders the use of conventional convergence analysis techniques that rely on the existence of the second-order moments. In this paper, we provide conver…

    Submitted 20 February, 2021; originally announced February 2021.

  50. arXiv:2012.05046  [pdf, other]

    math.OC cs.DM cs.NE eess.SY

    A multi-objective optimization framework for on-line ridesharing systems

    Authors: Hamed Javidi, Dan Simon, Ling Zhu, Yan Wang

    Abstract: The ultimate goal of ridesharing systems is to match travelers who do not have a vehicle with those travelers who want to share their vehicle. A good match can be found among those who have similar itineraries and time schedules. In this way each rider can be served without any delay and also each driver can earn as much as possible without having too much deviation from their original route. We propose…

    Submitted 7 December, 2020; originally announced December 2020.