Exact phylodynamic likelihood via structured Markov genealogy processes

Authors: Aaron A. King, Qianying Lin, Edward L. Ionides

Abstract: We consider genealogies arising from a Markov population process in which individuals are categorized into a discrete collection of compartments, with the requirement that individuals within the same compartment are statistically exchangeable. When equipped with a sampling process, each such population process induces a time-evolving tree-valued process defined as the genealogy of all sampled individuals. We provide a construction of this genealogy process and derive exact expressions for the likelihood of an observed genealogy in terms of filter equations. These filter equations can be numerically solved using standard Monte Carlo integration methods. Thus, we obtain statistically efficient likelihood-based inference for essentially arbitrary compartment models based on an observed genealogy of individuals sampled from the population.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2405.11709 [pdf, other]

Comparison of Coarsening Dynamics for the Cahn--Hilliard and Burgers--Cahn--Hilliard Equations

Authors: Peter Howard, Adam Larios, Quyuan Lin

Abstract: We consider coarsening dynamics associated with a Burgers--Cahn--Hilliard system modeling a two-phase flow in one space dimension. Our emphasis is on the effect that coupling between the phase and fluid dynamics has on coarsening rates, and on the mechanisms driving this effect. We start with a detailed examination of coarsening dynamics for the uncoupled Cahn--Hilliard equation, comparing numerically generated rates with two analytic methods, and then we consider how these dynamics are affected by appropriate coupling with a viscous Burgers equation. In order to keep the analysis as self-contained as possible, we establish the global well-posedness of the system under consideration.

Submitted 19 May, 2024; originally announced May 2024.

Comments: 55 pages, 14 figures

arXiv:2405.02152 [pdf, ps, other]

On the Three-dimensional Nernst-Planck-Boussinesq System

Authors: Elie Abdo, Ruimeng Hu, Quyuan Lin

Abstract: In this paper, we analyze a three-dimensional Nernst-Planck-Boussinesq (NPB) system that describes ionic electrodiffusion in an incompressible viscous fluid. This new model incorporates variational temperature and is forced by buoyancy force stemming from temperature and salinity fluctuations, enhancing its generality and realism. The electromigration term in the NPB system displays a complex nonlinear structure influenced by the reciprocal of the temperature that distinguishes its mathematical aspects from other electrodiffusion models studied in the literature. We address the global existence of weak solutions to the NPB system on the three-dimensional torus for large initial data. In addition, we study the long-time dynamics of these weak solutions and the associated relative entropies and establish their exponential decay in time to steady states.

Submitted 3 May, 2024; originally announced May 2024.

Comments: 28 pages

arXiv:2404.12597 [pdf, other]

The phase diagram of kernel interpolation in large dimensions

Authors: Haobo Zhang, Weihao Lu, Qian Lin

Abstract: The generalization ability of kernel interpolation in large dimensions (i.e., $n \asymp d^γ$ for some $γ>0$) might be one of the most interesting problems in the recent renaissance of kernel regression, since it may help us understand the 'benign overfitting phenomenon' reported in the neural networks literature. Focusing on the inner product kernel on the sphere, we fully characterized the exact order of both the variance and bias of large-dimensional kernel interpolation under various source conditions $s\geq 0$. Consequently, we obtained the $(s,γ)$-phase diagram of large-dimensional kernel interpolation, i.e., we determined the regions in $(s,γ)$-plane where the kernel interpolation is minimax optimal, sub-optimal and inconsistent.

Submitted 18 April, 2024; originally announced April 2024.

Comments: 18 pages, 1 figure

arXiv:2402.01148 [pdf, other]

The Optimality of Kernel Classifiers in Sobolev Space

Authors: Jianfa Lai, Zhifan Li, Dongming Huang, Qian Lin

Abstract: Kernel methods are widely used in machine learning, especially for classification problems. However, the theoretical analysis of kernel classification is still limited. This paper investigates the statistical performances of kernel classifiers. With some mild assumptions on the conditional probability $η(x)=\mathbb{P}(Y=1\mid X=x)$, we derive an upper bound on the classification excess risk of a kernel classifier using recent advances in the theory of kernel regression. We also obtain a minimax lower bound for Sobolev spaces, which shows the optimality of the proposed classifier. Our theoretical results can be extended to the generalization error of overparameterized neural network classifiers. To make our theoretical results more applicable in realistic settings, we also propose a simple method to estimate the interpolation smoothness of $2η(x)-1$ and apply the method to real datasets.

Submitted 2 February, 2024; originally announced February 2024.

Comments: 21 pages, 2 figures

MSC Class: 62G08 (Primary); 68T07; 46E22 (secondary) ACM Class: G.3

arXiv:2401.11318 [pdf, ps, other]

Global well-posedness and enhanced dissipation for the 2D stochastic Nernst-Planck-Navier-Stokes equations with transport noise

Authors: Quyuan Lin, Rongchang Liu, Weinan Wang

Abstract: In this paper, we consider the 2D stochastic Nernst-Planck-Navier-Stokes equations with transport noise. By assuming the ionic species have the same diffusivity and opposite valences, we prove the global well-posedness of the system. Furthermore, we illustrate the enhanced dissipation phenomenon in the system with specific transportation noise by establishing that it enables an arbitrarily large exponential convergence rate of the solutions.

Submitted 20 January, 2024; originally announced January 2024.

Comments: 29 pages

arXiv:2401.10879 [pdf, ps, other]

Accuracy Analysis of Physics-Informed Neural Networks for Approximating the Critical SQG Equation

Authors: Elie Abdo, Ruimeng Hu, Quyuan Lin

Abstract: We systematically analyze the accuracy of Physics-Informed Neural Networks (PINNs) in approximating solutions to the critical Surface Quasi-Geostrophic (SQG) equation on two-dimensional periodic boxes. The critical SQG equation involves advection and diffusion described by nonlocal periodic operators, posing challenges for neural network-based methods that do not commonly exhibit periodic boundary conditions. In this paper, we present a novel approximation of these operators using their nonperiodic analogs based on singular integral representation formulas and use it to perform error estimates. This idea can be generalized to a larger class of nonlocal partial differential equations whose solutions satisfy prescribed boundary conditions, thereby initiating a new PINNs theory for equations with nonlocalities.

Submitted 19 January, 2024; originally announced January 2024.

Comments: 21 pages

arXiv:2401.01599 [pdf, other]

Generalization Error Curves for Analytic Spectral Algorithms under Power-law Decay

Authors: Yicheng Li, Weiye Gan, Zuoqiang Shi, Qian Lin

Abstract: The generalization error curve of a given kernel regression method aims at determining the exact order of the generalization error under various source conditions, noise levels and choices of the regularization parameter, rather than the minimax rate. In this work, under mild assumptions, we rigorously provide a full characterization of the generalization error curves of the kernel gradient descent method (and a large class of analytic spectral algorithms) in kernel regression. Consequently, we could sharpen the near inconsistency of kernel interpolation and clarify the saturation effects of kernel regression algorithms with higher qualification, etc. Thanks to the neural tangent kernel theory, these results greatly improve our understanding of the generalization behavior of training wide neural networks. A novel technical contribution, the analytic functional argument, might be of independent interest.

Submitted 3 January, 2024; originally announced January 2024.

arXiv:2311.15320 [pdf, other]

Learning Coarse Propagators in Parareal Algorithm

Authors: Bangti Jin, Qingle Lin, Zhi Zhou

Abstract: The parareal algorithm represents an important class of parallel-in-time algorithms for solving evolution equations and has been widely applied in practice. To achieve effective speedup, the choice of the coarse propagator in the algorithm is vital. In this work, we investigate the use of learned coarse propagators. Building upon the error estimation framework, we present a systematic procedure for constructing coarse propagators that enjoy desirable stability and consistent order. Additionally, we provide preliminary mathematical guarantees for the resulting parareal algorithm. Numerical experiments on a variety of settings, e.g., linear diffusion model, Allen-Cahn model, and viscous Burgers model, show that learning can significantly improve parallel efficiency when compared with the more ad hoc choice of some conventional and widely used coarse propagators.

Submitted 26 November, 2023; originally announced November 2023.

Comments: 24 pages

arXiv:2310.20484 [pdf, ps, other]

On the Long-time Dynamics and Ergodicity of the Stochastic Nernst-Planck-Navier-Stokes System

Authors: Elie Abdo, Ruimeng Hu, Quyuan Lin

Abstract: We consider an electrodiffusion model that describes the intricate interplay of multiple ionic species with a two-dimensional, incompressible, viscous fluid subjected to stochastic additive noise. This system involves nonlocal nonlinear drift-diffusion Nernst-Planck equations for ionic species and stochastic Navier-Stokes equations for fluid motion under the influence of electric and time-independent forces. Under the selective boundary conditions imposed on the concentrations, we establish the existence and uniqueness of global pathwise solutions to this system on smooth bounded domains. Our study also investigates long-time ionic concentration dynamics and explores Feller properties of the associated Markovian semigroup. In the context of equal diffusive species and under appropriate conditions, we demonstrate the existence of invariant ergodic measures supported on $H^2$. We then enhance the ergodicity results on periodic tori and obtain smooth invariant measures under a constraint on the initial spatial averages of the concentrations. The uniqueness of the invariant measures on periodic boxes and smooth bounded domains is further established when the noise forces sufficient modes, and the diffusivities of the species are large. Finally, in the case of two ionic species with equal diffusivities and valences of $1$ and $-1$, we study the rate of convergence of the Markov transition kernels to the invariant measure and obtain unconditional, unique exponential ergodicity for the model.

Submitted 31 October, 2023; originally announced October 2023.

Comments: 48 pages

arXiv:2310.10993 [pdf, ps, other]

Deterministic and Stochastic Accelerated Gradient Method for Convex Semi-Infinite Optimization

Authors: Yao Yao, Qihang Lin, Tianbao Yang

Abstract: This paper explores numerical methods for solving a convex differentiable semi-infinite program. We introduce a primal-dual gradient method which performs three updates iteratively: a momentum gradient ascent step to update the constraint parameters, a momentum gradient ascent step to update the dual variables, and a gradient descent step to update the primal variables. Our approach also extends to scenarios where gradients and function values are accessible solely through stochastic oracles. This method extends the recent primal-dual methods, for example, Hamedani and Aybat (2021); Boob et al. (2022), for optimization with a finite number of constraints. We show the iteration complexity of the proposed method for finding an $ε$-optimal solution under different convexity and concavity assumptions on the functions.

Submitted 17 October, 2023; originally announced October 2023.

arXiv:2309.13337 [pdf, other]

On the Asymptotic Learning Curves of Kernel Ridge Regression under Power-law Decay

Authors: Yicheng Li, Haobo Zhang, Qian Lin

Abstract: The widely observed 'benign overfitting phenomenon' in the neural network literature raises the challenge to the 'bias-variance trade-off' doctrine in the statistical learning theory. Since the generalization ability of the 'lazy trained' over-parametrized neural network can be well approximated by that of the neural tangent kernel regression, the curve of the excess risk (namely, the learning curve) of kernel ridge regression attracts increasing attention recently. However, most recent arguments on the learning curve are heuristic and are based on the 'Gaussian design' assumption. In this paper, under mild and more realistic assumptions, we rigorously provide a full characterization of the learning curve: elaborating the effect and the interplay of the choice of the regularization parameter, the source condition and the noise. In particular, our results suggest that the 'benign overfitting phenomenon' exists in very wide neural networks only when the noise level is small.

Submitted 23 September, 2023; originally announced September 2023.

arXiv:2309.06952 [pdf, ps, other]

Anisotropic Viscosities Estimation for the Stochastic Primitive Equations

Authors: Igor Cialenco, Ruimeng Hu, Quyuan Lin

Abstract: The viscosity parameters play a fundamental role in applications involving stochastic primitive equations (SPE), such as accurate weather predictions, climate modeling, and ocean current simulations. In this paper, we develop several novel estimators for the anisotropic viscosities in the SPE, using a finite number of Fourier modes of a single sample path observed within a finite time interval. The focus is on analyzing the consistency and asymptotic normality of these estimators. We consider a torus domain and treat strong, pathwise solutions in the presence of additive white noise (in time). Notably, the analysis for estimating horizontal and vertical viscosities differs due to the unique structure of the SPE, as well as the fact that both parameters of interest are next to the highest order derivative. To the best of our knowledge, this is the first work addressing the estimation of anisotropic viscosities, with potential applicability of the methodology to other models.

Submitted 13 September, 2023; originally announced September 2023.

arXiv:2309.04268 [pdf, other]

Optimal Rate of Kernel Regression in Large Dimensions

Authors: Weihao Lu, Haobo Zhang, Yicheng Li, Manyun Xu, Qian Lin

Abstract: We perform a study on kernel regression for large-dimensional data (where the sample size $n$ depends polynomially on the dimension $d$ of the samples, i.e., $n\asymp d^γ$ for some $γ>0$). We first build a general tool to characterize the upper bound and the minimax lower bound of kernel regression for large-dimensional data through the Mendelson complexity $\varepsilon_{n}^{2}$ and the metric entropy $\bar{\varepsilon}_{n}^{2}$ respectively. When the target function falls into the RKHS associated with a (general) inner product model defined on $\mathbb{S}^{d}$, we utilize the new tool to show that the minimax rate of the excess risk of kernel regression is $n^{-1/2}$ when $n\asymp d^γ$ for $γ=2, 4, 6, 8, \cdots$. We then further determine the optimal rate of the excess risk of kernel regression for all $γ>0$ and find that the curve of the optimal rate varying along $γ$ exhibits several new phenomena, including the multiple descent behavior and the periodic plateau behavior. As an application, for the neural tangent kernel (NTK), we also provide a similar explicit description of the curve of the optimal rate. As a direct corollary, we know these claims hold for wide neural networks as well.

Submitted 28 June, 2024; v1 submitted 8 September, 2023; originally announced September 2023.

MSC Class: 62G08; 46E22; 68T07

arXiv:2307.07605 [pdf, ps, other]

First-order Methods for Affinely Constrained Composite Non-convex Non-smooth Problems: Lower Complexity Bound and Near-optimal Methods

Authors: Wei Liu, Qihang Lin, Yangyang Xu

Abstract: Many recent studies on first-order methods (FOMs) focus on \emph{composite non-convex non-smooth} optimization with linear and/or nonlinear function constraints. Upper (or worst-case) complexity bounds have been established for these methods. However, little can be claimed about their optimality as no lower bound is known, except for a few special \emph{smooth non-convex} cases. In this paper, we make the first attempt to establish lower complexity bounds of FOMs for solving a class of composite non-convex non-smooth optimization with linear constraints. Assuming two different first-order oracles, we establish lower complexity bounds of FOMs to produce a (near) $ε$-stationary point of a problem (and its reformulation) in the considered problem class, for any given tolerance $ε>0$. In addition, we present an inexact proximal gradient (IPG) method by using the more relaxed one of the two assumed first-order oracles. The oracle complexity of the proposed IPG, to find a (near) $ε$-stationary point of the considered problem and its reformulation, matches our established lower bounds up to a logarithmic factor. Therefore, our lower complexity bounds and the proposed IPG method are almost non-improvable.

Submitted 14 July, 2023; originally announced July 2023.

Comments: Key words: non-convex optimization, non-smooth optimization, first-order methods, proximal gradient method, information-based complexity, lower complexity bound, worst-case complexity

arXiv:2307.02777 [pdf, other]

On the Optimality of Functional Sliced Inverse Regression

Authors: Rui Chen, Songtao Tian, Dongming Huang, Qian Lin, Jun S. Liu

Abstract: In this paper, we prove that functional sliced inverse regression (FSIR) achieves the optimal (minimax) rate for estimating the central space in functional sufficient dimension reduction problems. First, we provide a concentration inequality for the FSIR estimator of the covariance of the conditional mean, i.e., $\var(\E[\boldsymbol{X}\mid Y])$. Based on this inequality, we establish the root-$n$ consistency of the FSIR estimator of the image of $\var(\E[\boldsymbol{X}\mid Y])$. Second, we apply the most widely used truncated scheme to estimate the inverse of the covariance operator and identify the truncation parameter which ensures that FSIR can achieve the optimal minimax convergence rate for estimating the central space. Finally, we conduct simulations to demonstrate the optimal choice of truncation parameter and the estimation efficiency of FSIR. To the best of our knowledge, this is the first paper to rigorously prove the minimax optimality of FSIR in estimating the central space for multiple-index models and general $Y$ (not necessarily discrete).

Submitted 6 July, 2023; originally announced July 2023.

arXiv:2306.05054 [pdf, ps, other]

On a conjecture of Conlon, Fox and Wigderson

Authors: Chunchao Fan, Qizhong Lin, Yuanhui Yan

Abstract: For graphs $G$ and $H$, the Ramsey number $r(G,H)$ is the smallest positive integer $N$ such that any red/blue edge coloring of the complete graph $K_N$ contains either a red $G$ or a blue $H$. A book $B_n$ is a graph consisting of $n$ triangles all sharing a common edge. Recently, Conlon, Fox and Wigderson conjectured that for any $0<α<1$, the random lower bound $r(B_{\lceilαn\rceil},B_n)\ge (\sqrtα+1)^2n+o(n)$ is not tight. In other words, there exists some constant $β>(\sqrtα+1)^2$ such that $r(B_{\lceilαn\rceil},B_n)\ge βn$ for all sufficiently large $n$. This conjecture holds for every $α< 1/6$ by a result of Nikiforov and Rousseau from 2005, which says that in this range $r(B_{\lceilαn\rceil},B_n)=2n+3$ for all sufficiently large $n$. We disprove the conjecture of Conlon, Fox and Wigderson. Indeed, we show that the random lower bound is asymptotically tight for every $1/4\leq α\leq 1$. Moreover, we show that for any $1/6\leq α\le 1/4$ and large $n$, $r(B_{\lceilαn\rceil}, B_n)\le\left(\frac 32+3α\right) n+o(n)$, where the inequality is asymptotically tight when $α=1/6$ or $1/4$. We also give a lower bound of $r(B_{\lceilαn\rceil}, B_n)$ for $1/6\leα< \frac{52-16\sqrt{3}}{121}\approx0.2007$, showing that the random lower bound is not tight, i.e., the conjecture of Conlon, Fox and Wigderson holds in this interval.

Submitted 25 January, 2024; v1 submitted 8 June, 2023; originally announced June 2023.

Comments: 16 pages

MSC Class: 05D10

arXiv:2305.14055 [pdf, other]

The Ensemble Approach of Column Generation for Solving Cutting Stock Problems

Authors: Mingjie Hu, Jie Yan, Liting Chen, Qingwei Lin

Abstract: This paper investigates column generation (CG) for solving cutting stock problems (CSP). The traditional CG method, which repeatedly solves a restricted master problem (RMP), often suffers from two critical issues in practice -- the loss of solution quality introduced by linear relaxation of both the feasible domain and the objective, and the high time cost of the last iterations close to convergence. We empirically find that the first issue is common in ordinary CSPs with linear cutting constraints, while the second issue is especially severe in CSPs with nonlinear cutting constraints that are often generated by approximating chance constraints. We propose an alternative approach, ensembles of multiple column generation processes. In particular, we present two methods -- \mc (multi-column), which returns multiple feasible columns in each RMP iteration, and \mt (multi-path), which restarts the RMP iterations from different initialized column sets once the iteration time exceeds a given time limit. The ideas behind them are the same: leverage the multiple column generation paths to compensate for the loss induced by relaxation, and add earlier sub-optimal columns to accelerate convergence of the RMP iterations. Besides, we give a theoretical analysis of performance improvement guarantees. Experiments on cutting stock problems demonstrate that, compared to traditional CG, our method achieves significant run-time reduction on CSPs with nonlinear constraints, and dramatically improves the ratio of solve-to-optimal on CSPs with linear constraints.

Submitted 23 May, 2023; originally announced May 2023.

Comments: 11 pages

MSC Class: 90-00; 90C11 ACM Class: G.1.6

arXiv:2305.11912 [pdf, other]

Follow the Sun and Go with the Wind: Carbon Footprint Optimized Timely E-Truck Transportation

Authors: Junyan Su, Qiulin Lin, Minghua Chen

Abstract: We study the carbon footprint optimization (CFO) of a heavy-duty e-truck traveling from an origin to a destination across a national highway network subject to a hard deadline, by optimizing path planning, speed planning, and intermediary charging planning. Such a CFO problem is essential for carbon-friendly e-truck operations. However, it is notoriously challenging to solve due to (i) the hard deadline constraint, (ii) positive battery state-of-charge constraints, (iii) non-convex carbon footprint objective, and (iv) enormous geographical and temporal charging options with diverse carbon intensity. Indeed, we show that the CFO problem is NP-hard. As a key contribution, we show that under practical settings it is equivalent to finding a generalized restricted shortest path on a stage-expanded graph, which extends the original transportation graph to model charging options. Compared to alternative approaches, our formulation incurs low model complexity and reveals a problem structure useful for algorithm design. We exploit the insights to develop an efficient dual-subgradient algorithm that always converges. As another major contribution, we prove that (i) each iteration only incurs polynomial-time complexity, although it requires solving an integer charging planning problem optimally, and (ii) the algorithm generates optimal results if a condition is met and solutions with bounded optimality loss otherwise. Extensive simulations based on real-world traces show that our scheme reduces the carbon footprint by up to 28% compared to baseline alternatives. The results also demonstrate that e-trucks reduce the carbon footprint by 56% compared with internal combustion engine trucks.

Submitted 17 May, 2023; originally announced May 2023.

arXiv:2305.07241 [pdf, other]

On the Optimality of Misspecified Kernel Ridge Regression

Authors: Haobo Zhang, Yicheng Li, Weihao Lu, Qian Lin

Abstract: In the misspecified kernel ridge regression problem, researchers usually assume the underground true function $f_ρ^{*} \in [\mathcal{H}]^{s}$, a less-smooth interpolation space of a reproducing kernel Hilbert space (RKHS) $\mathcal{H}$ for some $s\in (0,1)$. The existing minimax optimal results require $\|f_ρ^{*}\|_{L^{\infty}}<\infty$ which implicitly requires $s > α_{0}$ where $α_{0}\in (0,1)$ is the embedding index, a constant depending on $\mathcal{H}$. Whether the KRR is optimal for all $s\in (0,1)$ is an outstanding problem lasting for years. In this paper, we show that KRR is minimax optimal for any $s\in (0,1)$ when the $\mathcal{H}$ is a Sobolev RKHS.

Submitted 12 May, 2023; originally announced May 2023.

Comments: 23 pages, 6 figures, The Fortieth International Conference on Machine Learning. arXiv admin note: substantial text overlap with arXiv:2303.14942

arXiv:2305.04340 [pdf, other]

Sliced Inverse Regression with Large Structural Dimensions

Authors: Dongming Huang, Songtao Tian, Qian Lin

Abstract: The central space of a joint distribution $(\vX,Y)$ is the minimal subspace $\mathcal S$ such that $Y\perp\hspace{-2mm}\perp \vX \mid P_{\mathcal S}\vX$ where $P_{\mathcal S}$ is the projection onto $\mathcal S$. Sliced inverse regression (SIR), one of the most popular methods for estimating the central space, often performs poorly when the structural dimension $d=\operatorname{dim}\left( \mathcal S \right)$ is large (e.g., $\geqs 5$). In this paper, we demonstrate that the generalized signal-noise-ratio (gSNR) tends to be extremely small for a general multiple-index model when $d$ is large. Then we determine the minimax rate for estimating the central space over a large class of high dimensional distributions with a large structural dimension $d$ (i.e., there is no constant upper bound on $d$) in the low gSNR regime. This result not only extends the existing minimax rate results for estimating the central space of distributions with fixed $d$ to that with a large $d$, but also clarifies that the degradation in SIR performance is caused by the decay of signal strength. The technical tools developed here might be of independent interest for studying other central space estimation methods.

Submitted 7 May, 2023; originally announced May 2023.

Comments: 63 pages, 44 figures

MSC Class: Primary 62J02; secondary 62C20; 62H12

arXiv:2303.15809 [pdf, ps, other]

doi 10.1093/biomet/asad048

Kernel interpolation generalizes poorly

Authors: Yicheng Li, Haobo Zhang, Qian Lin

Abstract: One of the most interesting problems in the recent renaissance of the studies in kernel regression might be whether the kernel interpolation can generalize well, since it may help us understand the `benign overfitting phenomenon' reported in the literature on deep networks. In this paper, under mild conditions, we show that for any $\varepsilon>0$, the generalization error of kernel interpolation is lower bounded by $Ω(n^{-\varepsilon})$. In other words, the kernel interpolation generalizes poorly for a large class of kernels. As a direct corollary, we can show that overfitted wide neural networks defined on the sphere generalize poorly.

Submitted 1 August, 2023; v1 submitted 28 March, 2023; originally announced March 2023.

arXiv:2303.14942 [pdf, other]

On the Optimality of Misspecified Spectral Algorithms

Authors: Haobo Zhang, Yicheng Li, Qian Lin

Abstract: In the misspecified spectral algorithms problem, researchers usually assume the underground true function $f_ρ^{*} \in [\mathcal{H}]^{s}$, a less-smooth interpolation space of a reproducing kernel Hilbert space (RKHS) $\mathcal{H}$ for some $s\in (0,1)$. The existing minimax optimal results require $\|f_ρ^{*}\|_{L^{\infty}}<\infty$ which implicitly requires $s > α_{0}$ where $α_{0}\in (0,1)$ is the embedding index, a constant depending on $\mathcal{H}$. Whether the spectral algorithms are optimal for all $s\in (0,1)$ is an outstanding problem lasting for years. In this paper, we show that spectral algorithms are minimax optimal for any $α_{0}-\frac{1}β < s < 1$, where $β$ is the eigenvalue decay rate of $\mathcal{H}$. We also give several classes of RKHSs whose embedding index satisfies $ α_0 = \frac{1}β $. Thus, the spectral algorithms are minimax optimal for all $s\in (0,1)$ on these RKHSs.

Submitted 8 August, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

Comments: 48 pages, 2 figures

arXiv:2303.09768 [pdf, ps, other]

Global existence for the stochastic Boussinesq equations with transport noise and small rough data

Authors: Quyuan Lin, Rongchang Liu, Weinan Wang

Abstract: In this paper, we consider the stochastic Boussinesq equations on $\mathbb T^3$ with transport noise and rough initial data. We first prove the existence and uniqueness of the local pathwise solution with initial data in $L^p(Ω;L^p)$ for $p>5$. By assuming additional smallness on the initial data and the noise, we establish the global existence of the pathwise solution.

Submitted 22 March, 2023; v1 submitted 17 March, 2023; originally announced March 2023.

Comments: 24 pages

arXiv:2302.05835 [pdf, ps, other]

Sharp Ramsey thresholds for large books

Authors: Qizhong Lin, Ye Wang

Abstract: For graphs $G$ and $H$, let $G\to H$ signify that any red/blue edge coloring of $G$ contains a monochromatic $H$. Let $G(N,p)$ be the random graph of order $N$ and edge probability $p$. The sharp thresholds for Ramsey properties seemed out of hand until a general technique was introduced by Friedgut ({\em J. AMS} 12 (1999), 1017--1054). In this paper, we obtain the sharp Ramsey threshold for the book graph $B_n^{(k)}$, which consists of $n$ copies of $K_{k+1}$ all sharing a common $K_k$. In particular, for every fixed integer $k\ge 1$ and for any real $c>1$, let $N=c2^k n$. Then for any real $γ>0$, \[ \lim_{n\to \infty} \Pr(G(N,p)\to B_n^{(k)})= \left\{ \begin{array}{cl} 0 & \mbox{if $p\le\frac{1}{c^{1/k}}(1-γ)$,} \\ 1 & \mbox{if $p\ge\frac{1}{c^{1/k}}(1+γ)$}. \end{array} \right. \] The sharp Ramsey threshold $\frac{1}{c^{1/k}}$ for $B_n^{(k)}$, e.g. a star, is positive although its edge density tends to zero.

Submitted 11 February, 2023; originally announced February 2023.

Comments: 13 pages

arXiv:2301.13314 [pdf, other]

Oracle Complexity of Single-Loop Switching Subgradient Methods for Non-Smooth Weakly Convex Functional Constrained Optimization

Authors: Yankun Huang, Qihang Lin

Abstract: We consider a non-convex constrained optimization problem, where the objective function is weakly convex and the constraint function is either convex or weakly convex. To solve this problem, we consider the classical switching subgradient method, which is an intuitive and easily implementable first-order method whose oracle complexity was only known for convex problems. This paper provides the first analysis on the oracle complexity of the switching subgradient method for finding a nearly stationary point of non-convex problems. Our results are derived separately for convex and weakly convex constraints. Compared to existing approaches, especially the double-loop methods, the switching subgradient method can be applied to non-smooth problems and achieves the same complexity using only a single loop, which saves the effort on tuning the number of inner iterations.

Submitted 28 October, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

Comments: This work is published in the proceedings of NeurIPS 2023

arXiv:2301.07810 [pdf, other]

Pathwise Solutions for Stochastic Hydrostatic Euler Equations and Hydrostatic Navier-Stokes Equations Under the Local Rayleigh Condition

Authors: Ruimeng Hu, Quyuan Lin

Abstract: Stochastic factors are not negligible in applications of hydrostatic Euler equations (EE) and hydrostatic Navier-Stokes equations (NSE). Compared with the deterministic cases for which the ill-posedness of these models in the Sobolev spaces can be overcome by imposing the local Rayleigh condition on the initial data, the studies on the well-posedness of stochastic models are still limited. In this paper, we consider the initial data to be a random variable in a certain Sobolev space and satisfy the local Rayleigh condition. We establish the local in time existence and uniqueness of maximal pathwise solutions to the stochastic hydrostatic EE and hydrostatic NSE with multiplicative noise. Compared with previous results on these models (e.g., the existence of martingale solutions in the analytic spaces), our work gives the first result about the existence and uniqueness of solutions to these models in Sobolev spaces, and presents the first result showing the existence of pathwise solutions.

Submitted 18 January, 2023; originally announced January 2023.

Comments: 54 pages, 1 figure

arXiv:2212.12603 [pdf, ps, other]

Stochastic Methods for AUC Optimization subject to AUC-based Fairness Constraints

Authors: Yao Yao, Qihang Lin, Tianbao Yang

Abstract: As machine learning is being used increasingly in making high-stakes decisions, an arising challenge is to avoid unfair AI systems that lead to discriminatory decisions for protected populations. A direct approach for obtaining a fair predictive model is to train the model through optimizing its prediction performance subject to fairness constraints, which achieves Pareto efficiency when trading off performance against fairness. Among various fairness metrics, the ones based on the area under the ROC curve (AUC) are emerging recently because they are threshold-agnostic and effective for unbalanced data. In this work, we formulate the training problem of a fairness-aware machine learning model as an AUC optimization problem subject to a class of AUC-based fairness constraints. This problem can be reformulated as a min-max optimization problem with min-max constraints, which we solve by stochastic first-order methods based on a new Bregman divergence designed for the special structure of the problem. We numerically demonstrate the effectiveness of our approach on real-world data under different fairness metrics.

Submitted 22 February, 2023; v1 submitted 23 December, 2022; originally announced December 2022.

Comments: Published in AISTATS 2023

arXiv:2212.07234 [pdf, ps, other]

Two Ramsey-Turán numbers involving triangles

Authors: Xinyu Hu, Qizhong Lin

Abstract: Given integers $p, q\ge2$, we say that a graph $G$ is $(K_p,K_q)$-free if there exists a red/blue edge coloring of $G$ such that it contains neither a red $K_p$ nor a blue $K_q$. Fix a function $f(n)$, the Ramsey-Turán number $RT(n,p,q,f(n))$ is the maximum number of edges in an $n$-vertex $(K_p,K_q)$-free graph with independence number at most $f(n)$. For any $δ>0$, let $ρ(p, q,δ) = \mathop {\lim }\limits_{n \to \infty } \frac{RT(n,p, q,δn)}{n^2}$. We always call $ρ(p, q):= \mathop {\lim }\limits_{δ\to 0}ρ(p, q,δ)$ the Ramsey-Turán density of $K_p$ and $K_q$. In 1993, Erdős, Hajnal, Simonovits, Sós and Szemerédi proposed to determine the value of $ρ(3,q)$ for $q\ge3$, and they conjectured that for $q \ge 2$, $ρ\left( {3,2q - 1} \right) = \frac{1}{2}(1 - \frac{1}{r(3,q) - 1})$. Recently, Kim, Kim and Liu (2019) conjectured that for $q \ge 2$, $ρ( {3,2q } ) = \frac{1}{2}( 1 - \frac{1}{r( {3,q} )})$. Erdős et al. (1993) determined $ρ(3,q)$ for $q=3,4,5$ and $ρ(4,4)$. There is no progress on the Ramsey-Turán density $ρ(p, q)$ in the past thirty years. In this paper, we obtain $ρ(3,6)=\frac{5}{12}$ and $ρ(3,7)=\frac{7}{16}$. Moreover, we show that the corresponding asymptotically extremal structures are weakly stable, which answers a problem of Erdős et al. (1993) for the two cases.

Submitted 14 June, 2023; v1 submitted 14 December, 2022; originally announced December 2022.

Comments: 30 pages. The proofs have been slightly revised, especially the second part

arXiv:2210.13998 [pdf, ps, other]

Ramsey numbers of large even cycles and fans

Authors: Chunlin You, Qizhong Lin

Abstract: For graphs $F$ and $H$, the Ramsey number $R(F, H)$ is the smallest positive integer $N$ such that any red/blue edge coloring of $K_N$ contains either a red $F$ or a blue $H$. Let $C_n$ be a cycle of length $n$ and $F_n$ be a fan consisting of $n$ triangles all sharing a common vertex. In this paper, we prove that for all sufficiently large $n$, \[ R(C_{2\lfloor an\rfloor}, F_n)= \left\{ \begin{array}{ll} (2+2a+o(1))n & \textrm{if $1/2\leq a< 1$,}\\ (4a+o(1))n & \textrm{if $ a\geq 1$.} \end{array} \right. \]

Submitted 25 October, 2022; originally announced October 2022.

arXiv:2210.06800 [pdf, ps, other]

Dirichlet problem for Schrödinger operators on Heisenberg groups

Authors: Ji Li, Qingze Lin, Liang Song

Abstract: We investigate the Dirichlet problem associated to the Schrödinger operator $\mathcal L=-Δ_{\mathbb{H}^n}+V$ on Heisenberg group $\mathbb H^n$: \begin{align*} \begin{cases} \partial_{ss}u(g,s)-\mathcal L u(g,s)=0\,,\quad &{\rm in \,\ } \mathbb{H}^n\times\mathbb{R}^+,\\ u(g,0)=f \,,\quad &{\rm on \,\ } \mathbb{H}^n \end{cases} \end{align*} with $f$ in $L^p(\mathbb{H}^n)$ ($1< p<\infty$) and in $H^1_{\mathcal L}(\mathbb{H}^n)$, i.e., the Hardy space associated with $\mathcal L$. Here $Δ_{\mathbb{H}^n}$ is the sub-Laplacian on $\mathbb H^n$ and the nonnegative potential $V$ belongs to the reverse Hölder class $B_{Q/2}$ with $Q$ the homogeneous dimension of $\mathbb{H}^n$. The new approach is to establish a suitable weak maximum principle, which is the key to solve this problem under the condition $V\in B_{Q/2}$. This result is new even back to $\mathbb R^n$ (the condition will become $V\in B_{n/2}$) since the previous known result requires $V\in B_{(n+1)/2}$ which went through a Liouville type theorem.

Submitted 13 October, 2022; originally announced October 2022.

Comments: 17 pages

arXiv:2209.11929 [pdf, other]

Higher-Order error estimates for physics-informed neural networks approximating the primitive equations

Authors: Ruimeng Hu, Quyuan Lin, Alan Raydan, Sui Tang

Abstract: Large-scale dynamics of the oceans and the atmosphere are governed by primitive equations (PEs). Due to the nonlinearity and nonlocality, the numerical study of the PEs is generally challenging. Neural networks have been shown to be a promising machine learning tool to tackle this challenge. In this work, we employ physics-informed neural networks (PINNs) to approximate the solutions to the PEs and study the error estimates. We first establish the higher-order regularity for the global solutions to the PEs with either full viscosity and diffusivity, or with only the horizontal ones. Such a result for the case with only the horizontal ones is new and required in the analysis under the PINNs framework. Then we prove the existence of two-layer tanh PINNs of which the corresponding training error can be arbitrarily small by taking the width of PINNs to be sufficiently wide, and the error between the true solution and its approximation can be arbitrarily small provided that the training error is small enough and the sample set is large enough. In particular, all the estimates are a priori, and our analysis includes higher-order (in spatial Sobolev norm) error estimates. Numerical results on prototype systems are presented to further illustrate the advantage of using the $H^s$ norm during the training.

Submitted 17 March, 2023; v1 submitted 24 September, 2022; originally announced September 2022.

Comments: 30 pages

arXiv:2208.05829 [pdf, ps, other]

Fan-complete Ramsey numbers

Authors: Fan Chung, Qizhong Lin

Abstract: We consider Ramsey numbers $r(G,H)$ with tight lower bounds, namely, \begin{align*} r(G,H) \geq (χ(G)-1)(|H|-1)+1, \end{align*} where $χ(G)$ denotes the chromatic number of $G$ and $|H|$ denotes the number of vertices in $H$. We say $H$ is $G$-good if the equality holds. In this paper, we prove that the fan-graph $F_n=K_1 + n K_2$ is $K_p$-good if $n\geq 27p^2$, improving previous tower-type lower bounds for $n$ due to Li and Rousseau (1996). The join graph $G+H$ is defined by adding all edges between the disjoint vertex sets of $G$ and $H$. Let $nH$ denote the union graph of $n$ disjoint copies of $H$. We show that $K_1+nH$ is $K_p$-good if $n$ is sufficiently large. We give a stronger lower bound inequality for Ramsey number $r(G, K_1+F)$ for the case of $G=K_p(a_1, a_2, \dots, a_p)$, the complete $p$-partite graph with $a_1=1$ and $a_i \leq a_{i+1}$. In particular, using a stability-supersaturation lemma by Fox, He and Wigderson (2021), we show that for any fixed graph $H$, \begin{align*} r(G,K_1+nH) = \left\{ \begin{array}{ll} (p-1)(n |H|+a_2-1)+1 & \textrm{if $n|H|+a_2-1$ or $a_2-1$ is even,}\\ (p-1)(n |H|+a_2-2)+1 & \textrm{otherwise,} \end{array} \right. \end{align*} where $G=K_p(1,a_2, \dots, a_p)$ with $a_i$'s satisfying some mild conditions and $n$ is sufficiently large. The special case of $H=K_1$ gives an answer to Burr's question (1981) about the discrepancy of $r(G, K_{1,n})$ from $G$-goodness for sufficiently large $n$. All bounds of $n$ we obtain are not of tower-types.

Submitted 11 August, 2022; originally announced August 2022.

Comments: 15 pages

MSC Class: 05D10

arXiv:2207.11122 [pdf, other]

doi 10.1145/3534678.3539334

Solving the Batch Stochastic Bin Packing Problem in Cloud: A Chance-constrained Optimization Approach

Authors: Jie Yan, Yunlei Lu, Liting Chen, Si Qin, Yixin Fang, Qingwei Lin, Thomas Moscibroda, Saravan Rajmohan, Dongmei Zhang

Abstract: This paper investigates a critical resource allocation problem in the first party cloud: scheduling containers to machines. There are tens of services and each service runs a set of homogeneous containers with dynamic resource usage; containers of a service are scheduled daily in a batch fashion. This problem can be naturally formulated as Stochastic Bin Packing Problem (SBPP). However, traditional SBPP research often focuses on cases of empty machines, whose objective, i.e., to minimize the number of used machines, is not well-defined for the more common reality with nonempty machines. This paper aims to close this gap. First, we define a new objective metric, Used Capacity at Confidence (UCaC), which measures the maximum used resources at a probability and is proved to be consistent for both empty and nonempty machines, and reformulate the SBPP under chance constraints. Second, by modeling the container resource usage distribution in a generative approach, we reveal that UCaC can be approximated with Gaussian, which is verified by trace data of real-world applications. Third, we propose an exact solver by solving the equivalent cutting stock variant as well as two heuristics-based solvers -- UCaC best fit, bi-level heuristics. We experimentally evaluate these solvers on both synthetic datasets and real application traces, demonstrating our methodology's advantage over traditional SBPP optimal solver minimizing the number of used machines, with a low rate of resource violations.

Submitted 20 July, 2022; originally announced July 2022.

Comments: To appear in SIGKDD 2022 as Research Track paper

MSC Class: 90-00; 90C11

ACM Class: G.1.6

arXiv:2204.03462 [pdf, ps, other]

Ramsey non-goodness involving books

Authors: Chunchao Fan, Qizhong Lin

Abstract: In 1983, Burr and Erdős initiated the study of Ramsey goodness problems. Nikiforov and Rousseau (2009) resolved almost all goodness questions raised by Burr and Erdős, in which the bounds on the parameters are of tower type since their proofs rely on the regularity lemma. Let $B_{k,n}$ be the book graph on $n$ vertices which consists of $n-k$ copies of $K_{k+1}$ all sharing a common $K_k$, and let $H=K_p(a_1,\dots,a_{p})$ be the complete $p$-partite graph with parts of sizes $a_1,\dots,a_{p}$. Recently, avoiding use of the regularity lemma, Fox, He and Wigderson (2021) revisit several Ramsey goodness results involving books. They comment that it would be very interesting to see how far one can push these ideas. In particular, they conjecture that for all integers $k, p, t\ge 2$, there exists some $δ>0$ such that for all $n\ge 1$, $1\leq a_1\le\cdots\le a_{p-1}\le t$ and $a_p \le δn$, we have $r(H, B_{k,n})= (p-1)(n-1)+d_k(n,K_{a_1,a_2})+1,$ where $d_k(n,K_{a_1,a_2})$ is the maximum $d$ for which there is an $(n+d-1)$-vertex $K_{a_1,a_2}$-free graph in which at most $k-1$ vertices have degree less than $d$. They verify the conjecture when $a_1=a_2=1$. Building upon the work of Fox et al. (2021), we make a substantial step by showing that the conjecture "roughly" holds if $a_1=1$ and $a_2|(n-1-k)$, i.e. $a_2$ divides $n-1-k$. Moreover, avoiding use of the regularity lemma, we prove that for every $k, a\geq 1$ and $p\ge2$, there exists $δ>0$ such that for all large $n$ and $b\le δ\ln n$, $r(K_p(1,a,b,\dots,b), B_{k,n})= (p-1)(n-1)+k(p-1)(a-1)+1$ if $a|(n-1-k)$, where the case when $a=1$ has been proved by Nikiforov and Rousseau (2009) using the regularity lemma. The bounds on $1/δ$ we obtain are not of tower-type since our proofs do not rely on the regularity lemma.

Submitted 7 April, 2022; originally announced April 2022.

Comments: 16 pages. arXiv admin note: text overlap with arXiv:2109.09205 by other authors

arXiv:2203.09703 [pdf, ps, other]

Cutting plane algorithms for nonlinear binary optimization

Authors: Hoa T. Bui, Qun Lin, Ryan Loxton

Abstract: Current state-of-the-art methods for solving discrete optimization problems are usually restricted to convex settings. In this paper, we propose a general approach based on cutting planes for solving nonlinear, possibly nonconvex, binary optimization problems. We provide a rigorous convergence analysis that quantifies the number of iterations required under different conditions. This is different to most other work in discrete optimization where only finite convergence is proved. Moreover, using tools from variational analysis, we provide necessary and sufficient dual optimality conditions.

Submitted 17 March, 2022; originally announced March 2022.

MSC Class: 90-08

arXiv:2203.04922 [pdf, ps, other]

doi 10.1007/s00021-022-00705-3

On the effect of fast rotation and vertical viscosity on the lifespan of the $3D$ primitive equations

Authors: Quyuan Lin, Xin Liu, Edriss S. Titi

Abstract: We study the effect of the fast rotation and vertical viscosity on the lifespan of solutions to the three-dimensional primitive equations (also known as the hydrostatic Navier-Stokes equations) with impermeable and stress-free boundary conditions. Firstly, for a short time interval, independent of the rate of rotation $|Ω|$, we establish the local well-posedness of solutions with initial data that is analytic in the horizontal variables and only $L^2$ in the vertical variable. Moreover, it is shown that the solutions immediately become analytic in all the variables with increasing-in-time (at least linearly) radius of analyticity in the vertical variable for as long as the solutions exist. On the other hand, the radius of analyticity in the horizontal variables might decrease with time, but as long as it remains positive the solution exists. Secondly, with fast rotation, i.e., large $|Ω|$, we show that the existence time of the solution can be prolonged, with "well-prepared" initial data. Finally, in the case of two spatial dimensions with $Ω=0$, we establish the global well-posedness provided that the initial data is small enough. The smallness condition on the initial data depends on the vertical viscosity and the initial radius of analyticity in the horizontal variables.

Submitted 8 June, 2022; v1 submitted 9 March, 2022; originally announced March 2022.

Comments: 45 pages

MSC Class: 35Q35; 35Q86; 86A10; 76E07

arXiv:2203.01505 [pdf, ps, other]

Large-scale Optimization of Partial AUC in a Range of False Positive Rates

Authors: Yao Yao, Qihang Lin, Tianbao Yang

Abstract: The area under the ROC curve (AUC) is one of the most widely used performance measures for classification models in machine learning. However, it summarizes the true positive rates (TPRs) over all false positive rates (FPRs) in the ROC space, which may include the FPRs with no practical relevance in some applications. The partial AUC, as a generalization of the AUC, summarizes only the TPRs over a specific range of the FPRs and is thus a more suitable performance measure in many real-world situations. Although partial AUC optimization in a range of FPRs had been studied, existing algorithms are not scalable to big data and not applicable to deep learning. To address this challenge, we cast the problem into a non-smooth difference-of-convex (DC) program for any smooth predictive functions (e.g., deep neural networks), which allowed us to develop an efficient approximated gradient descent method based on the Moreau envelope smoothing technique, inspired by recent advances in non-smooth DC optimization. To increase the efficiency of large data processing, we used an efficient stochastic block coordinate update in our algorithm. Our proposed algorithm can also be used to minimize the sum of ranked range loss, which also lacks efficient solvers. We established a complexity of $\tilde O(1/ε^6)$ for finding a nearly $ε$-critical solution. Finally, we numerically demonstrated the effectiveness of our proposed algorithms for both partial AUC maximization and sum of ranked range loss minimization.

Submitted 27 October, 2022; v1 submitted 2 March, 2022; originally announced March 2022.
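
The partial AUC described in the abstract above can be illustrated with a small empirical estimator. This is a minimal sketch, not the authors' optimization algorithm: it assumes the common pairwise form in which an FPR range (alpha, beta] is mapped to the negatives ranked between alpha*n and beta*n by descending score, and it returns a value normalized to [0, 1]. The function name `partial_auc` and the rounding of alpha*n and beta*n are assumptions made for illustration.

```python
def partial_auc(pos_scores, neg_scores, alpha, beta):
    """Normalized empirical partial AUC over the FPR range (alpha, beta].

    Assumption: the FPR range is mapped to the negatives ranked
    round(alpha*n)+1 .. round(beta*n) when sorted by descending score.
    """
    if not 0.0 <= alpha < beta <= 1.0:
        raise ValueError("need 0 <= alpha < beta <= 1")
    n = len(neg_scores)
    lo, hi = int(round(alpha * n)), int(round(beta * n))
    # Hardest negatives first: these are the ones that create false
    # positives at the lowest FPR thresholds.
    selected = sorted(neg_scores, reverse=True)[lo:hi]
    if not pos_scores or not selected:
        raise ValueError("FPR range selects no negatives, or no positives given")
    # Count correctly ordered (positive, negative) pairs; ties count one half.
    wins = 0.0
    for p in pos_scores:
        for q in selected:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(selected))


pos = [0.9, 0.8, 0.4]
neg = [0.7, 0.5, 0.3, 0.1]
full_auc = partial_auc(pos, neg, 0.0, 1.0)   # ordinary AUC
low_fpr = partial_auc(pos, neg, 0.0, 0.5)    # only the two top-scored negatives
```

With alpha = 0 and beta = 1 the estimator reduces to the ordinary AUC; restricting the range to low FPRs keeps only the highest-scoring negatives, which is why the restricted value can be lower than the full AUC.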

arXiv:2201.05924 [pdf, ps, other]

doi 10.1007/s40072-022-00266-6

Local Martingale Solutions and Pathwise Uniqueness for the Three-dimensional Stochastic Inviscid Primitive Equations

Authors: Ruimeng Hu, Quyuan Lin

Abstract: We study the stochastic effect on the three-dimensional inviscid primitive equations (PEs, also called the hydrostatic Euler equations). Specifically, we consider a larger class of noises than multiplicative noises, and work in the analytic function space due to the ill-posedness in Sobolev spaces of PEs without horizontal viscosity. Under proper conditions, we prove the local existence of martingale solutions and pathwise uniqueness. By adding vertical viscosity, i.e., considering the hydrostatic Navier-Stokes equations, we can relax the restriction on initial conditions to be only analytic in the horizontal variables with Sobolev regularity in the vertical variable, and allow the transport noise in the vertical direction. We establish the local existence of martingale solutions and pathwise uniqueness, and show that the solutions become analytic in the vertical variable instantaneously as $t>0$ and the vertical analytic radius increases as long as the solutions exist.

Submitted 15 January, 2022; originally announced January 2022.

Comments: 34 pages

arXiv:2201.01169 [pdf, other]

Inexact accelerated proximal gradient method with line search and reduced complexity for affine-constrained and bilinear saddle-point structured convex problems

Authors: Qihang Lin, Yangyang Xu

Abstract: The goal of this paper is to reduce the total complexity of gradient-based methods for two classes of problems: affine-constrained composite convex optimization and bilinear saddle-point structured non-smooth convex optimization. Our technique is based on a double-loop inexact accelerated proximal gradient (APG) method for minimizing the summation of a non-smooth but proximable convex function and two smooth convex functions with different smoothness constants and computational costs. Compared to the standard APG method, the inexact APG method can reduce the total computation cost if one smooth component has higher computational cost but a smaller smoothness constant than the other. With this property, the inexact APG method can be applied to approximately solve the subproblems of a proximal augmented Lagrangian method for affine-constrained composite convex optimization and the smooth approximation for bilinear saddle-point structured non-smooth convex optimization, where the smooth function with a smaller smoothness constant has significantly higher computational cost. Thus it can reduce total complexity for finding an approximately optimal/stationary solution. This technique is similar to the gradient sliding technique in the literature. The difference is that our inexact APG method can efficiently stop the inner loop by using a computable condition based on a measure of stationarity violation, while the gradient sliding methods need to pre-specify the number of iterations for the inner loop. Numerical experiments demonstrate significantly higher efficiency of our methods over an optimal primal-dual first-order method and the gradient sliding methods.

Submitted 4 January, 2022; originally announced January 2022.

arXiv:2201.00675 [pdf, ps, other]

Three-color Ramsey number of an odd cycle versus bipartite graphs with small bandwidth

Authors: Chunlin You, Qizhong Lin

Abstract: A graph $\mathcal{H}=(W,E_\mathcal{H})$ is said to have {\em bandwidth} at most $b$ if there exists a labeling of $W$ as $w_1,w_2,\dots,w_n$ such that $|i-j|\leq b$ for every edge $w_iw_j\in E_\mathcal{H}$. We say that $\mathcal{H}$ is a {\em balanced $(β,Δ)$-graph} if it is a bipartite graph with bandwidth at most $β|W|$ and maximum degree at most $Δ$, and it also has a proper 2-coloring $χ:W\rightarrow[2]$ such that $||χ^{-1}(1)|-|χ^{-1}(2)||\leqβ|χ^{-1}(2)|$. In this paper, we prove that for every $γ>0$ and every natural number $Δ$, there exists a constant $β>0$ such that for every balanced $(β,Δ)$-graph $\mathcal{H}$ on $n$ vertices we have $$R(\mathcal{H}, \mathcal{H}, C_n) \leq (3+γ)n$$ for all sufficiently large odd $n$. The upper bound is sharp for several classes of graphs. Let $θ_{n,t}$ be the graph consisting of $t$ internally disjoint paths of length $n$ all sharing the same endpoints. As a corollary, for each fixed $t\geq 1$, $R(θ_{n, t},θ_{n, t}, C_{nt+λ})=(3t+o(1))n,$ where $λ=0$ if $nt$ is odd and $λ=1$ if $nt$ is even. In particular, we have $R(C_{2n},C_{2n}, C_{2n+1})=(6+o(1))n$, which is a special case of a result of Figaj and Łuczak (2018).

Submitted 14 March, 2022; v1 submitted 3 January, 2022; originally announced January 2022.

Comments: 17 pages
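
The bandwidth definition in the abstract above (a labeling $w_1,\dots,w_n$ with $|i-j|\leq b$ on every edge) can be checked directly on small graphs. The following is a brute-force sketch for illustration only: the helper names `labeling_bandwidth` and `bandwidth` are hypothetical, and trying every labeling is feasible only for a handful of vertices.

```python
from itertools import permutations


def labeling_bandwidth(edges, position):
    """Largest |i - j| over all edges, where position[v] is the label of v."""
    return max(abs(position[u] - position[v]) for u, v in edges)


def bandwidth(vertices, edges):
    """Smallest labeling bandwidth over all labelings (exhaustive search)."""
    best = None
    for order in permutations(vertices):
        position = {v: i for i, v in enumerate(order)}
        width = labeling_bandwidth(edges, position)
        if best is None or width < best:
            best = width
    return best


path_edges = [(0, 1), (1, 2), (2, 3)]           # path on 4 vertices
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]  # cycle C_4
star_edges = [(0, 1), (0, 2), (0, 3)]           # star with 3 leaves

path_bw = bandwidth(range(4), path_edges)
cycle_bw = bandwidth(range(4), cycle_edges)
star_bw = bandwidth(range(4), star_edges)
```

The path attains bandwidth 1 by labeling vertices in order along the path, while the cycle and the star both need 2: any vertex with three neighbours, or any closed walk, forces some edge to skip a label.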

arXiv:2112.09759 [pdf, ps, other]

Stable Singularity Formation for the Inviscid Primitive Equations

Authors: Charles Collot, Slim Ibrahim, Quyuan Lin

Abstract: The primitive equations (PEs) model large scale dynamics of the oceans and the atmosphere. While it is by now well-known that the three-dimensional viscous PEs are globally well-posed in Sobolev spaces, and that there are solutions to the inviscid PEs (also called the hydrostatic Euler equations) that develop singularities in finite time, the qualitative description of the blowup still remains undiscovered. In this paper, we provide a full description of two blowup mechanisms, for a reduced PDE that is satisfied by a class of particular solutions to the PEs. In the first one a shock forms, and pressure effects are subleading, but in a critical way: they localize the singularity closer and closer to the boundary near the blow-up time (with a logarithmic in time law). This first mechanism involves a smooth blow-up profile and is stable among smooth enough solutions. In the second one the pressure effects are fully negligible; this dynamics involves a two-parameter family of non-smooth profiles, and is stable only under smoother perturbations.

Submitted 17 December, 2021; originally announced December 2021.

Comments: 29 pages

arXiv:2112.07755 [pdf, other]

Separate Exchangeability as Modeling Principle in Bayesian Nonparametrics

Authors: Giovanni Rebaudo, Qiaohui Lin, Peter Mueller

Abstract: We argue for the use of separate exchangeability as a modeling principle in Bayesian nonparametric (BNP) inference. Separate exchangeability is \emph{de facto} widely applied in the Bayesian parametric case, e.g., it naturally arises in simple mixed models. However, while in some areas, such as random graphs, separate and (closely related) joint exchangeability are widely used, it is curiously underused for several other applications in BNP. We briefly review the definition of separate exchangeability focusing on the implications of such a definition in Bayesian modeling. We then discuss two tractable classes of models that implement separate exchangeability that are the natural counterparts of familiar partially exchangeable BNP models. The first is nested random partitions for a data matrix, defining a partition of columns and nested partitions of rows, nested within column clusters. Many recent models for nested partitions implement partially exchangeable models related to variations of the well-known nested Dirichlet process. We argue that inference under such models in some cases ignores important features of the experimental setup. We obtain the separately exchangeable counterpart of such partially exchangeable partition structures. The second class is about setting up separately exchangeable priors for a nonparametric regression model when multiple sets of experimental units are involved. We highlight how a Dirichlet process mixture of linear models known as ANOVA DDP can naturally implement separate exchangeability in such regression problems. Finally, we illustrate how to perform inference under such models in two real data examples.

Submitted 20 June, 2024; v1 submitted 14 December, 2021; originally announced December 2021.

arXiv:2111.11358 [pdf, other]

A Surrogate Objective Framework for Prediction+Optimization with Soft Constraints

Authors: Kai Yan, Jie Yan, Chuan Luo, Liting Chen, Qingwei Lin, Dongmei Zhang

Abstract: Prediction+optimization is a common real-world paradigm where we have to predict problem parameters before solving the optimization problem. However, the criteria by which the prediction model is trained are often inconsistent with the goal of the downstream optimization problem. Recently, decision-focused prediction approaches, such as SPO+ and direct optimization, have been proposed to fill this gap. However, they cannot directly handle the soft constraints with the $\max$ operator required in many real-world objectives. This paper proposes a novel analytically differentiable surrogate objective framework for real-world linear and semi-definite negative quadratic programming problems with soft linear and non-negative hard constraints. This framework gives the theoretical bounds on constraints' multipliers, and derives the closed-form solution with respect to predictive parameters and thus gradients for any variable in the problem. We evaluate our method in three applications extended with soft constraints: synthetic linear programming, portfolio optimization, and resource provisioning, demonstrating that our method outperforms traditional two-staged methods and other decision-focused approaches.

Submitted 22 November, 2021; originally announced November 2021.

Comments: 32 pages; published as NeurIPS 2021 poster paper

arXiv:2108.11201 [pdf, other]

Ramsey numbers of quadrilateral versus books

Authors: Tianyu Li, Qizhong Lin, Xing Peng

Abstract: A book $B_n$ is a graph which consists of $n$ triangles sharing a common edge. In this paper, we study Ramsey numbers of quadrilateral versus books. Previous results give the exact value of $r(C_4,B_n)$ for $1\le n\le 14$. We aim to show the exact value of $r(C_4,B_n)$ for infinitely many $n$. To achieve this, we first prove that $r(C_4,B_{(m-1)^2+(t-2)})\le m^2+t$ for $m\ge4$ and $0 \leq t \leq m-1$. This improves upon a result by Faudree, Rousseau and Sheehan (1978) which states that \begin{align*} r(C_4,B_n)\le g(g(n)), \;\;\text{where}\;\;g(n)=n+\lfloor\sqrt{n-1}\rfloor+2. \end{align*} Combining the new upper bound and constructions of $C_4$-free graphs, we are able to determine the exact value of $r(C_4,B_n)$ for infinitely many $n$. As a special case, we show $r(C_4,B_{q^2-q-2}) = q^2+q-1$ for all prime powers $q\ge4$.

Submitted 25 August, 2021; originally announced August 2021.

Comments: 12 pages
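
The two upper bounds quoted in the abstract above can be compared numerically. The sketch below simply evaluates both formulas as stated there, $g(g(n))$ with $g(n)=n+\lfloor\sqrt{n-1}\rfloor+2$ against $m^2+t$ for $n=(m-1)^2+(t-2)$; the function names `old_bound` and `new_bound` are hypothetical labels introduced here, and nothing in the code proves either bound.

```python
from math import isqrt


def g(n):
    """g(n) = n + floor(sqrt(n - 1)) + 2, as given in the abstract."""
    return n + isqrt(n - 1) + 2


def old_bound(n):
    """The bound g(g(n)) on r(C_4, B_n) attributed to Faudree, Rousseau and Sheehan."""
    return g(g(n))


def new_bound(m, t):
    """Return (n, m^2 + t) with n = (m-1)^2 + (t-2), for m >= 4 and 0 <= t <= m-1."""
    if m < 4 or not 0 <= t <= m - 1:
        raise ValueError("need m >= 4 and 0 <= t <= m - 1")
    n = (m - 1) ** 2 + (t - 2)
    return n, m * m + t


# Tabulate (m, t, n, new bound, old bound) for a few small m.
rows = []
for m in range(4, 8):
    for t in range(m):
        n, bound = new_bound(m, t)
        rows.append((m, t, n, bound, old_bound(n)))
```

Working through the floor terms by hand, the two formulas agree when $t\le 2$ and the new one is smaller by exactly 1 when $t\ge 3$. Taking $m=q$ and $t=q-1$ gives $n=q^2-q-2$ and the value $q^2+q-1$, which matches the special case stated at the end of the abstract.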

arXiv:2107.04918 [pdf, ps, other]

Convergence of the Gradient Sampling Algorithm on Directionally Lipschitz Functions

Authors: James V. Burke, Qiuying Lin

Abstract: The convergence theory for the gradient sampling algorithm is extended to directionally Lipschitz functions. Although directionally Lipschitz functions are not necessarily locally Lipschitz, they are almost everywhere differentiable and well approximated by gradients and so are a natural candidate for the application of the gradient sampling algorithm. The main obstacle to this extension is the potential unboundedness or emptiness of the Clarke subdifferential at points of interest. The convergence analysis we present provides one path to addressing these issues. In particular, we recover the usual convergence theory when the function is locally Lipschitz. Moreover, if the algorithm does not drive a certain measure of criticality to zero, then the iterates must converge to a point at which either the Clarke subdifferential is empty or the direction of steepest descent is degenerate in the sense that it does not lie in the interior of the domain of the regular subderivative.

Submitted 10 July, 2021; originally announced July 2021.

MSC Class: 49J22; 65K05; 65K10; 90C26

arXiv:2105.12730 [pdf, other]

doi 10.1016/j.tpb.2021.11.003

Markov Genealogy Processes

Authors: Aaron A. King, Qianying Lin, Edward L. Ionides

Abstract: We construct a family of genealogy-valued Markov processes that are induced by a continuous-time Markov population process. We derive exact expressions for the likelihood of a given genealogy conditional on the history of the underlying population process. These lead to a nonlinear filtering equation which can be used to design efficient Monte Carlo inference algorithms. We demonstrate these calculations with several examples. Existing full-information approaches for phylodynamic inference are special cases of the theory.

Submitted 24 January, 2022; v1 submitted 26 May, 2021; originally announced May 2021.

MSC Class: 60J99

Journal ref: Theoretical Population Biology 143:77-91 (2022)

arXiv:2105.04728 [pdf, other]

doi 10.1145/3447555.3464857

Optimal Online Algorithms for Peak-Demand Reduction Maximization with Energy Storage

Authors: Yanfang Mo, Qiulin Lin, Minghua Chen, Si-Zhao Joe Qin

Abstract: The high proportions of demand charges in electric bills motivate large-power customers to leverage energy storage for reducing the peak procurement from the outer grid. Given limited energy storage, we expect to maximize the peak-demand reduction in an online fashion, challenged by the highly uncertain demands and renewable injections, the non-cumulative nature of peak consumption, and the coupling of online decisions. In this paper, we propose an optimal online algorithm that achieves the best competitive ratio, following the idea of maintaining a constant ratio between the online and the optimal offline peak-reduction performance. We further show that the optimal competitive ratio can be computed by solving a linear number of linear-fractional programs. Moreover, we extend the algorithm to adaptively maintain the best competitive ratio given the revealed inputs and actions at each decision-making round. The adaptive algorithm retains the optimal worst-case guarantee and attains improved average-case performance. We evaluate our proposed algorithms using real-world traces and show that they obtain up to 81% peak reduction of the optimal offline benchmark. Additionally, the adaptive algorithm achieves at least 20% more peak reduction against baseline alternatives.

Submitted 10 May, 2021; originally announced May 2021.

Comments: 12 pages, 6 figures

arXiv:2103.05814 [pdf, ps, other]

Ramsey numbers of large books

Authors: Xun Chen, Qizhong Lin, Chunlin You

Abstract: A book $B_n$ is a graph which consists of $n$ triangles sharing a common edge. In 1978, Rousseau and Sheehan conjectured that the Ramsey number satisfies $r(B_m,B_n)\le 2(m+n)+c$ for some constant $c>0$. In this paper, we obtain that $r(B_m, B_n)\le 2(m+n)+o(n)$ for all $m\le n$ and $n$ large, which confirms the conjecture of Rousseau and Sheehan asymptotically. As a corollary, our result implies that a related conjecture of Faudree, Rousseau and Sheehan (1982) on strongly regular graphs holds asymptotically.

Submitted 17 December, 2021; v1 submitted 9 March, 2021; originally announced March 2021.

MSC Class: 05D10

arXiv:2103.00005 [pdf, other]

Optimal Online Peak Minimization Using Energy Storage

Authors: Yanfang Mo, Qiulin Lin, Minghua Chen, Si-Zhao Joe Qin

Abstract: The significant presence of demand charges in electric bills motivates large-load customers to utilize energy storage to reduce the peak procurement from the grid. We herein study the problem of energy storage allocation for peak minimization, under the online setting where irrevocable decisions are sequentially made without knowing future demands. The problem is uniquely challenging due to (i) the coupling of online decisions across time imposed by the inventory constraints and (ii) the noncumulative nature of the peak procurement. We apply the CR-Pursuit framework and address the challenges unique to our minimization problem to design an online algorithm achieving the optimal competitive ratio (CR) among all online algorithms. We show that the optimal CR can be computed in polynomial time by solving a linear number of linear-fractional problems. More importantly, we generalize our approach to develop an \emph{anytime-optimal} online algorithm that achieves the best possible CR at any epoch, given the inputs and online decisions so far. The algorithm retains the optimal worst-case performance and attains adaptive average-case performance. Trace-driven simulations show that our algorithm can decrease the peak demand by an extra 19% compared to baseline alternatives under typical settings.

Submitted 17 September, 2022; v1 submitted 26 February, 2021; originally announced March 2021.

Showing 1–50 of 130 results for author: Lin, Q