-
Incremental Gauss--Newton Methods with Superlinear Convergence Rates
Authors:
Zhiling Zhou,
Zhuanghua Liu,
Chengchang Liu,
Luo Luo
Abstract:
This paper addresses the challenge of solving large-scale nonlinear equations with Hölder continuous Jacobians. We introduce a novel Incremental Gauss--Newton (IGN) method with an explicit superlinear convergence rate, which outperforms existing methods that achieve only a linear convergence rate. In particular, we formulate the problem as nonlinear least squares with a finite-sum structure, and our method iterates incrementally, using the information of one component in each round. We also provide a mini-batch extension of our IGN method that obtains an even faster superlinear convergence rate. Furthermore, we conduct numerical experiments to show the advantages of the proposed methods.
Submitted 3 July, 2024;
originally announced July 2024.
-
Numerical Recovery of the Diffusion Coefficient in Diffusion Equations from Terminal Measurement
Authors:
Bangti Jin,
Xiliang Lu,
Qimeng Quan,
Zhi Zhou
Abstract:
In this work, we investigate a numerical procedure for recovering a space-dependent diffusion coefficient in a (sub)diffusion model from the given terminal data, and provide a rigorous numerical analysis of the procedure. By exploiting decay behavior of the observation in time, we establish a novel Hölder type stability estimate for a large terminal time $T$. This is achieved by novel decay estimates of the (fractional) time derivative of the solution. To numerically recover the diffusion coefficient, we employ the standard output least-squares formulation with an $H^1(Ω)$-seminorm penalty, and discretize the regularized problem by the Galerkin finite element method with continuous piecewise linear finite elements in space and backward Euler convolution quadrature in time. Further, we provide an error analysis of discrete approximations, and prove a convergence rate that matches the stability estimate. The derived $L^2(Ω)$ error bound depends explicitly on the noise level, regularization parameter and discretization parameter(s), which gives a useful guideline for the a priori choice of discretization parameters with respect to the noise level in practical implementation. The error analysis is achieved using the conditional stability argument and discrete maximum-norm resolvent estimates. Several numerical experiments are also given to illustrate and complement the theoretical analysis.
Submitted 17 May, 2024;
originally announced May 2024.
-
Statistical Error of Numerical Integrators for Underdamped Langevin Dynamics with Deterministic And Stochastic Gradients
Authors:
Xuda Ye,
Zhennan Zhou
Abstract:
We propose a novel discrete Poisson equation approach to estimate the statistical error of a broad class of numerical integrators for the underdamped Langevin dynamics. The statistical error refers to the mean square error of the estimator to the exact ensemble average with a finite number of iterations. With the proposed error analysis framework, we show that when the potential function $U(x)$ is strongly convex in $\mathbb R^d$ and the numerical integrator has strong order $p$, the statistical error is $O(h^{2p}+\frac1{Nh})$, where $h$ is the time step and $N$ is the number of iterations. Besides, this approach can be adopted to analyze integrators with stochastic gradients, and quantitative estimates can be derived as well. Our approach only requires the geometric ergodicity of the continuous-time underdamped Langevin dynamics, and relaxes the constraint on the time step.
Submitted 10 May, 2024;
originally announced May 2024.
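As a reading aid for the bound $O(h^{2p}+\frac1{Nh})$ quoted in the abstract above, the following worked step balances its two terms. It is only a sketch: it assumes the hidden constants equal 1 and that $h$ may be chosen freely as a function of $N$, neither of which is stated in the abstract.

```latex
% Assumption: treat the bound as E(h) = h^{2p} + 1/(N h) with unit constants.
\[
  E(h) = h^{2p} + \frac{1}{Nh}, \qquad
  E'(h) = 2p\,h^{2p-1} - \frac{1}{N h^{2}} = 0
  \;\Longrightarrow\;
  h_{*} = (2pN)^{-\frac{1}{2p+1}} .
\]
\[
  E(h_{*}) = (2pN)^{-\frac{2p}{2p+1}} + (2p)^{\frac{1}{2p+1}} N^{-\frac{2p}{2p+1}}
           = O\!\left(N^{-\frac{2p}{2p+1}}\right).
\]
```

Under these assumptions, an integrator of strong order $p=1$ would give $h_* \sim N^{-1/3}$ and a bound of order $N^{-2/3}$.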
-
Maximum bound principle and original energy dissipation of arbitrarily high-order rescaled exponential time differencing Runge-Kutta schemes for Allen--Cahn equations
Authors:
Chaoyu Quan,
Xiaoming Wang,
Pinzhong Zheng,
Zhi Zhou
Abstract:
The energy dissipation law and the maximum bound principle are two critical physical properties of the Allen--Cahn equations. While many existing time-stepping methods are known to preserve the energy dissipation law, most apply to a modified form of energy. In this work, we demonstrate that, when the nonlinear term of the Allen--Cahn equation is Lipschitz continuous, a class of arbitrarily high-order exponential time differencing Runge--Kutta (ETDRK) schemes preserve the original energy dissipation property, under a mild step-size constraint. Additionally, we guarantee the Lipschitz condition on the nonlinear term by applying a rescaling post-processing technique, which ensures that the numerical solution unconditionally satisfies the maximum bound principle. Consequently, our proposed schemes maintain both the original energy dissipation law and the maximum bound principle and can achieve arbitrarily high-order accuracy. We also establish an optimal error estimate for the proposed schemes. Some numerical experiments are carried out to verify our theoretical results.
Submitted 29 April, 2024;
originally announced April 2024.
-
Efficient bound preserving and asymptotic preserving semi-implicit schemes for the fast reaction-diffusion system
Authors:
Yu Zhao,
Zhennan Zhou
Abstract:
We consider a special type of fast reaction-diffusion system in which the coefficients of the reaction terms of the two substances are much larger than those of the diffusion terms while the diffusive motion to the substrate is negligible. Specifically, the rate constants of the reaction terms are $O(1/ε)$ while the diffusion coefficients are $O(1)$, where the parameter $ε$ is small. When the rate constants of the reaction terms become very large, i.e., $ε$ tends to 0, the singular limit behavior of such a fast reaction-diffusion system is described by the Stefan problem with latent heat, which brings great challenges in numerical simulations. In this paper, we adopt a semi-implicit scheme, which is first-order accurate in time and can accurately approximate the interface propagation even when the reaction becomes extremely fast, that is, when the parameter $ε$ is sufficiently small. The scheme satisfies the positivity and bound preserving properties, and has $L^2$ stability and the linearized stability results of the system. For better performance in numerical simulations, we then construct a semi-implicit Runge-Kutta scheme which is second-order accurate in time. Numerous numerical tests are carried out to demonstrate the properties, such as the order of accuracy, positivity and bound preservation, and the capturing of the sharp interface with various $ε$, to simulate the dynamics of the substances and the substrate, and to explore the heat transfer process, such as solid melting or liquid solidification in two dimensions.
Submitted 29 April, 2024;
originally announced April 2024.
-
Rigorous derivation of a Hele-Shaw type model and its non-symmetric traveling wave solution
Authors:
Yu Feng,
Qingyou He,
Jian-Guo Liu,
Zhennan Zhou
Abstract:
In this paper, we consider a Hele-Shaw model that describes tumor growth subject to nutrient supply. This model was recently studied in \cite{feng2022tumor} via asymptotic analysis. Our contributions are twofold: Firstly, we provide a rigorous derivation of this Hele-Shaw model by taking the incompressible limit of the porous medium reaction-diffusion equation, which solidifies the mathematical foundations of the model. Secondly, from a bifurcation theory perspective, we prove the existence of non-symmetric traveling wave solutions to the model, which reflect the intrinsic boundary instability in tumor growth dynamics.
Submitted 25 April, 2024;
originally announced April 2024.
-
Bourgeois' contact manifolds are tight
Authors:
Russell Avdek,
Zhengyi Zhou
Abstract:
We prove that Bourgeois' contact structures on $M \times \mathbb{T}^{2}$ determined by the supporting open books of a contact manifold $(M, ξ)$ are always tight. The proof is based on a contact homology computation leveraging holomorphic foliations and Kuranishi structures.
Submitted 24 April, 2024;
originally announced April 2024.
-
Existence of weak solutions for a class of non-divergent parabolic equations with variable exponent
Authors:
Jingfeng Shao,
Zhichang Guo,
Zhongxiang Zhou
Abstract:
A doubly degenerate parabolic equation in non-divergent form with variable growth is investigated in this paper. In suitable spaces, we prove the existence of weak solutions of the equation for cases $1\leq m < 2$ and $m\geq 2$ in different ways. And we establish the non-expansion of support of the solution for the problem.
Submitted 23 April, 2024;
originally announced April 2024.
-
Classical solutions of a mean field system for pulse-coupled oscillators: long time asymptotics versus blowup
Authors:
José Antonio Carrillo,
Xu'an Dou,
Pierre Roux,
Zhennan Zhou
Abstract:
We introduce a novel reformulation of the mean-field system for pulse-coupled oscillators. It is based on writing a closed equation for the inverse distribution function associated with the probability density of oscillators with a given phase in a suitable time scale. This new framework allows us to show a hidden contraction/expansion of certain distances, leading to a full clarification of the long-time behavior, existence of steady states, rates of convergence, and finite time blow-up of classical solutions for a large class of monotone phase response functions. In the process, we get insights about the origin of obstructions to global-in-time existence and uniform in time estimates on the firing rate of the oscillators.
Submitted 21 April, 2024;
originally announced April 2024.
-
Algebraically overtwisted tight $3$-manifolds from $+1$ surgeries
Authors:
Youlin Li,
Zhengyi Zhou
Abstract:
We execute Avdek's algorithm to find many algebraically overtwisted and tight $3$-manifolds by contact $+1$ surgeries. In particular, we show that a contact $1/k$ surgery on the standard contact $3$-sphere along any positive torus knot with the maximum Thurston-Bennequin invariant yields an algebraically overtwisted and tight $3$-manifold, where $k$ is a positive integer.
Submitted 29 March, 2024;
originally announced March 2024.
-
Puzzle game: Prediction and Classification of Wordle Solution Words
Authors:
Haidong Xin,
Fang Wu,
Zhitong Zhou
Abstract:
In MCM/ICM 2023, we proposed a new result prediction model for the popular game Wordle launched by The New York Times. We first preprocessed the raw data and then established a prediction model based on ARIMA to predict the number of report results as of March 1, 2023. We selected word usage frequency, word information entropy, and the number of repeated letters contained in the word as the attributes of the word, and conducted a correlation analysis between these three attributes and the percentage of seven attempts. We also established a regression model based on the XGBoost algorithm, predicted the distribution of reported results, and predicted the correlation percentage of "EERIE". In addition, we also constructed a word classification model that classified words into "simple", "moderate", and "difficult", and explored the relationship between the three attributes and the classification results. Finally, we calculated the percentage of players in the dataset who needed 3 or more attempts for each word. The appendix provides relevant information and problems to be solved for the mathematical modeling competition.
Submitted 10 June, 2024; v1 submitted 28 March, 2024;
originally announced March 2024.
-
On the $A_α$-index of graphs with given order and dissociation number
Authors:
Zihan Zhou,
Shuchao Li
Abstract:
Given a graph $G,$ a subset of vertices is called a maximum dissociation set of $G$ if it induces a subgraph with vertex degree at most 1, and the subset has maximum cardinality. The cardinality of a maximum dissociation set is called the dissociation number of $G$. The adjacency matrix and the degree diagonal matrix of $G$ are denoted by $A(G)$ and $D(G),$ respectively. In 2017, Nikiforov proposed the $A_α$-matrix: $A_α(G)=αD(G)+(1-α)A(G),$ where $α\in[0,1].$ The largest eigenvalue of this novel matrix is called the $A_α$-index of $G.$ In this paper, we firstly determine the connected graph (resp. bipartite graph, tree) having the largest $A_α$-index over all connected graphs (resp. bipartite graphs, trees) with fixed order and dissociation number. Secondly, we describe the structure of all the $n$-vertex graphs having the minimum $A_α$-index with dissociation number $τ$, where $τ\geqslant\lceil\frac{2}{3}n\rceil.$ Finally, we identify all the connected $n$-vertex graphs with dissociation number $τ\in\{2,\lceil\frac{2}{3}n\rceil,n-1,n-2\}$ having the minimum $A_α$-index.
Submitted 27 March, 2024;
originally announced March 2024.
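The definition $A_α(G)=αD(G)+(1-α)A(G)$ quoted in the abstract above can be illustrated numerically. The sketch below is not taken from the paper: the function name `a_alpha_index` and the three-vertex path used as input are assumptions chosen for illustration, and the graph is assumed to be simple and undirected so that the matrix is symmetric.

```python
import numpy as np


def a_alpha_index(adjacency, alpha):
    """Largest eigenvalue of A_alpha(G) = alpha * D(G) + (1 - alpha) * A(G).

    `adjacency` is assumed to be the symmetric 0/1 adjacency matrix of a
    simple undirected graph (hypothetical helper, for illustration only).
    """
    A = np.asarray(adjacency, dtype=float)
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Degree diagonal matrix D(G): row sums of the adjacency matrix.
    D = np.diag(A.sum(axis=1))
    A_alpha = alpha * D + (1.0 - alpha) * A
    # The matrix is symmetric, so eigvalsh returns real eigenvalues in
    # ascending order; the last one is the A_alpha-index.
    return float(np.linalg.eigvalsh(A_alpha)[-1])


# Path on three vertices: 0 - 1 - 2.
path3 = [[0, 1, 0],
         [1, 0, 1],
         [0, 1, 0]]

index_adjacency = a_alpha_index(path3, 0.0)  # spectral radius of A(G)
index_half = a_alpha_index(path3, 0.5)       # half the signless Laplacian index
index_degree = a_alpha_index(path3, 1.0)     # maximum degree
```

At $α=0$ the index reduces to the adjacency spectral radius, and at $α=1$ it reduces to the maximum degree, which makes the two endpoints convenient sanity checks.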
-
A fast low-rank inversion algorithm of dielectric matrix in GW approximation
Authors:
Zhengbang Zhou,
Huanhuan Ma,
Wentiao Wu,
Weiguo Gao,
Jinlong Yang,
Meiyue Shao,
Wei Hu
Abstract:
The dielectric response function and its inverse are crucial physical quantities in materials science. We propose an accurate and efficient strategy to invert the dielectric function matrix. The GW approximation, a powerful approach to accurately describe many-body excited states, is taken as an application to demonstrate accuracy and efficiency. We combine the interpolative separable density fitting (ISDF) algorithm with the Sherman--Morrison--Woodbury (SMW) formula to accelerate the inversion process by exploiting low-rank properties of the dielectric function in plane-wave GW calculations. Our ISDF--SMW strategy produces accurate quasiparticle energies with $O(N_{\mathrm{r}}N_{\mathrm{e}}^2)$ computational cost ($N_{\mathrm{e}}$ is the number of electrons and $N_{\mathrm{r}}=100$--$1000N_{\mathrm{e}}$ is the number of grid points) with a negligibly small error of $0.03$ eV for both complex molecules and solids. This new strategy for inverting the dielectric matrix can be $50\times$ faster than the current state-of-the-art implementation in BerkeleyGW, resulting in two orders of magnitude speedup for total GW calculations.
Submitted 12 June, 2024; v1 submitted 18 March, 2024;
originally announced March 2024.
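The Sherman--Morrison--Woodbury formula mentioned in the abstract above can be sketched on a generic matrix of the form identity plus low-rank term. This is not the paper's ISDF--SMW implementation: the factors `U` and `V`, their sizes, and the function name are assumptions for illustration, and the small $k\times k$ matrix is assumed to be nonsingular.

```python
import numpy as np


def inverse_identity_plus_lowrank(U, V):
    """Invert I_n + U V^T via the Sherman--Morrison--Woodbury identity.

    (I_n + U V^T)^{-1} = I_n - U (I_k + V^T U)^{-1} V^T,
    so only a k x k system is solved instead of an n x n one.
    U and V are assumed to be n x k arrays with k much smaller than n.
    """
    n, k = U.shape
    small = np.eye(k) + V.T @ U  # k x k matrix, assumed nonsingular
    return np.eye(n) - U @ np.linalg.solve(small, V.T)


# Hypothetical test data: a random rank-5 perturbation of the identity.
rng = np.random.default_rng(0)
n, k = 200, 5
U = 0.1 * rng.standard_normal((n, k))
V = 0.1 * rng.standard_normal((n, k))

direct = np.linalg.inv(np.eye(n) + U @ V.T)  # dense n x n inversion
fast = inverse_identity_plus_lowrank(U, V)   # low-rank route
max_abs_diff = float(np.max(np.abs(direct - fast)))
```

The low-rank route replaces an $n\times n$ inversion with a $k\times k$ solve plus matrix products, which is the general reason such formulas pay off when the rank $k$ is small relative to $n$.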
-
On the Convergence of A Data-Driven Regularized Stochastic Gradient Descent for Nonlinear Ill-Posed Problems
Authors:
Zehui Zhou
Abstract:
Stochastic gradient descent (SGD) is a promising method for solving large-scale inverse problems, due to its excellent scalability with respect to data size. In this work, we analyze a new data-driven regularized stochastic gradient descent for the efficient numerical solution of a class of nonlinear ill-posed inverse problems in infinite dimensional Hilbert spaces. At each step of the iteration, the method randomly selects one equation from the nonlinear system combined with a corresponding equation from the learned system based on training data to obtain a stochastic estimate of the gradient and then performs a descent step with the estimated gradient. We prove the regularizing property of this method under the tangential cone condition and a priori parameter choice and then derive the convergence rates under the additional source condition and range invariance conditions. Several numerical experiments are provided to complement the analysis.
Submitted 18 March, 2024;
originally announced March 2024.
-
Inverse Coefficient Problem for One-Dimensional Subdiffusion with Data on Disjoint Sets in Time
Authors:
Siyu Cen,
Bangti Jin,
Yavar Kian,
Eric Soccorsi,
Rachid Zarouf,
Zhi Zhou
Abstract:
In this work we investigate an inverse coefficient problem for the one-dimensional subdiffusion model, which involves a Caputo fractional derivative in time. The inverse problem is to determine two coefficients and multiple parameters (the order, and length of the interval) from one pair of lateral Cauchy data. The lateral Cauchy data are given on disjoint sets in time with a single excitation and the measurement is made on a time sequence located outside the support of the excitation. We prove two uniqueness results for different lateral Cauchy data. The analysis is based on the solution representation, analyticity of the observation and a refined version of inverse Sturm-Liouville theory due to Sini [35]. Our results heavily exploit the memory effect of fractional diffusion for the unique recovery of the coefficients in the model. Several numerical experiments are also presented to complement the analysis.
Submitted 18 March, 2024;
originally announced March 2024.
-
On the Last-Iterate Convergence of Shuffling Gradient Methods
Authors:
Zijian Liu,
Zhengyuan Zhou
Abstract:
Shuffling gradient methods are widely used in modern machine learning tasks and include three popular implementations: Random Reshuffle (RR), Shuffle Once (SO), and Incremental Gradient (IG). Compared to the empirical success, the theoretical guarantee of shuffling gradient methods was not well-understood for a long time. Until recently, the convergence rates had just been established for the average iterate for convex functions and the last iterate for strongly convex problems (using squared distance as the metric). However, when using the function value gap as the convergence criterion, existing theories cannot interpret the good performance of the last iterate in different settings (e.g., constrained optimization). To bridge this gap between practice and theory, we prove the first last-iterate convergence rates for shuffling gradient methods with respect to the objective value even without strong convexity. Our new results either (nearly) match the existing last-iterate lower bounds or are as fast as the previous best upper bounds for the average iterate.
Submitted 5 June, 2024; v1 submitted 12 March, 2024;
originally announced March 2024.
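The three implementations named in the abstract above (Random Reshuffle, Shuffle Once, Incremental Gradient) differ only in how the component order is chosen in each epoch. The sketch below shows that difference on a toy one-dimensional problem. The function name, the quadratic components, the constant step size, and the epoch count are all assumptions for illustration, not the setting analyzed in the paper.

```python
import numpy as np


def shuffling_gradient(grads, x0, step, epochs, scheme, seed=0):
    """Run a shuffling gradient method and return the last iterate.

    scheme: "RR" draws a new permutation every epoch,
            "SO" draws one permutation and reuses it,
            "IG" always uses the fixed order 0, 1, ..., n-1.
    """
    if scheme not in ("RR", "SO", "IG"):
        raise ValueError("scheme must be 'RR', 'SO' or 'IG'")
    rng = np.random.default_rng(seed)
    n = len(grads)
    order = np.arange(n)
    if scheme == "SO":
        order = rng.permutation(n)
    x = float(x0)
    for _ in range(epochs):
        if scheme == "RR":
            order = rng.permutation(n)
        # One pass over all components in the chosen order.
        for i in order:
            x -= step * grads[i](x)
    return x


# Toy finite sum: f_i(x) = 0.5 * (x - a_i)^2, whose average is minimized
# at the mean of the a_i (here 2.5).
targets = [1.0, 2.0, 3.0, 4.0]
grads = [lambda x, a=a: x - a for a in targets]

last_iterates = {
    scheme: shuffling_gradient(grads, x0=0.0, step=0.01, epochs=500, scheme=scheme)
    for scheme in ("RR", "SO", "IG")
}
```

With a constant step size the last iterate settles near, but not exactly at, the minimizer; the gap depends on the step size and on the order in which components are visited.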
-
Stochastic contextual bandits with graph feedback: from independence number to MAS number
Authors:
Yuxiao Wen,
Yanjun Han,
Zhengyuan Zhou
Abstract:
We consider contextual bandits with graph feedback, a class of interactive learning problems with richer structures than vanilla contextual bandits, where taking an action reveals the rewards for all neighboring actions in the feedback graph under all contexts. Unlike the multi-armed bandits setting where a growing literature has painted a near-complete understanding of graph feedback, much remains unexplored in the contextual bandits counterpart. In this paper, we make inroads into this inquiry by establishing a regret lower bound $Ω(\sqrt{β_M(G) T})$, where $M$ is the number of contexts, $G$ is the feedback graph, and $β_M(G)$ is our proposed graph-theoretical quantity that characterizes the fundamental learning limit for this class of problems. Interestingly, $β_M(G)$ interpolates between $α(G)$ (the independence number of the graph) and $\mathsf{m}(G)$ (the maximum acyclic subgraph (MAS) number of the graph) as the number of contexts $M$ varies. We also provide algorithms that achieve near-optimal regrets for important classes of context sequences and/or feedback graphs, such as transitively closed graphs that find applications in auctions and inventory control. In particular, with many contexts, our results show that the MAS number completely characterizes the statistical complexity for contextual bandits, as opposed to the independence number in multi-armed bandits.
Submitted 12 February, 2024;
originally announced February 2024.
-
A degeneration formula of Donaldson-Thomas theory on Calabi-Yau 4-folds
Authors:
Yalong Cao,
Gufang Zhao,
Zijun Zhou
Abstract:
We prove a degeneration formula for Donaldson-Thomas theory on Calabi-Yau 4-folds, and apply it to compute zero dimensional invariants on $\mathbb{C}^4$ and on any local curve.
Submitted 25 February, 2024;
originally announced February 2024.
-
When Should you Offer an Upgrade: Online Upgrading Mechanisms for Resource Allocation
Authors:
Patrick Jaillet,
Chara Podimata,
Andrew Vakhutinsky,
Zijie Zhou
Abstract:
In this work, we study an upgrading scheme for online resource allocation problems. We work in a sequential setting, where at each round a request for a resource arrives and the decision-maker has to decide whether to accept it (and thus, offer the resource) or reject it. The resources are ordered in terms of their value. If the decision-maker decides to accept the request, they can offer an upgrade-for-a-fee to the next more valuable resource. This fee is dynamically decided based on the currently available resources. After the upgrade-for-a-fee option is presented to the requester, they can either accept it, get upgraded, and pay the additional fee, or reject it and maintain their originally allocated resource.
We take the perspective of the decision-maker and wish to design upgrading mechanisms in a way that simultaneously maximizes revenue and minimizes underutilization of resources. Both of these desiderata are encapsulated in a notion of regret that we define, and according to which we measure our algorithms' performance. We present a fast algorithm that achieves O(log T) regret. Finally, we implemented our algorithm utilizing data akin to those observed in the hospitality industry and estimated our upgrading mechanism would increase the annual revenue by over 17%.
Submitted 13 February, 2024;
originally announced February 2024.
-
Grace Period is All You Need: Individual Fairness without Revenue Loss in Revenue Management
Authors:
Patrick Jaillet,
Chara Podimata,
Zijie Zhou
Abstract:
Imagine you and a friend purchase identical items at a store, yet only your friend received a discount. Would your friend's discount make you feel unfairly treated by the store? And would you be less willing to purchase from that store again in the future? Based on a large-scale online survey that we ran on Prolific, it turns out that the answers to the above questions are positive. Motivated by these findings, in this work we propose a notion of individual fairness in online revenue management and an algorithmic module (called "Grace Period") that can be embedded in traditional revenue management algorithms and guarantee individual fairness. Specifically, we show how to embed the Grace Period in five common revenue management algorithms including Deterministic Linear Programming with Probabilistic Assignment, Resolving Deterministic Linear Programming with Probabilistic Assignment, Static Bid Price Control, Booking Limit, and Nesting, thus covering both stochastic and adversarial customer arrival settings. Embedding the Grace Period does not incur additional regret for any of these algorithms. This finding indicates that there is no tradeoff between a seller maximizing their revenue and guaranteeing that each customer feels fairly treated.
Submitted 17 May, 2024; v1 submitted 13 February, 2024;
originally announced February 2024.
-
Uncertainty Quantification of Phase Transition Problems with an Injection Boundary
Authors:
Zhenyi Zhang,
Shengbo Ma,
Zhennan Zhou
Abstract:
We develop an enthalpy-based modeling and computational framework to quantify uncertainty in Stefan problems with an injection boundary. Inspired by airfoil icing studies, we consider a system featuring an injection boundary inducing domain changes and a free boundary separating phases, resulting in two types of moving boundaries. Our proposed enthalpy-based formulation seamlessly integrates thermal diffusion across the domain with energy fluxes at the boundaries, addressing a modified injection condition for boundary movement. Uncertainty then stems from random variations in the injection boundary. The primary focus of our Uncertainty Quantification (UQ) centers on investigating the effects of uncertainty on free boundary propagation. Through mapping to a reference domain, we derive an enthalpy-based numerical scheme tailored to the transformed coordinate system, facilitating a simple and efficient simulation. Numerical and UQ studies in one and two dimensions validate the proposed model and the extended enthalpy method. They offer intriguing insights into ice accretion and other multiphysics processes involving phase transitions.
Submitted 5 February, 2024;
originally announced February 2024.
-
Binomial-tree approximation for time-inconsistent stopping
Authors:
Erhan Bayraktar,
Zhenhua Wang,
Zhou Zhou
Abstract:
For time-inconsistent stopping in a one-dimensional diffusion setup, we investigate how to use discrete-time models to approximate the original problem. In particular, we consider the value function $V(\cdot)$ induced by all mild equilibria in the continuous-time problem, as well as the value $V^h(\cdot)$ associated with the equilibria in a binomial-tree setting with time step size $h$. We show that $\lim_{h\rightarrow 0+} V^h \leq V$. We provide an example showing that the exact convergence may fail. Then we relax the set of equilibria and consider the value $V^h_{\varepsilon}(\cdot)$ induced by $\varepsilon$-equilibria in the binomial-tree model. We prove that $\lim_{\varepsilon \rightarrow 0+}\lim_{h \rightarrow 0+}V^h_{\varepsilon} = V$.
Submitted 2 February, 2024;
originally announced February 2024.
-
On the partial autocorrelation function for locally stationary time series: characterization, estimation and inference
Authors:
Xiucai Ding,
Zhou Zhou
Abstract:
For stationary time series, it is common to use the plots of partial autocorrelation function (PACF) or PACF-based tests to explore the temporal dependence structure of such processes. To our best knowledge, such analogs for non-stationary time series have not been fully established yet. In this paper, we fill this gap for locally stationary time series with short-range dependence. First, we characterize the PACF locally in the time domain and show that the $j$th PACF, denoted as $ρ_{j}(t),$ decays with $j$ whose rate is adaptive to the temporal dependence of the time series $\{x_{i,n}\}$. Second, at time $i,$ we justify that the PACF $ρ_j(i/n)$ can be efficiently approximated by the best linear prediction coefficients via the Yule-Walker's equations. This allows us to study the PACF via ordinary least squares (OLS) locally. Third, we show that the PACF is smooth in time for locally stationary time series. We use the sieve method with OLS to estimate $ρ_j(\cdot)$ and construct some statistics to test the PACFs and infer the structures of the time series. These tests generalize and modify those used for stationary time series. Finally, a multiplier bootstrap algorithm is proposed for practical implementation and an $\mathtt R$ package $\mathtt {Sie2nts}$ is provided to implement our algorithm. Numerical simulations and real data analysis also confirm usefulness of our results.
Submitted 30 January, 2024; v1 submitted 28 January, 2024;
originally announced January 2024.
-
Variational formulation for stratified steady water wave in two-layer flows
Authors:
Yuchao He,
Yonghui Xia,
Zhe Zhou
Abstract:
In this paper, the variational formulation for steady periodic stratified water waves in two-layer flows is given. The critical points of a natural energy functional is proved to be the solutions of the governing equations. And the second variation of the functional is also presented.
Submitted 23 March, 2024; v1 submitted 10 January, 2024;
originally announced January 2024.
-
High-order BDF convolution quadrature for stochastic fractional evolution equations driven by integrated additive noise
Authors:
Minghua Chen,
Jiankang Shi,
Zhen Song,
Yubin Yan,
Zhi Zhou
Abstract:
The numerical analysis of stochastic time fractional evolution equations presents considerable challenges due to the limited regularity of the model caused by the nonlocal operator and the presence of noise.
The existing time-stepping methods exhibit a significantly low order convergence rate. In this work, we introduce a smoothing technique and develop the novel high-order schemes for solving the linear stochastic fractional evolution equations driven by integrated additive noise. Our approach involves regularizing the additive noise through an $m$-fold integral-differential calculus, and discretizing the equation using the $k$-step BDF convolution quadrature. This novel method, which we refer to as the ID$m$-BDF$k$ method, is able to achieve higher-order convergence in solving the stochastic models. Our theoretical analysis reveals that the convergence rate of the ID$2$-BDF2 method is $O(τ^{α+ γ-1/2})$ for $1< α+ γ\leq 5/2$, and $O(τ^{2})$ for $5/2< α+ γ<3$, where $α\in (1, 2)$ and $γ\in (0, 1)$ denote the time fractional order and the order of the integrated noise, respectively. Furthermore, this convergence rate could be improved to $O(τ^{α+ γ-1/2})$ for any $α\in (1, 2)$ and $γ\in (0, 1)$, if we employ the ID$3$-BDF3 method. The argument could be easily extended to the subdiffusion model with $α\in (0, 1)$. Numerical examples are provided to support and complement the theoretical findings.
Submitted 19 January, 2024;
originally announced January 2024.
-
Artificial Intelligence for Operations Research: Revolutionizing the Operations Research Process
Authors:
Zhenan Fan,
Bissan Ghaddar,
Xinglu Wang,
Linzi Xing,
Yong Zhang,
Zirui Zhou
Abstract:
The rapid advancement of artificial intelligence (AI) techniques has opened up new opportunities to revolutionize various fields, including operations research (OR). This survey paper explores the integration of AI within the OR process (AI4OR) to enhance its effectiveness and efficiency across multiple stages, such as parameter generation, model formulation, and model optimization. By providing a comprehensive overview of the state-of-the-art and examining the potential of AI to transform OR, this paper aims to inspire further research and innovation in the development of AI-enhanced OR methods and tools. The synergy between AI and OR is poised to drive significant advancements and novel solutions in a multitude of domains, ultimately leading to more effective and efficient decision-making.
Submitted 26 March, 2024; v1 submitted 6 January, 2024;
originally announced January 2024.
-
Composition method for chromatic symmetric functions: Neat noncommutative analogs
Authors:
David G. L. Wang,
James Z. F. Zhou
Abstract:
This work is inspired by Shareshian and Wachs's exquisite formula for the chromatic symmetric function of paths. We develop a composition method to unearth neat noncommutative analogs of chromatic symmetric functions. A symmetric function is $e$-positive if and only if it has a $Λ$-positive noncommutative analog. We bring to light short and sweet $Λ$-positive noncommutative analogs for the chromatic symmetric functions of tadpoles and barbells. Using these elegant formulas and the composition method, we discover a new family of $e$-positive graphs and call it hat graphs, which are the unicyclic graphs obtained by adding an edge to a path. We also obtain a compact ribbon Schur analog for cycles.
Submitted 10 January, 2024; v1 submitted 1 January, 2024;
originally announced January 2024.
-
Mixed virtual element methods for elliptic optimal control problems with boundary observations in $L^2(Γ)$
Authors:
Minghui Yang,
Zhaojie Zhou
Abstract:
In this paper we study the mixed virtual element approximation to an elliptic optimal control problem with boundary observations. The objective functional of this type of optimal control problem contains the outward normal derivatives of the state variable on the boundary, which reduces the regularity of solutions to the optimal control problems. We construct the mixed virtual element discrete scheme and derive a priori error estimate for the optimal control problem based on the variational discretization for the control variable. Numerical experiments are carried out on different meshes to support our theoretical findings.
Submitted 18 December, 2023;
originally announced December 2023.
-
Oriented Supersingular Elliptic Curves and Eichler Orders
Authors:
Guanju Xiao,
Zijian Zhou,
Longjiang Qu
Abstract:
Let $p>3$ be a prime and $E$ be a supersingular elliptic curve defined over $\mathbb{F}_{p^2}$. Let $c$ be a prime with $c < 3p/16$ and $G$ be a subgroup of $E[c]$ of order $c$. The pair $(E,G)$ is called a supersingular elliptic curve with level-$c$ structure, and the endomorphism ring $\text{End}(E,G)$ is isomorphic to an Eichler order with level $c$. We construct two kinds of Eichler orders $\mathcal{O}_c(q,r)$ and $\mathcal{O}'_c(q,r')$ with level $c$. Interestingly, we can reduce each $\mathcal{O}_c(q,r)$ or $\mathcal{O}'_c(q,r')$ to a primitive reduced binary quadratic form with discriminant $-16cp$ or $-cp$ respectively. If the curve $E$ is $\mathbb{Z}[\sqrt{-cp}]$-oriented, then we prove that $\text{End}(E,G)$ is isomorphic to $\mathcal{O}_c(q,r)$ or $\mathcal{O}'_c(q,r')$. Due to the fact that $\mathbb{Z}[\sqrt{-cp}]$-oriented isogenies between $\mathbb{Z}[\sqrt{-cp}]$-oriented elliptic curves could be represented by quadratic forms, we show that these isogenies are reflected in the corresponding Eichler orders via the composition law for their corresponding quadratic forms.
Submitted 14 December, 2023;
originally announced December 2023.
-
Revisiting the Last-Iterate Convergence of Stochastic Gradient Methods
Authors:
Zijian Liu,
Zhengyuan Zhou
Abstract:
In the past several years, the last-iterate convergence of the Stochastic Gradient Descent (SGD) algorithm has triggered people's interest due to its good performance in practice but lack of theoretical understanding. For Lipschitz convex functions, different works have established the optimal $O(\log(1/δ)\log T/\sqrt{T})$ or $O(\sqrt{\log(1/δ)/T})$ high-probability convergence rates for the final iterate, where $T$ is the time horizon and $δ$ is the failure probability. However, to prove these bounds, all the existing works are either limited to compact domains or require almost surely bounded noises. It is natural to ask whether the last iterate of SGD can still guarantee the optimal convergence rate but without these two restrictive assumptions. Besides this important question, there are still lots of theoretical problems lacking an answer. For example, compared with the last-iterate convergence of SGD for non-smooth problems, only few results for smooth optimization have yet been developed. Additionally, the existing results are all limited to a non-composite objective and the standard Euclidean norm. It still remains unclear whether the last-iterate convergence can be provably extended to wider composite optimization and non-Euclidean norms. In this work, to address the issues mentioned above, we revisit the last-iterate convergence of stochastic gradient methods and provide the first unified way to prove the convergence rates both in expectation and in high probability to accommodate general domains, composite objectives, non-Euclidean norms, Lipschitz conditions, smoothness, and (strong) convexity simultaneously. Additionally, we extend our analysis to obtain the last-iterate convergence under heavy-tailed noises.
Submitted 11 March, 2024; v1 submitted 13 December, 2023;
originally announced December 2023.
-
Fast Sampling for Linear Inverse Problems of Vectors and Tensors using Multilinear Extensions
Authors:
Hao Li,
Dong Liang,
Zixi Zhou,
Zheng Xie
Abstract:
This paper studies the problem of sampling vector and tensor signals, which is the process of choosing sites in vectors and tensors to place sensors for better recovery. A small core tensor and multiple factor matrices can be used to sparsely represent a dense higher-order tensor within a linear model. Using this linear model, one can effectively recover the whole signals from a limited number of measurements by solving linear inverse problems (LIPs). By providing the closed-form expressions of multilinear extensions for the frame potential of pruned matrices, we develop an algorithm named fast Frank-Wolfe algorithm (FFW) for sampling vectors and tensors with low complexity. We provide the approximation factor of our proposed algorithm for the factor matrices that are non-orthogonal and have elements of the same sign in each row. Moreover, we conduct experiments to verify the higher performance and lower complexity of our proposed algorithm for general factor matrix. Finally, we demonstrate that sampling by FFW and reconstruction by least squares methods yield better results for image data compared to convCNP completion with random sampling.
Submitted 2 July, 2024; v1 submitted 3 December, 2023;
originally announced December 2023.
-
Learning Coarse Propagators in Parareal Algorithm
Authors:
Bangti Jin,
Qingle Lin,
Zhi Zhou
Abstract:
The parareal algorithm represents an important class of parallel-in-time algorithms for solving evolution equations and has been widely applied in practice. To achieve effective speedup, the choice of the coarse propagator in the algorithm is vital. In this work, we investigate the use of learned coarse propagators. Building upon the error estimation framework, we present a systematic procedure for constructing coarse propagators that enjoy desirable stability and consistent order. Additionally, we provide preliminary mathematical guarantees for the resulting parareal algorithm. Numerical experiments on a variety of settings, e.g., linear diffusion model, Allen-Cahn model, and viscous Burgers model, show that learning can significantly improve parallel efficiency when compared with the more ad hoc choice of some conventional and widely used coarse propagators.
Submitted 26 November, 2023;
originally announced November 2023.
-
On the Foundation of Distributionally Robust Reinforcement Learning
Authors:
Shengbo Wang,
Nian Si,
Jose Blanchet,
Zhengyuan Zhou
Abstract:
Motivated by the need for a robust policy in the face of environment shifts between training and the deployment, we contribute to the theoretical foundation of distributionally robust reinforcement learning (DRRL). This is accomplished through a comprehensive modeling framework centered around distributionally robust Markov decision processes (DRMDPs). This framework obliges the decision maker to choose an optimal policy under the worst-case distributional shift orchestrated by an adversary. By unifying and extending existing formulations, we rigorously construct DRMDPs that embraces various modeling attributes for both the decision maker and the adversary. These attributes include adaptability granularity, exploring history-dependent, Markov, and Markov time-homogeneous decision maker and adversary dynamics. Additionally, we delve into the flexibility of shifts induced by the adversary, examining SA and S-rectangularity. Within this DRMDP framework, we investigate conditions for the existence or absence of the dynamic programming principle (DPP). From an algorithmic standpoint, the existence of DPP holds significant implications, as the vast majority of existing data and computationally efficiency RL algorithms are reliant on the DPP. To study its existence, we comprehensively examine combinations of controller and adversary attributes, providing streamlined proofs grounded in a unified methodology. We also offer counterexamples for settings in which a DPP with full generality is absent.
Submitted 19 January, 2024; v1 submitted 15 November, 2023;
originally announced November 2023.
-
An operator-splitting optimization approach for phase-field simulation of equilibrium shapes of crystals
Authors:
Zeyu Zhou,
Wen Huang,
Wei Jiang,
Zhen Zhang
Abstract:
Computing equilibrium shapes of crystals (ESC) is a challenging problem in materials science that involves minimizing an orientation-dependent (i.e., anisotropic) surface energy functional subject to a prescribed mass constraint. The highly nonlinear and singular anisotropic terms in the problem make it very challenging from both the analytical and numerical aspects. Especially, when the strength of anisotropy is very strong (i.e., strongly anisotropic cases), the ESC will form some singular, sharp corners even if the surface energy function is smooth. Traditional numerical approaches, such as the $H^{-1}$ gradient flow, are unable to produce true sharp corners due to the necessary addition of a high-order regularization term that penalizes sharp corners and rounds them off. In this paper, we propose a new numerical method based on the Davis-Yin splitting (DYS) optimization algorithm to predict the ESC instead of using gradient flow approaches. We discretize the infinite-dimensional phase-field energy functional in the absence of regularization terms and transform it into a finite-dimensional constraint minimization problem. The resulting optimization problem is solved using the DYS method which automatically guarantees the mass-conservation and bound-preserving properties. We also prove the global convergence of the proposed algorithm. These desired properties are numerically observed. In particular, the proposed method can produce real sharp corners with satisfactory accuracy. Finally, we present numerous numerical results to demonstrate that the ESC can be well simulated under different types of anisotropic surface energies, which also confirms the effectiveness and efficiency of the proposed method.
Submitted 6 November, 2023;
originally announced November 2023.
-
Numerical Recovery of a Time-Dependent Potential in Subdiffusion
Authors:
Bangti Jin,
Kwancheol Shin,
Zhi Zhou
Abstract:
In this work we investigate an inverse problem of recovering a time-dependent potential in a semilinear subdiffusion model from an integral measurement of the solution over the domain. The model involves the Djrbashian--Caputo fractional derivative in time. Theoretically, we prove a novel conditional Lipschitz stability result, and numerically, we develop an easy-to-implement fixed point iteration for recovering the unknown coefficient. In addition, we establish rigorous error bounds on the discrete approximation. These results are obtained by crucially using smoothing properties of the solution operators and suitable choice of a weighted $L^p(0,T)$ norm. The efficiency and accuracy of the scheme are showcased on several numerical experiments in one- and two-dimensions.
Submitted 4 November, 2023;
originally announced November 2023.
-
Asymptotic analysis for Bloch electrons with Weyl nodes
Authors:
Jianfeng Lu,
Changhe Yang,
Zhennan Zhou
Abstract:
In this paper, we study the semiclassical behavior of Bloch electrons in the presence of Weyl nodes, which are singular points in the band structure of certain materials. We carry out asymptotic analysis and present a rigorous derivation of the semiclassical asymptotic expansion of the current of Bloch electrons with the presence of Weyl nodes. The analysis shows that the current contains two parts, one independent of the Weyl nodes and the other a contribution from the singular points. This work provides a theoretical foundation towards a rigorous justification of recent scientific discoveries in Weyl semimetals. The main innovation of this paper is a new strategy to deal with the singular points with quantitative estimates, which may have broader applications in multiscale models with singularities.
Submitted 4 November, 2023;
originally announced November 2023.
-
Prompt Engineering Through the Lens of Optimal Control
Authors:
Yifan Luo,
Yiming Tang,
Chengfeng Shen,
Zhennan Zhou,
Bin Dong
Abstract:
Prompt Engineering (PE) has emerged as a critical technique for guiding Large Language Models (LLMs) in solving intricate tasks. Its importance is highlighted by its potential to significantly enhance the efficiency and effectiveness of human-machine interaction. As tasks grow increasingly complex, recent advanced PE methods have extended beyond the limitations of single-round interactions to embrace multi-round interactions, which allows for a deeper and more nuanced engagement with LLMs. In this paper, we propose an optimal control framework tailored for multi-round interactions with LLMs. This framework provides a unified mathematical structure that not only systematizes the existing PE methods but also sets the stage for rigorous analytical improvements. Furthermore, we extend this framework to include PE via ensemble methods and multi-agent collaboration, thereby enlarging the scope of applicability. By adopting an optimal control perspective, we offer fresh insights into existing PE methods and highlight theoretical challenges that warrant future research. Besides, our work lays a foundation for the development of more effective and interpretable PE methods.
Submitted 3 November, 2023; v1 submitted 22 October, 2023;
originally announced October 2023.
-
Adaptive, Doubly Optimal No-Regret Learning in Strongly Monotone and Exp-Concave Games with Gradient Feedback
Authors:
Michael I. Jordan,
Tianyi Lin,
Zhengyuan Zhou
Abstract:
Online gradient descent (OGD) is well known to be doubly optimal under strong convexity or monotonicity assumptions: (1) in the single-agent setting, it achieves an optimal regret of $Θ(\log T)$ for strongly convex cost functions; and (2) in the multi-agent setting of strongly monotone games, with each agent employing OGD, we obtain last-iterate convergence of the joint action to a unique Nash equilibrium at an optimal rate of $Θ(\frac{1}{T})$. While these finite-time guarantees highlight its merits, OGD has the drawback that it requires knowing the strong convexity/monotonicity parameters. In this paper, we design a fully adaptive OGD algorithm, \textsf{AdaOGD}, that does not require a priori knowledge of these parameters. In the single-agent setting, our algorithm achieves $O(\log^2(T))$ regret under strong convexity, which is optimal up to a log factor. Further, if each agent employs \textsf{AdaOGD} in strongly monotone games, the joint action converges in a last-iterate sense to a unique Nash equilibrium at a rate of $O(\frac{\log^3 T}{T})$, again optimal up to log factors. We illustrate our algorithms in a learning version of the classical newsvendor problem, where due to lost sales, only (noisy) gradient feedback can be observed. Our results immediately yield the first feasible and near-optimal algorithm for both the single-retailer and multi-retailer settings. We also extend our results to the more general setting of exp-concave cost functions and games, using the online Newton step (ONS) algorithm.
Submitted 28 March, 2024; v1 submitted 21 October, 2023;
originally announced October 2023.
-
Simultaneous Nonparametric Inference of M-regression under Complex Temporal Dynamics
Authors:
Miaoshiqi Liu,
Zhou Zhou
Abstract:
The paper considers simultaneous nonparametric inference for a wide class of M-regression models with time-varying coefficients. The covariates and errors of the regression model are tackled as a general class of nonstationary time series and are allowed to be cross-dependent. We construct $\sqrt{n}$-consistent inference for the cumulative regression function, whose limiting properties are disclosed using Bahadur representation and Gaussian approximation theory. A simple and unified self-convolved bootstrap procedure is proposed. With only one tuning parameter, the bootstrap consistently simulates the desired limiting behavior of the M-estimators under complex temporal dynamics, even under the possible presence of breakpoints in time series. Our methodology leads to a unified framework to conduct general classes of Exact Function Tests, Lack-of-fit Tests, and Qualitative Tests for the time-varying coefficients under complex temporal dynamics. These tests enable one to, among many others, conduct variable selection procedures, check for constancy and linearity, as well as verify shape assumptions, including monotonicity and convexity. As applications, our method is utilized to study the time-varying properties of global climate data and Microsoft stock return, respectively.
Submitted 26 February, 2024; v1 submitted 18 October, 2023;
originally announced October 2023.
-
A Parallel-in-time Method Based on Preconditioner for Biot's Model
Authors:
Zeyuan Zhou,
Huipeng Gu,
Guoliang Ju,
Wei Xing
Abstract:
We proposed a parallel-in-time method based on preconditioner for Biot's consolidation model in poroelasticity. In order to achieve a fast and stable convergence for the matrix system of the Biot's model, we design two preconditioners with approximations of the Schur complement. The parallel-in-time method employs an inverted time-stepping scheme that iterates to solve the preconditioned linear system in the outer loop and advances the time step in the inner loop. This allows us to parallelize the iterations with a theoretical parallel efficiency that approaches 1 as the number of time steps and spatial steps grows. We demonstrate the stability, accuracy, and linear speedup of our method on HPC platform through numerical experiments.
Submitted 16 October, 2023;
originally announced October 2023.
-
Time integration schemes based on neural networks for solving partial differential equations on coarse grids
Authors:
Xinxin Yan,
Zhideng Zhou,
Xiaohan Cheng,
Xiaolei Yang
Abstract:
The accuracy of solving partial differential equations (PDEs) on coarse grids is greatly affected by the choice of discretization schemes. In this work, we propose to learn time integration schemes based on neural networks which satisfy three distinct sets of mathematical constraints, i.e., unconstrained, semi-constrained with the root condition, and fully-constrained with both root and consistency conditions. We focus on the learning of 3-step linear multistep methods, which we subsequently apply to solve three model PDEs, i.e., the one-dimensional heat equation, the one-dimensional wave equation, and the one-dimensional Burgers' equation. The results show that the prediction error of the learned fully-constrained scheme is close to that of the Runge-Kutta method and Adams-Bashforth method. Compared to the traditional methods, the learned unconstrained and semi-constrained schemes significantly reduce the prediction error on coarse grids. On a grid that is 4 times coarser than the reference grid, the mean square error shows a reduction of up to an order of magnitude for some of the heat equation cases, and a substantial improvement in phase prediction for the wave equation. On a 32 times coarser grid, the mean square error for the Burgers' equation can be reduced by up to 35% to 40%.
Submitted 16 October, 2023;
originally announced October 2023.
-
On the sum of the first two largest signless Laplacian eigenvalues of a graph
Authors:
Zi-Ming Zhou,
Chang-Xiang He,
Hai-Ying Shan
Abstract:
For a graph $G$, let $S_2(G)$ be the sum of the first two largest signless Laplacian eigenvalues of $G$, and $f(G)=e(G)+3-S_2(G)$. Oliveira, Lima, Rama and Carvalho conjectured that $K^+_{1,n-1}$ (the star graph with an additional edge) is the unique graph with minimum value of $f(G)$ on $n$ vertices. In this paper, we prove this conjecture, which also confirms a conjecture for the upper bound of $S_2(G)$ proposed by Ashraf et al.
Submitted 13 June, 2024; v1 submitted 13 October, 2023;
originally announced October 2023.
-
Fictitious Play via Finite Differences for Mean Field Games with Optimal Stopping
Authors:
Chengfeng Shen,
Yifan Luo,
Zhennan Zhou
Abstract:
This paper considers mean field games with optimal stopping time (OSMFGs), where agents make optimal exit decisions; the coupled obstacle and Fokker-Planck equations in such models pose challenges compared with classic MFGs. This paper proposes a generalized fictitious play algorithm that computes OSMFG mixed equilibria by iteratively solving pure strategy systems, i.e., approximating mixed strategies through averaging pure strategies according to a certain updating rule. The generalized fictitious play allows for a broad family of learning rates, and the convergence to the mixed strategy equilibrium can be rigorously justified. The algorithm also incorporates efficient finite difference schemes for the pure strategy system, and numerical experiments demonstrate the effectiveness of the proposed method in robustly and efficiently computing mixed equilibria for OSMFGs.
Submitted 9 October, 2023;
originally announced October 2023.
-
Adaptive finite element approximation of bilinear optimal control with fractional Laplacian
Authors:
Fangyuan Wang,
Qiming Wang,
Zhaojie Zhou
Abstract:
We investigate the application of a posteriori error estimates to a fractional optimal control problem with pointwise control constraints. Specifically, we address a problem in which the state equation is formulated as an integral form of the fractional Laplacian equation, with the control variable embedded within the state equation as a coefficient. We propose two distinct finite element discretization approaches for an optimal control problem. The first approach employs a fully discrete scheme where the control variable is discretized using piecewise constant functions. The second approach, a semi-discrete scheme, does not discretize the control variable. Using the first-order optimality condition, the second-order optimality condition, and a solution regularity analysis for the optimal control problem, we devise a posteriori error estimates. We subsequently demonstrate the reliability and efficiency of the proposed error estimators. Based on the established error estimates framework, an adaptive refinement strategy is developed to help achieve the optimal convergence rate. The effectiveness of the refinement strategy is verified by numerical experiments.
Submitted 8 October, 2023;
originally announced October 2023.
-
Data-driven localized waves and parameter discovery in the massive Thirring model via extended physics-informed neural networks with interface zones
Authors:
Junchao Chen,
Jin Song,
Zijian Zhou,
Zhenya Yan
Abstract:
In this paper, we study data-driven localized wave solutions and parameter discovery in the massive Thirring (MT) model via deep learning in the framework of the physics-informed neural networks (PINNs) algorithm. Abundant data-driven solutions including solitons of bright/dark type, breathers and rogue waves are simulated accurately and analyzed contrastively with relative and absolute errors. For higher-order localized wave solutions, we employ the extended PINNs (XPINNs) with domain decomposition to capture the complete pictures of dynamic behaviors such as soliton collisions, breather oscillations and rogue-wave superposition. In particular, we modify the interface line in domain decomposition of XPINNs into a small interface zone and introduce the pseudo initial, residual and gradient conditions as interface conditions linked adjacently with individual neural networks. Then this modified approach is applied successfully to various solutions ranging from bright-bright soliton, dark-dark soliton, dark-antidark soliton, general breather, Kuznetsov-Ma breather and second-order rogue wave. Experimental results show that this improved version of XPINNs reduces the complexity of computation with a faster convergence rate and keeps the quality of learned solutions with smoother stitching performance as well. For the inverse problems, the unknown coefficient parameters of linear and nonlinear terms in the MT model are identified accurately with and without noise by using the classical PINNs algorithm.
Submitted 29 September, 2023;
originally announced September 2023.
-
Neighborhood of vertices in the isogeny graph of principally polarized superspecial abelian surfaces
Authors:
Zheng Xu,
Yi Ouyang,
Zijian Zhou
Abstract:
For two supersingular elliptic curves $E$ and $E'$ defined over $\mathbb{F}_{p^2}$, let $[E \times E']$ be the superspecial abelian surface with the principal polarization $\{0\} \times E' + E \times \{0\}$. We determine local structure of the vertices $[E \times E']$ in the $(\ell, \ell)$-isogeny graph of principally polarized superspecial abelian surfaces where either $E$ or $E'$ is defined over $\mathbb{F}_p$. We also present a simple new proof of the main theorem in \cite{LOX20}.
Submitted 12 March, 2024; v1 submitted 26 September, 2023;
originally announced September 2023.
-
Adaptive finite element approximation of sparse optimal control with integral fractional Laplacian
Authors:
Fangyuan Wang,
Qiming Wang,
Zhaojie Zhou
Abstract:
In this paper we present and analyze a weighted residual a posteriori error estimate for an optimal control problem. The problem involves a nondifferentiable cost functional, a state equation with an integral fractional Laplacian, and control constraints. We employ subdifferentiation in the context of nondifferentiable convex analysis to obtain first-order optimality conditions. Piecewise linear polynomials are utilized to approximate the solutions of the state and adjoint equations. The control variable is discretized using the variational discretization method. Upper and lower bounds for the a posteriori error estimate of the finite element approximation of the optimal control problem are derived. In the region where $3/2 < α < 2$, the residuals do not satisfy the $L^2(Ω)$ regularity. To address this issue, an additional weight is included in the weighted residual estimator, which is based on a power of the distance from the mesh skeleton. Furthermore, we propose an h-adaptive algorithm driven by the a posteriori error estimator, utilizing the Dörfler marking criterion. The convergence analysis results show that the approximation sequence generated by the adaptive algorithm converges at the optimal algebraic rate. Finally, numerical experiments are conducted to validate the theoretical results.
Submitted 14 September, 2023;
originally announced September 2023.
-
Quantitative Convergence Analysis of Path Integral Representations for Quantum Thermal Average
Authors:
Xuda Ye,
Zhennan Zhou
Abstract:
The quantum thermal average is a central topic in quantum physics and can be represented by path integrals. From the computational perspective, the path integral representation (PIR) needs to be approximated in a finite-dimensional space, and the convergence of such an approximation is termed the convergence of the PIR. In this paper, we establish the Trotter product formula in the trace form, which connects the quantum thermal average and the Boltzmann distribution of a continuous loop in a rigorous way. We prove the qualitative convergence of the standard PIR, and obtain the explicit convergence rates of the continuous loop PIR. These results showcase various approaches to approximate the quantum thermal average, which provide a theoretical guarantee for the path integral approaches of quantum thermal equilibrium systems, such as the path integral molecular dynamics.
Submitted 10 September, 2023;
originally announced September 2023.
-
Tumor growth with a necrotic core as an obstacle problem in pressure
Authors:
Xu'an Dou,
Chengfeng Shen,
Zhennan Zhou
Abstract:
Motivated by the incompressible limit of a cell density model, we propose a free boundary tumor growth model where the pressure satisfies an obstacle problem on an evolving domain $Ω(t)$, and the coincidence set $Λ(t)$ captures the emerging necrotic core. We contribute to the analytical characterization of the solution structure in the following two aspects. By deriving a semi-analytical solution and studying its dynamical behavior, we obtain quantitative transitional properties of the solution separating phases in the development of necrotic cores and establish its long time limit with the traveling wave solutions. Also, we prove the existence of traveling wave solutions incorporating non-zero outer densities outside the tumor bulk, provided that the size of the outer density is below a threshold.
Submitted 31 October, 2023; v1 submitted 31 August, 2023;
originally announced September 2023.
-
Solving Poisson Problems in Polygonal Domains with Singularity Enriched Physics Informed Neural Networks
Authors:
Tianhao Hu,
Bangti Jin,
Zhi Zhou
Abstract:
Physics-Informed Neural Networks (PINNs) are a powerful class of numerical solvers for partial differential equations, employing deep neural networks with successful applications across a diverse set of problems. However, their effectiveness is somewhat diminished when addressing issues involving singularities, such as point sources or geometric irregularities, where the approximations they provide often suffer from reduced accuracy due to the limited regularity of the exact solution. In this work, we investigate PINNs for solving Poisson equations in polygonal domains with geometric singularities and mixed boundary conditions. We propose a novel singularity enriched PINN (SEPINN), by explicitly incorporating the singularity behavior of the analytic solution, e.g., corner singularity, mixed boundary condition and edge singularities, into the ansatz space, and present a convergence analysis of the scheme. We present extensive numerical simulations in two and three-dimensions to illustrate the efficiency of the method, and also a comparative study with several existing neural network based approaches.
Submitted 17 April, 2024; v1 submitted 30 August, 2023;
originally announced August 2023.