-
Robust First and Second-Order Differentiation for Regularized Optimal Transport
Authors:
Xingjie Li,
Fei Lu,
Molei Tao,
Felix X.-F. Ye
Abstract:
Applications such as unbalanced and fully shuffled regression can be approached by optimizing regularized optimal transport (OT) distances, such as the entropic OT and Sinkhorn distances. A common approach for this optimization is to use a first-order optimizer, which requires the gradient of the OT distance. For faster convergence, one might also resort to a second-order optimizer, which additionally requires the Hessian. The computations of these derivatives are crucial for efficient and accurate optimization. However, they present significant challenges in terms of memory consumption and numerical instability, especially for large datasets and small regularization strengths. We circumvent these issues by analytically computing the gradients for OT distances and the Hessian for the entropic OT distance, which was not previously used due to intricate tensor-wise calculations and the complex dependency on parameters within the bi-level loss function. Through analytical derivation and spectral analysis, we identify and resolve the numerical instability caused by the singularity and ill-posedness of a key linear system. Consequently, we achieve scalable and stable computation of the Hessian, enabling the implementation of the stochastic gradient descent (SGD)-Newton methods. Tests on shuffled regression examples demonstrate that the second stage of the SGD-Newton method converges orders of magnitude faster than the gradient descent-only method while achieving significantly more accurate parameter estimations.
Submitted 2 July, 2024;
originally announced July 2024.
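For readers unfamiliar with the quantity being differentiated in the entry above, the following is a minimal sketch of an entropic OT cost computed with plain Sinkhorn iterations. It is not the authors' analytical gradient or Hessian method; the function name `sinkhorn`, the toy histograms, and the regularization strength `eps` are illustrative assumptions.

```python
import numpy as np

def sinkhorn(a, b, C, eps=0.5, n_iter=200):
    """Entropic OT between histograms a and b for a cost matrix C.

    Returns the transport plan P and the transport cost <P, C>.
    All names and default values here are illustrative assumptions.
    """
    K = np.exp(-C / eps)              # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = a / (K @ v)               # rescale to match row marginals
        v = b / (K.T @ u)             # rescale to match column marginals
    P = u[:, None] * K * v[None, :]   # transport plan
    return P, float(np.sum(P * C))

# Toy problem: two histograms on a 1D grid with squared-distance cost.
x = np.linspace(0.0, 1.0, 5)
C = (x[:, None] - x[None, :]) ** 2
a = np.full(5, 0.2)
b = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
P, cost = sinkhorn(a, b, C)
```

Shrinking `eps` makes `np.exp(-C / eps)` underflow in this plain form, which is one face of the small-regularization instability the abstract mentions; log-domain updates are a common workaround.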
-
Removable edges in near-bipartite bricks
Authors:
Yipei Zhang,
Fuliang Lu,
Xiumei Wang,
Jinjiang Yuan
Abstract:
An edge $e$ of a matching covered graph $G$ is removable if $G-e$ is also matching covered. The notion of removable edge arises in connection with ear decompositions of matching covered graphs introduced by Lovász and Plummer. A nonbipartite matching covered graph $G$ is a brick if it is free of nontrivial tight cuts. Carvalho, Lucchesi, and Murty proved that every brick other than $K_4$ and $\overline{C_6}$ has at least $Δ-2$ removable edges. A brick $G$ is near-bipartite if it has a pair of edges $\{e_1,e_2\}$ such that $G-\{e_1,e_2\}$ is a bipartite matching covered graph. In this paper, we show that in a near-bipartite brick $G$ with at least six vertices, every vertex of $G$, except at most six vertices of degree three contained in two disjoint triangles, is incident with at most two nonremovable edges; consequently, $G$ has at least $\frac{|V(G)|-6}{2}$ removable edges. Moreover, all graphs attaining this lower bound are characterized.
Submitted 31 May, 2024;
originally announced June 2024.
-
Probabilistic cellular automata with local transition matrices: synchronization, ergodicity, and inference
Authors:
Erhan Bayraktar,
Fei Lu,
Mauro Maggioni,
Ruoyu Wu,
Sichen Yang
Abstract:
We introduce a new class of probabilistic cellular automata that are capable of exhibiting rich dynamics such as synchronization and ergodicity and can be easily inferred from data. The system is a finite-state locally interacting Markov chain on a circular graph. Each site's subsequent state is random, with a distribution determined by its neighborhood's empirical distribution multiplied by a local transition matrix. We establish sufficient and necessary conditions on the local transition matrix for synchronization and ergodicity. Also, we introduce novel least squares estimators for inferring the local transition matrix from various types of data, which may consist of either multiple trajectories, a long trajectory, or ensemble sequences without trajectory information. Under suitable identifiability conditions, we show the asymptotic normality of these estimators and provide non-asymptotic bounds for their accuracy.
Submitted 23 June, 2024; v1 submitted 5 May, 2024;
originally announced May 2024.
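The update rule described in the entry above (each site's next state is drawn from its neighborhood's empirical distribution multiplied by a local transition matrix) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `pca_step`, the neighborhood `radius`, the inclusion of the site itself in its neighborhood, and the example matrix `T` are all hypothetical choices.

```python
import numpy as np

def pca_step(state, T, rng, radius=1):
    """One synchronous update of all sites on a circular graph.

    state : (n,) int array with values in {0, ..., K-1}
    T     : (K, K) row-stochastic local transition matrix
    radius: neighborhood half-width (assumed; the site itself is included)
    """
    K = T.shape[0]
    onehot = np.eye(K)[state]  # (n, K) indicator of each site's state
    # Empirical distribution of states over each site's neighborhood,
    # using periodic shifts to respect the circular geometry.
    shifts = range(-radius, radius + 1)
    emp = sum(np.roll(onehot, s, axis=0) for s in shifts) / (2 * radius + 1)
    probs = emp @ T  # (n, K); each row is a probability vector
    # Inverse-CDF sampling, one uniform draw per site.
    cdf = np.cumsum(probs, axis=1)
    u = rng.random(state.size)
    new_state = (u[:, None] >= cdf).sum(axis=1)
    return np.minimum(new_state, K - 1)  # guard against round-off in cdf

rng = np.random.default_rng(0)
T = np.array([[0.9, 0.1, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.1, 0.9]])
state = rng.integers(0, 3, size=20)
for _ in range(50):
    state = pca_step(state, T, rng)
```

With `T` equal to the identity, a configuration in which all sites share one state is left unchanged, which gives a quick sanity check of the sampling step.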
-
All meromorphic solutions of Fermat-type functional equations
Authors:
Feng Lü
Abstract:
In this paper, by making use of properties of elliptic functions, we describe meromorphic solutions of Fermat-type functional equations $f(z)^{n}+f(L(z))^{m}=1$ over the complex plane $\mathbb{C}$, where $L(z)$ is a nonconstant entire function, $m$ and $n$ are two positive integers. As applications, we also consider meromorphic solutions of Fermat-type difference and $q$-difference equations.
Submitted 21 May, 2024; v1 submitted 15 April, 2024;
originally announced April 2024.
-
Neural-Network-Based Optimal Guidance for Lunar Vertical Landing
Authors:
Kun Wang,
Zheng Chen,
Fangmin Lu,
Jun Li
Abstract:
This paper addresses an optimal guidance problem concerning the vertical landing of a lunar lander with the objective of minimizing fuel consumption. The vertical landing imposes a final attitude constraint, which is treated as a final control constraint. To handle this constraint, we propose a nonnegative small regularization term to augment the original cost functional. This ensures the satisfaction of the final control constraint in accordance with Pontryagin's Minimum Principle. By leveraging the necessary conditions for optimality, we establish a parameterized system that facilitates the generation of numerous optimal trajectories, which contain the nonlinear mapping from the flight state to the optimal guidance command. Subsequently, a neural network is trained to approximate such mapping. Finally, numerical examples are presented to validate the proposed method.
Submitted 20 February, 2024;
originally announced February 2024.
-
Interacting Particle Systems on Networks: joint inference of the network and the interaction kernel
Authors:
Quanjun Lang,
Xiong Wang,
Fei Lu,
Mauro Maggioni
Abstract:
Modeling multi-agent systems on networks is a fundamental challenge in a wide variety of disciplines. We jointly infer the weight matrix of the network and the interaction kernel, which determine respectively which agents interact with which others and the rules of such interactions from data consisting of multiple trajectories. The estimator we propose leads naturally to a non-convex optimization problem, and we investigate two approaches for its solution: one is based on the alternating least squares (ALS) algorithm; another is based on a new algorithm named operator regression with alternating least squares (ORALS). Both algorithms are scalable to large ensembles of data trajectories. We establish coercivity conditions guaranteeing identifiability and well-posedness. The ALS algorithm appears statistically efficient and robust even in the small data regime but lacks performance and convergence guarantees. The ORALS estimator is consistent and asymptotically normal under a coercivity condition. We conduct several numerical experiments ranging from Kuramoto particle systems on networks to opinion dynamics in leader-follower models.
Submitted 13 February, 2024;
originally announced February 2024.
-
A Physics-Informed Indirect Method for Trajectory Optimization
Authors:
Kun Wang,
Fangmin Lu,
Zheng Chen,
Jun Li
Abstract:
This work presents a Physics-Informed Indirect Method (PIIM) that propagates the dynamics of both states and co-states backward in time for trajectory optimization problems. In the case of a Time-Optimal Soft Landing Problem (TOSLP), based on the initial co-state vector normalization technique, we show that the initial guess of the mass co-state and the numerical factor can be eliminated from the shooting procedure. As a result, the initial guess of the unknown co-states can be constrained to lie on a unit 3-D hypersphere. Then, using the PIIM allows one to exploit the physical significance of the optimal control law, which further narrows down the solution space to a unit 3-D octant sphere. Meanwhile, the analytical estimations of the fuel consumption and final time are provided. Additionally, a usually overlooked issue that results in an infeasible solution with a negative final time, is fixed by a simple remedy strategy. Consequently, the reduced solution space becomes sufficiently small to ensure fast, robust, and guaranteed convergence for the TOSLP. Then, we extend the PIIM to solve the Fuel-Optimal Soft Landing Problem (FOSLP) with a homotopy approach. The numerical simulations show that compared with the conventional indirect method with a success rate of 89.35%, it takes a shorter time for the proposed method to find the feasible solution to the FOSLP with a success rate of 100%.
Submitted 1 February, 2024;
originally announced February 2024.
-
Scalable iterative data-adaptive RKHS regularization
Authors:
Haibo Li,
Jinchao Feng,
Fei Lu
Abstract:
We present iDARR, a scalable iterative Data-Adaptive RKHS Regularization method, for solving ill-posed linear inverse problems. The method searches for solutions in subspaces where the true solution can be identified, with the data-adaptive RKHS penalizing the spaces of small singular values. At the core of the method is a new generalized Golub-Kahan bidiagonalization procedure that recursively constructs orthonormal bases for a sequence of RKHS-restricted Krylov subspaces. The method is scalable with a complexity of $O(kmn)$ for $m$-by-$n$ matrices with $k$ denoting the iteration numbers. Numerical tests on the Fredholm integral equation and 2D image deblurring show that it outperforms the widely used $L^2$ and $l^2$ norms, producing stable accurate solutions consistently converging when the noise level decays.
Submitted 31 December, 2023;
originally announced January 2024.
-
Optimal minimax rate of learning interaction kernels
Authors:
Xiong Wang,
Inbar Seroussi,
Fei Lu
Abstract:
Nonparametric estimation of nonlocal interaction kernels is crucial in various applications involving interacting particle systems. The inference challenge, situated at the nexus of statistical learning and inverse problems, comes from the nonlocal dependency. A central question is whether the optimal minimax rate of convergence for this problem aligns with the rate of $M^{-\frac{2β}{2β+1}}$ in classical nonparametric regression, where $M$ is the sample size and $β$ represents the smoothness exponent of the radial kernel. Our study confirms this alignment for systems with a finite number of particles.
We introduce a tamed least squares estimator (tLSE) that attains the optimal convergence rate for a broad class of exchangeable distributions. The tLSE bridges the smallest eigenvalue of random matrices and Sobolev embedding. This estimator relies on nonasymptotic estimates for the left tail probability of the smallest eigenvalue of the normal matrix. The lower minimax rate is derived using the Fano-Tsybakov hypothesis testing method. Our findings reveal that provided the inverse problem in the large sample limit satisfies a coercivity condition, the left tail probability does not alter the bias-variance tradeoff, and the optimal minimax rate remains intact. Our tLSE method offers a straightforward approach for establishing the optimal minimax rate for models with either local or nonlocal dependency.
Submitted 28 November, 2023;
originally announced November 2023.
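As a quick worked reading of the rate quoted in the entry above, the following evaluates it at one smoothness level. The values of $β$ and $M$ are chosen purely for illustration and do not come from the paper.

```latex
\[
M^{-\frac{2\beta}{2\beta+1}}\Big|_{\beta=1} = M^{-2/3},
\qquad
M = 10^{3} \;\Rightarrow\; M^{-2/3} = 10^{-2},
\qquad
M = 10^{6} \;\Rightarrow\; M^{-2/3} = 10^{-4}.
\]
```

Under this rate with $β=1$, a thousandfold increase in sample size yields a hundredfold decrease in the rate; as $β$ grows, the exponent approaches $-1$.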
-
The minimum degree of minimal $k$-factor-critical claw-free graphs*
Authors:
Jing Guo,
Qiuli Li,
Fuliang Lu,
Heping Zhang
Abstract:
A graph $G$ of order $n$ is said to be $k$-factor-critical for integers $1\leq k< n$, if the removal of any $k$ vertices results in a graph with a perfect matching. A $k$-factor-critical graph is minimal if for every edge, the deletion of it results in a graph that is not $k$-factor-critical. In 1998, O. Favaron and M. Shi conjectured that every minimal $k$-factor-critical graph has minimum degree $k+1$. In this paper, we confirm the conjecture for minimal $k$-factor-critical claw-free graphs. Moreover, we show that every minimal $k$-factor-critical claw-free graph $G$ has at least $\frac{k-1}{2k}|V(G)|$ vertices of degree $k+1$ in the case of $(k+1)$-connected, yielding further evidence for S. Norine and R. Thomas' conjecture on the minimum degree of minimal bricks when $k=2$.
Submitted 27 November, 2023;
originally announced November 2023.
-
Real-time optimal control for attitude-constrained solar sailcrafts via neural networks
Authors:
Kun Wang,
Fangmin Lu,
Zheng Chen,
Jun Li
Abstract:
This work is devoted to generating optimal guidance commands in real time for attitude-constrained solar sailcrafts in coplanar circular-to-circular interplanetary transfers. Firstly, a nonlinear optimal control problem is established, and necessary conditions for optimality are derived from Pontryagin's Minimum Principle. Under some assumptions, the attitude constraints are rewritten as control constraints, which are replaced by a saturation function so that a parameterized system is formulated to generate an optimal trajectory via solving an initial value problem. This approach allows for the efficient generation of a dataset containing optimal samples, which are essential for training Neural Networks (NNs) to achieve real-time implementation. However, the optimal guidance command may suddenly change from one extreme to another, resulting in discontinuous jumps that generally impair the NN's approximation performance. To address this issue, we use the two co-states on which the optimal guidance command depends to detect discontinuous jumps. A procedure for preprocessing these jumps is then established, thereby ensuring that the preprocessed guidance command remains smooth. Meanwhile, the sign of one co-state is found to be sufficient to revert the preprocessed guidance command to the original optimal guidance command. Furthermore, three NNs are built and trained offline, and they cooperate to precisely generate the optimal guidance command in real time. Finally, numerical simulations are presented to demonstrate the developments of the paper.
Submitted 16 November, 2023; v1 submitted 16 September, 2023;
originally announced September 2023.
-
Convex Q Learning in a Stochastic Environment: Extended Version
Authors:
Fan Lu,
Sean Meyn
Abstract:
The paper introduces the first formulation of convex Q-learning for Markov decision processes with function approximation. The algorithms and theory rest on a relaxation of a dual of Manne's celebrated linear programming characterization of optimal control. The main contributions firstly concern properties of the relaxation, described as a deterministic convex program: we identify conditions for a bounded solution, and a significant relationship between the solution to the new convex program, and the solution to standard Q-learning. The second set of contributions concern algorithm design and analysis: (i) A direct model-free method for approximating the convex program for Q-learning shares properties with its ideal. In particular, a bounded solution is ensured subject to a simple property of the basis functions; (ii) The proposed algorithms are convergent and new techniques are introduced to obtain the rate of convergence in a mean-square sense; (iii) The approach can be generalized to a range of performance criteria, and it is found that variance can be reduced by considering ``relative'' dynamic programming equations; (iv) The theory is illustrated with an application to a classical inventory control problem.
Submitted 10 September, 2023;
originally announced September 2023.
-
A New Smoothing Technique for Bang-Bang Optimal Control Problems
Authors:
Kun Wang,
Zheng Chen,
Zhenyu Wei,
Fangmin Lu,
Jun Li
Abstract:
Bang-bang control is ubiquitous for Optimal Control Problems (OCPs) where the constrained control variable appears linearly in the dynamics and cost function. Based on Pontryagin's Minimum Principle, the indirect method is widely used to numerically solve OCPs because it allows one to derive the theoretical structure of the optimal control. However, discontinuities in the bang-bang control structure may result in numerical difficulties for gradient-based indirect methods. In this case, smoothing or regularization procedures are usually applied to eliminate the discontinuities of bang-bang controls. Traditional smoothing or regularization procedures generally modify the cost function by adding a term depending on a small parameter, or introducing a small error into the state equation. Those procedures may complicate the numerical algorithms or degrade the convergence performance. To overcome these issues, we propose a bounded smooth function, called the normalized L2-norm function, to approximate the sign function in terms of the switching function. The resulting optimal control is smooth and can be readily embedded into the indirect method. Then, the simplicity and improved performance of the proposed method over some existing methods are numerically demonstrated by a minimal-time oscillator problem and a minimal-fuel low-thrust trajectory optimization problem that involves many revolutions.
Submitted 1 December, 2023; v1 submitted 6 September, 2023;
originally announced September 2023.
-
PINNacle: A Comprehensive Benchmark of Physics-Informed Neural Networks for Solving PDEs
Authors:
Zhongkai Hao,
Jiachen Yao,
Chang Su,
Hang Su,
Ziao Wang,
Fanzhi Lu,
Zeyu Xia,
Yichi Zhang,
Songming Liu,
Lu Lu,
Jun Zhu
Abstract:
While significant progress has been made on Physics-Informed Neural Networks (PINNs), a comprehensive comparison of these methods across a wide range of Partial Differential Equations (PDEs) is still lacking. This study introduces PINNacle, a benchmarking tool designed to fill this gap. PINNacle provides a diverse dataset, comprising over 20 distinct PDEs from various domains, including heat conduction, fluid dynamics, biology, and electromagnetics. These PDEs encapsulate key challenges inherent to real-world problems, such as complex geometry, multi-scale phenomena, nonlinearity, and high dimensionality. PINNacle also offers a user-friendly toolbox, incorporating about 10 state-of-the-art PINN methods for systematic evaluation and comparison. We have conducted extensive experiments with these methods, offering insights into their strengths and weaknesses. In addition to providing a standardized means of assessing performance, PINNacle also offers an in-depth analysis to guide future research, particularly in areas such as domain decomposition methods and loss reweighting for handling multi-scale problems and complex geometry. To the best of our knowledge, it is the largest benchmark with a diverse and comprehensive evaluation that will undoubtedly foster further research in PINNs.
Submitted 5 October, 2023; v1 submitted 14 June, 2023;
originally announced June 2023.
-
%-Immanants and Temperley-Lieb Immanants
Authors:
Frank Lu,
Kevin Ren,
Dawei Shen,
Siki Wang
Abstract:
In this paper, we investigate the relationship between Temperley-Lieb immanants, which were introduced by Rhoades and Skandera, and %-immanants, an immanant based on a concept introduced by Chepuri and Sherman-Bennett. Our main result is a classification of when a Temperley-Lieb immanant can be written as a linear combination of %-immanants. This result uses a formula by Rhoades and Skandera to compute Temperley-Lieb immanants in terms of complementary minors. Using this formula, we also derive an explicit expression for the coefficients of a Temperley-Lieb immanant coming from a $321$-, $1324$-avoiding permutation $w$ containing the pattern $2143,$ which we use to derive our main result.
Submitted 29 March, 2023;
originally announced March 2023.
-
An adaptive RKHS regularization for Fredholm integral equations
Authors:
Fei Lu,
Miao-Jung Yvonne Ou
Abstract:
Regularization is a long-standing challenge for ill-posed linear inverse problems, and a prototype is the Fredholm integral equation of the first kind with additive Gaussian measurement noise. We introduce a new RKHS regularization adaptive to measurement data and the underlying linear operator. This RKHS arises naturally in a variational approach, and its closure is the function space in which we can identify the true solution. Also, we introduce a small noise analysis to compare regularization norms by sharp convergence rates in the small noise limit. Our analysis shows that the RKHS- and $L^2$-regularizers yield the same convergence rate when their optimal hyper-parameters are selected using the true solution, and the RKHS-regularizer has a smaller multiplicative constant. However, in computational practice, the RKHS regularizer significantly outperforms the $L^2$-and $l^2$-regularizers in producing consistently converging estimators when the noise level decays or the observation mesh refines.
Submitted 4 December, 2023; v1 submitted 23 March, 2023;
originally announced March 2023.
-
Sufficient Exploration for Convex Q-learning
Authors:
Fan Lu,
Prashant Mehta,
Sean Meyn,
Gergely Neu
Abstract:
In recent years there has been a collective research effort to find new formulations of reinforcement learning that are simultaneously more efficient and more amenable to analysis. This paper concerns one approach that builds on the linear programming (LP) formulation of optimal control of Manne. A primal version is called logistic Q-learning, and a dual variant is convex Q-learning. This paper focuses on the latter, while building bridges with the former. The main contributions follow: (i) The dual of convex Q-learning is not precisely Manne's LP or a version of logistic Q-learning, but has similar structure that reveals the need for regularization to avoid over-fitting. (ii) A sufficient condition is obtained for a bounded solution to the Q-learning LP. (iii) Simulation studies reveal numerical challenges when addressing sampled-data systems based on a continuous time model. The challenge is addressed using state-dependent sampling. The theory is illustrated with applications to examples from OpenAI gym. It is shown that convex Q-learning is successful in cases where standard Q-learning diverges, such as the LQR problem.
Submitted 17 October, 2022;
originally announced October 2022.
-
Model-Free Characterizations of the Hamilton-Jacobi-Bellman Equation and Convex Q-Learning in Continuous Time
Authors:
Fan Lu,
Joel Mathias,
Sean Meyn,
Karanjit Kalsi
Abstract:
Convex Q-learning is a recent approach to reinforcement learning, motivated by the possibility of a firmer theory for convergence, and the possibility of making use of greater a priori knowledge regarding policy or value function structure. This paper explores algorithm design in the continuous time domain, with finite-horizon optimal control objective. The main contributions are (i) Algorithm design is based on a new Q-ODE, which defines the model-free characterization of the Hamilton-Jacobi-Bellman equation. (ii) The Q-ODE motivates a new formulation of Convex Q-learning that avoids the approximations appearing in prior work. The Bellman error used in the algorithm is defined by filtered measurements, which is beneficial in the presence of measurement noise. (iii) A characterization of boundedness of the constraint region is obtained through a non-trivial extension of recent results from the discrete time setting. (iv) The theory is illustrated in application to resource allocation for distributed energy resources, for which the theory is ideally suited.
Submitted 14 October, 2022;
originally announced October 2022.
-
Stochastic Data-Driven Variational Multiscale Reduced Order Models
Authors:
Fei Lu,
Changhong Mou,
Honghu Liu,
Traian Iliescu
Abstract:
Trajectory-wise data-driven reduced order models (ROMs) tend to be sensitive to training data, and thus lack robustness. We propose to construct a robust stochastic ROM closure (S-ROM) from data consisting of multiple trajectories from random initial conditions. The S-ROM is a low-dimensional time series model for the coefficients of the dominating proper orthogonal decomposition (POD) modes inferred from data. Thus, it achieves reduction in both space and time, leading to simulations orders of magnitude faster than the full order model. We show that both the estimated POD modes and parameters in the S-ROM converge when the number of trajectories increases. Thus, the S-ROM is robust when the training data size increases. We demonstrate the S-ROM on a 1D Burgers equation with a viscosity $ν= 0.002$ and with random initial conditions. The numerical results verify the convergence. Furthermore, the S-ROM makes accurate trajectory-wise predictions from new initial conditions and with a prediction time far beyond the training range, and it quantifies the spread of uncertainties due to the unresolved scales.
Submitted 6 September, 2022;
originally announced September 2022.
-
NySALT: Nyström-type inference-based schemes adaptive to large time-stepping
Authors:
Xingjie Li,
Fei Lu,
Molei Tao,
Felix Ye
Abstract:
Large time-stepping is important for efficient long-time simulations of deterministic and stochastic Hamiltonian dynamical systems. Conventional structure-preserving integrators, while being successful for generic systems, have limited tolerance to time step size due to stability and accuracy constraints. We propose to use data to innovate classical integrators so that they can be adaptive to large time-stepping and are tailored to each specific system. In particular, we introduce NySALT, Nyström-type inference-based schemes adaptive to large time-stepping. The NySALT has optimal parameters for each time step learnt from data by minimizing the one-step prediction error. Thus, it is tailored for each time step size and the specific system to achieve optimal performance and tolerate large time-stepping in an adaptive fashion. We prove and numerically verify the convergence of the estimators as data size increases. Furthermore, analysis and numerical tests on the deterministic and stochastic Fermi-Pasta-Ulam (FPU) models show that NySALT enlarges the maximal admissible step size of linear stability, and quadruples the time step size of the Störmer--Verlet and the BAOAB when maintaining similar levels of accuracy.
Submitted 13 July, 2022;
originally announced July 2022.
-
Shock trace prediction by reduced models for a viscous stochastic Burgers equation
Authors:
Nan Chen,
Honghu Liu,
Fei Lu
Abstract:
Viscous shocks are a particular type of extreme events in nonlinear multiscale systems, and their representation requires small scales. Model reduction can thus play an important role in reducing the computational cost for an efficient prediction of shocks. Yet, reduced models typically aim to approximate large-scale dominating dynamics, which do not resolve the small scales by design. To resolve this representation barrier, we introduce a new qualitative characterization of the space-time locations of shocks, named as the ``shock trace'', via a space-time indicator function based on an empirical resolution-adaptive threshold. Different from the exact shocks, the shock traces can be captured within the representation capacity of the large scales, which facilitates the forecast of the timing and locations of the shocks utilizing reduced models. Within the context of a viscous stochastic Burgers equation, we show that a data-driven reduced model, in the form of nonlinear autoregression (NAR) time series models, can accurately predict the random shock traces, with relatively low rates of false predictions. The NAR model significantly outperforms the corresponding Galerkin truncated model in the scenario of either noiseless or noisy observations. The results illustrate the importance of the data-driven closure terms in the NAR model, which account for the effects of the unresolved small scale dynamics on the resolved ones due to nonlinear interactions.
Submitted 27 December, 2021;
originally announced December 2021.
-
Identifiability of interaction kernels in mean-field equations of interacting particles
Authors:
Quanjun Lang,
Fei Lu
Abstract:
This study examines the identifiability of interaction kernels in mean-field equations of interacting particles or agents, an area of growing interest across various scientific and engineering fields. The main focus is identifying data-dependent function spaces where a quadratic loss functional possesses a unique minimizer. We consider two data-adaptive $L^2$ spaces: one weighted by a data-adaptive measure and the other using the Lebesgue measure. In each $L^2$ space, we show that the function space of identifiability is the closure of the RKHS associated with the integral operator of inversion.
Alongside prior research, our study completes a full characterization of identifiability in interacting particle systems with either finite or infinite particles, highlighting critical differences between these two settings. Moreover, the identifiability analysis has important implications for computational practice. It shows that the inverse problem is ill-posed, necessitating regularization. Our numerical demonstrations show that the weighted $L^2$ space is preferable over the unweighted $L^2$ space, as it yields more accurate regularized estimators.
Submitted 20 May, 2023; v1 submitted 10 June, 2021;
originally announced June 2021.
-
ISALT: Inference-based schemes adaptive to large time-stepping for locally Lipschitz ergodic systems
Authors:
Xingjie Li,
Fei Lu,
Felix X. -F. Ye
Abstract:
Efficient simulation of SDEs is essential in many applications, particularly for ergodic systems that demand efficient simulation of both short-time dynamics and large-time statistics. However, locally Lipschitz SDEs often require special treatments such as implicit schemes with small time-steps to accurately simulate the ergodic measure. We introduce a framework to construct inference-based schemes adaptive to large time-steps (ISALT) from data, achieving a reduction in time by several orders of magnitude. The key is the statistical learning of an approximation to the infinite-dimensional discrete-time flow map. We explore the use of numerical schemes (such as the Euler-Maruyama, a hybrid RK4, and an implicit scheme) to derive informed basis functions, leading to a parameter inference problem. We introduce a scalable algorithm to estimate the parameters by least squares, and we prove the convergence of the estimators as data size increases.
We test the ISALT on three non-globally Lipschitz SDEs: the 1D double-well potential, a 2D multi-scale gradient system, and the 3D stochastic Lorenz equation with degenerate noise. Numerical results show that ISALT can tolerate time-step magnitudes larger than plain numerical schemes. It reaches optimal accuracy in reproducing the invariant measure when the time-step is medium-large.
Submitted 24 February, 2021;
originally announced February 2021.
-
A Generalization of the Greene-Kleitman Duality Theorem
Authors:
Frank Y. Lu
Abstract:
In this paper, we describe and prove a generalization of both the classical Greene-Kleitman duality theorem for posets and the local version proved recently by Lewis-Lyu-Pylyavskyy-Sen in studying discrete solitons, using an approach more closely linked to the approach of the classical case.
Submitted 26 January, 2021;
originally announced January 2021.
-
Meromorphic functions partially share three values with their difference operators
Authors:
Feng Lü,
Zhenliu Yang
Abstract:
In this paper, we give a simple proof and strengthening of a uniqueness theorem of meromorphic functions which partially share 0, $\infty$ CM and 1 IM with their difference operators. Meanwhile, we partially solve a conjecture given by Chen-Yi in \cite{CY} and generalize some previous theorems in \cite{C, CX}.
Submitted 4 January, 2021;
originally announced January 2021.
-
Learning interaction kernels in mean-field equations of 1st-order systems of interacting particles
Authors:
Quanjun Lang,
Fei Lu
Abstract:
We introduce a nonparametric algorithm to learn interaction kernels of mean-field equations for 1st-order systems of interacting particles. The data consist of discrete space-time observations of the solution. By least squares with regularization, the algorithm learns the kernel on data-adaptive hypothesis spaces efficiently. A key ingredient is a probabilistic error functional derived from the likelihood of the mean-field equation's diffusion process. The estimator converges, in a reproducing kernel Hilbert space and an L2 space under an identifiability condition, at a rate optimal in the sense that it equals the numerical integrator's order. We demonstrate our algorithm on three typical examples: the opinion dynamics with a piecewise linear kernel, the granular media model with a quadratic kernel, and the aggregation-diffusion with a repulsive-attractive kernel.
Submitted 29 October, 2020;
originally announced October 2020.
-
Data-driven model reduction for stochastic Burgers equations
Authors:
Fei Lu
Abstract:
We present a class of efficient parametric closure models for 1D stochastic Burgers equations. Casting it as statistical learning of the flow map, we derive the parametric form by representing the unresolved high wavenumber Fourier modes as functionals of the resolved variables' trajectory. The reduced models are nonlinear autoregression (NAR) time series models, with coefficients estimated from data by least squares. The NAR models can accurately reproduce the energy spectrum, the invariant densities, and the autocorrelations. Taking advantage of the simplicity of the NAR models, we investigate maximal and optimal space-time reduction. Reduction in space dimension is unlimited, and NAR models with two Fourier modes can perform well. The NAR model's stability limits time reduction, with a maximal time step smaller than that of the K-mode Galerkin system. We report a potential criterion for optimal space-time reduction: the NAR models achieve minimal relative error in the energy spectrum at the time step where the K-mode Galerkin system's mean CFL number agrees with the full model's.
Submitted 9 December, 2020; v1 submitted 1 October, 2020;
originally announced October 2020.
-
Learning interaction kernels in stochastic systems of interacting particles from multiple trajectories
Authors:
Fei Lu,
Mauro Maggioni,
Sui Tang
Abstract:
We consider stochastic systems of interacting particles or agents, with dynamics determined by an interaction kernel which only depends on pairwise distances. We study the problem of inferring this interaction kernel from observations of the positions of the particles, in either continuous or discrete time, along multiple independent trajectories. We introduce a nonparametric inference approach to this inverse problem, based on a regularized maximum likelihood estimator constrained to suitable hypothesis spaces adaptive to data. We show that a coercivity condition enables us to control the condition number of this problem and prove the consistency of our estimator, and that in fact it converges at a near-optimal learning rate, equal to the min-max rate of $1$-dimensional non-parametric regression. In particular, this rate is independent of the dimension of the state space, which is typically very high. We also analyze the discretization errors in the case of discrete-time observations, showing that it is of order $1/2$ in terms of the time gaps between observations. This term, when large, dominates the sampling error and the approximation error, preventing convergence of the estimator. Finally, we exhibit an efficient parallel algorithm to construct the estimator from data, and we demonstrate the effectiveness of our algorithm with numerical tests on prototype systems including stochastic opinion dynamics and a Lennard-Jones model.
Submitted 29 July, 2020;
originally announced July 2020.
-
The exact entire solutions of certain type of nonlinear difference equations
Authors:
Feng Lü,
Cuiping Li,
Junfeng Xu
Abstract:
In this paper, we consider the entire solutions of the nonlinear difference equation $$f^3+q(z)Δf=p_1 e^{α_1 z}+ p_2 e^{α_2 z} $$ where $q$ is a polynomial, and $p_1, p_2, α_1, α_2$ are nonzero constants with $α_1\neq α_2$. It is shown that if $f$ is a non-constant entire solution with $ρ_2(f)<1$ to the above equation, then $f(z)=e_1e^{\frac{α_1 z}{3}}+e_2e^{\frac{α_2 z}{3}}, $ where $e_1$ and $e_2$ are two constants. Meanwhile, we give an affirmative answer to the conjecture posed by Zhang et al. in [18].
Submitted 23 July, 2020;
originally announced July 2020.
-
Laminar Tight Cuts in Matching Covered Graphs
Authors:
Guantao Chen,
Xing Feng,
Fuliang Lu,
Cláudio L. Lucchesi,
Lianzhu Zhang
Abstract:
An edge cut $C$ of a graph $G$ is {\it tight} if $|C \cap M|=1$ for every perfect matching $M$ of $G$. Barrier cuts and 2-separation cuts are called {\it ELP-cuts}, which are two important types of tight cuts in matching covered graphs. Edmonds, Lovász and Pulleyblank proved that if a matching covered graph has a nontrivial tight cut, then it also has a nontrivial ELP-cut. Carvalho, Lucchesi, and Murty made a stronger conjecture: given any nontrivial tight cut $C$ in a matching covered graph $G$, there exists a nontrivial ELP-cut $D$ in $G$ which does not cross $C$. We confirm the conjecture in this paper.
Submitted 19 March, 2020;
originally announced March 2020.
-
On the identifiability of interaction functions in systems of interacting particles
Authors:
Zhongyang Li,
Fei Lu,
Mauro Maggioni,
Sui Tang,
Cheng Zhang
Abstract:
We address a fundamental issue in the nonparametric inference for systems of interacting particles: the identifiability of the interaction functions. We prove that the interaction functions are identifiable for a class of first-order stochastic systems, including linear systems with general initial laws and nonlinear systems with stationary distributions. We show that a coercivity condition is sufficient for identifiability and becomes necessary when the number of particles approaches infinity. The coercivity is equivalent to the strict positivity of related integral operators, which we prove by showing that their integral kernels are strictly positive definite by using Müntz type theorems.
Submitted 31 August, 2020; v1 submitted 26 December, 2019;
originally announced December 2019.
-
Learning interaction kernels in heterogeneous systems of agents from multiple trajectories
Authors:
Fei Lu,
Mauro Maggioni,
Sui Tang
Abstract:
Systems of interacting particles or agents have wide applications in many disciplines such as Physics, Chemistry, Biology and Economics. These systems are governed by interaction laws, which are often unknown: estimating them from observation data is a fundamental task that can provide meaningful insights and accurate predictions of the behaviour of the agents. In this paper, we consider the inverse problem of learning interaction laws given data from multiple trajectories, in a nonparametric fashion, when the interaction kernels depend on pairwise distances. We establish a condition for learnability of interaction kernels, and construct estimators that are guaranteed to converge in a suitable $L^2$ space, at the optimal min-max rate for 1-dimensional nonparametric regression. We propose an efficient learning algorithm based on least squares, which can be implemented in parallel for multiple trajectories and is therefore well-suited for the high dimensional, big data regime. Numerical simulations on a variety of examples, including opinion dynamics, predator-swarm dynamics and heterogeneous particle dynamics, suggest that the learnability condition is satisfied in models used in practice, and the rate of convergence of our estimator is consistent with the theory. These simulations also suggest that our estimators are robust to noise in the observations, and produce accurate predictions of dynamics in relatively large time intervals, even when they are learned from data collected in short time intervals.
Submitted 14 July, 2020; v1 submitted 10 October, 2019;
originally announced October 2019.
-
Data-driven model reduction, Wiener projections, and the Koopman-Mori-Zwanzig formalism
Authors:
Kevin K. Lin,
Fei Lu
Abstract:
Model reduction methods aim to describe complex dynamic phenomena using only relevant dynamical variables, decreasing computational cost, and potentially highlighting key dynamical mechanisms. In the absence of special dynamical features such as scale separation or symmetries, the time evolution of these variables typically exhibits memory effects. Recent work has found a variety of data-driven model reduction methods to be effective for representing such non-Markovian dynamics, but their scope and dynamical underpinning remain incompletely understood. Here, we study data-driven model reduction from a dynamical systems perspective. For both chaotic and randomly-forced systems, we show the problem can be naturally formulated within the framework of Koopman operators and the Mori-Zwanzig projection operator formalism. We give a heuristic derivation of a NARMAX (Nonlinear Auto-Regressive Moving Average with eXogenous input) model from an underlying dynamical model. The derivation is based on a simple construction we call Wiener projection, which links Mori-Zwanzig theory to both NARMAX and to classical Wiener filtering. We apply these ideas to the Kuramoto-Sivashinsky model of spatiotemporal chaos and a viscous Burgers equation with stochastic forcing.
Submitted 5 October, 2020; v1 submitted 21 August, 2019;
originally announced August 2019.
-
Planar graphs without normally adjacent short cycles
Authors:
Fangyao Lu,
Mengjiao Rao,
Qianqian Wang,
Tao Wang
Abstract:
Let $\mathscr{G}$ be the class of plane graphs without triangles normally adjacent to $8^{-}$-cycles, without $4$-cycles normally adjacent to $6^{-}$-cycles, and without normally adjacent $5$-cycles. In this paper, it is shown that every graph in $\mathscr{G}$ is $3$-choosable. Instead of proving this result, we directly prove a stronger result in the form of ``weakly'' DP-$3$-coloring. The main theorem improves the results in [J. Combin. Theory Ser. B 129 (2018) 38--54; European J. Combin. 82 (2019) 102995]. Consequently, every planar graph without $4$-, $6$-, $8$-cycles is $3$-choosable, and every planar graph without $4$-, $5$-, $7$-, $8$-cycles is $3$-choosable. In the third section, using almost the same technique, we prove that the vertex set of every graph in $\mathscr{G}$ can be partitioned into an independent set and a set that induces a forest, which strengthens the result in [Discrete Appl. Math. 284 (2020) 626--630]. In the final section, tightness is discussed.
Submitted 10 June, 2022; v1 submitted 13 August, 2019;
originally announced August 2019.
-
Cover and variable degeneracy
Authors:
Fangyao Lu,
Qianqian Wang,
Tao Wang
Abstract:
Let $f$ be a nonnegative integer valued function on the vertex set of a graph. A graph is \textbf{strictly $f$-degenerate} if each nonempty subgraph $Γ$ has a vertex $v$ such that $\mathrm{deg}_Γ(v) < f(v)$. In this paper, we define a new concept, strictly $f$-degenerate transversal, which generalizes list coloring, signed coloring, DP-coloring, $L$-forested-coloring, and $(f_{1}, f_{2}, \dots, f_{s})$-partition. A \textbf{cover} of a graph $G$ is a graph $H$ with vertex set $V(H) = \bigcup_{v \in V(G)} X_{v}$, where $X_{v} = \{(v, 1), (v, 2), \dots, (v, s)\}$; the edge set $\mathscr{M} = \bigcup_{uv \in E(G)}\mathscr{M}_{uv}$, where $\mathscr{M}_{uv}$ is a matching between $X_{u}$ and $X_{v}$. A vertex set $R \subseteq V(H)$ is a \textbf{transversal} of $H$ if $|R \cap X_{v}| = 1$ for each $v \in V(G)$. A transversal $R$ is a \textbf{strictly $f$-degenerate transversal} if $H[R]$ is strictly $f$-degenerate. The main result of this paper is a degree type result, which generalizes Brooks' theorem, Gallai's theorem, degree-choosable result, signed degree-colorable result, and DP-degree-colorable result. We also give some structural results on critical graphs with respect to strictly $f$-degenerate transversal. Using these results, we can uniformly prove many new and known results. In the final section, we pose some open problems.
Submitted 27 December, 2021; v1 submitted 12 July, 2019;
originally announced July 2019.
-
$b$-invariant edges in essentially 4-edge-connected near-bipartite cubic bricks
Authors:
Fuliang Lu,
Xing Feng,
Yan Wang
Abstract:
A {\em brick} is a non-bipartite matching covered graph without non-trivial tight cuts. Bricks are building blocks of matching covered graphs. We say that an edge $e$ in a brick $G$ is {\em $b$-invariant} if $G-e$ is matching covered and a tight cut decomposition of $G-e$ contains exactly one brick. A 2-edge-connected cubic graph is {\em essentially 4-edge-connected} if it does not contain nontrivial 3-cuts. A brick $G$ is {\em near-bipartite} if it has a pair of edges $\{e_1, e_2\}$ such that $G-\{e_1,e_2\}$ is bipartite and matching covered.
Kothari, de Carvalho, Lucchesi and Little proved that each essentially 4-edge-connected cubic non-near-bipartite brick $G$, distinct from the Petersen graph, has at least $|V(G)|$ $b$-invariant edges. Moreover, they made a conjecture: every essentially 4-edge-connected cubic near-bipartite brick $G$, distinct from $K_4$, has at least $|V(G)|/2$ $b$-invariant edges. We confirm the conjecture in this paper. Furthermore, all the essentially 4-edge-connected cubic near-bipartite bricks, the numbers of $b$-invariant edges of which attain the lower bound, are presented.
Submitted 12 February, 2020; v1 submitted 17 May, 2019;
originally announced May 2019.
-
Joint state-parameter estimation of a nonlinear stochastic energy balance model from sparse noisy data
Authors:
Fei Lu,
Nils Weitzel,
Adam H. Monahan
Abstract:
While nonlinear stochastic partial differential equations arise naturally in spatiotemporal modeling, inference for such systems often faces two major challenges: sparse noisy data and ill-posedness of the inverse problem of parameter estimation. To overcome the challenges, we introduce a strongly regularized posterior by normalizing the likelihood and by imposing physical constraints through priors of the parameters and states. We investigate joint parameter-state estimation by the regularized posterior in a physically motivated nonlinear stochastic energy balance model (SEBM) for paleoclimate reconstruction. The high-dimensional posterior is sampled by a particle Gibbs sampler that combines MCMC with an optimal particle filter exploiting the structure of the SEBM. In tests using either Gaussian or uniform priors based on the physical range of parameters, the regularized posteriors overcome the ill-posedness and lead to samples within physical ranges, quantifying the uncertainty in estimation. Due to the ill-posedness and the regularization, the posterior of parameters presents a relatively large uncertainty, and consequently, the maximum of the posterior, which is the minimizer in a variational approach, can have a large variation. In contrast, the posterior of states generally concentrates near the truth, substantially filtering out observation noise and reducing uncertainty in the unconstrained SEBM.
Submitted 10 April, 2019;
originally announced April 2019.
-
Equivalence classes in matching covered graphs
Authors:
Fuliang Lu,
Nishad Kothari,
Xing Feng,
Lianzhu Zhang
Abstract:
A connected graph $G$, of order two or more, is matching covered if each edge lies in some perfect matching. The tight cut decomposition of a matching covered graph $G$ yields a list of bricks and braces; as per a theorem of Lov{á}sz~\cite{lova87}, this list is unique (up to multiple edges); $b(G)$ denotes the number of bricks, and $c_4(G)$ denotes the number of braces that are isomorphic to the cycle $C_4$ (up to multiple edges).
Two edges $e$ and $f$ are mutually dependent if, for each perfect matching $M$, $e \in M$ if and only if $f \in M$; Carvalho, Lucchesi and Murty investigated this notion in their landmark paper~\cite{clm99}. For any matching covered graph $G$, mutual dependence is an equivalence relation, and it partitions $E(G)$ into equivalence classes; this equivalence class partition is denoted by $\mathcal{E}_G$ and we refer to its parts as equivalence classes of $G$; we use $\varepsilon(G)$ to denote the cardinality of the largest equivalence class.
The operation of `splicing' may be used to construct bigger matching covered graphs from smaller ones; see~\cite{lckm18}; `tight splicing' is a stronger version of `splicing'. (These are converses of the notions of `separating cut' and `tight cut'.) In this article, we answer the following basic question: if a matching covered graph $G$ is obtained by `splicing' (or by `tight splicing') two smaller matching covered graphs, say~$G_1$~and~$G_2$, then how is $\mathcal{E}_G$ related to $\mathcal{E}_{G_1}$ and to $\mathcal{E}_{G_2}$ (and vice versa)?
As applications of our findings: firstly, we establish tight upper bounds on $\varepsilon(G)$ in terms of $b(G)$ and $c_4(G)$; secondly, we answer a recent question of He, Wei, Ye and Zhai~\cite{hwyz19}, in the affirmative, by constructing graphs that have arbitrarily high $κ(G)$~and~$\varepsilon(G)$ simultaneously, where $κ(G)$ denotes the vertex-connectivity.
Submitted 17 December, 2019; v1 submitted 25 February, 2019;
originally announced February 2019.
-
Hamiltonicity of edge-chromatic critical graphs
Authors:
Yan Cao,
Guantao Chen,
Suyun Jiang,
Huiqing Liu,
Fuliang Lu
Abstract:
Given a graph $G$, denote by $Δ$ and $χ^\prime$ the maximum degree and the chromatic index of $G$, respectively. A simple graph $G$ is called {\it edge-$Δ$-critical} if $χ^\prime(G)=Δ+1$ and $χ^\prime(H)\leΔ$ for every proper subgraph $H$ of $G$. We proved that every edge chromatic critical graph of order $n$ with maximum degree at least $\frac{2n}{3}+12$ is Hamiltonian.
Submitted 29 August, 2017;
originally announced August 2017.
-
Average degrees of edge-chromatic critical graphs
Authors:
Yan Cao,
Guantao Chen,
Suyun Jiang,
Huiqing Liu,
Fuliang Lu
Abstract:
Given a graph $G$, denote by $Δ$, $\bar{d}$ and $χ^\prime$ the maximum degree, the average degree and the chromatic index of $G$, respectively. A simple graph $G$ is called {\it edge-$Δ$-critical} if $χ^\prime(G)=Δ+1$ and $χ^\prime(H)\leΔ$ for every proper subgraph $H$ of $G$. Vizing in 1968 conjectured that if $G$ is edge-$Δ$-critical, then $\bar{d}\geq Δ-1+ \frac{3}{n}$. We show that
$$
\bar{d} \ge \begin{cases}
0.69241Δ-0.15658 & \mbox{if } Δ\geq 66, \\
0.69392Δ-0.20642 & \mbox{if } Δ=65, \mbox{ and} \\
0.68706Δ+0.19815 & \mbox{if } 56\leq Δ\leq 64.
\end{cases}
$$
This result improves the best known bound $\frac{2}{3}(Δ+2)$ obtained by Woodall in 2007 for $Δ\geq 56$. Additionally, Woodall constructed an infinite family of graphs showing that his result cannot be improved by Vizing's well-known Adjacency Lemma and other known edge-coloring techniques. To overcome this barrier, we follow the recently developed recoloring technique of Tashkinov trees to extend the Vizing fan technique to a larger class of trees.
Submitted 3 August, 2017;
originally announced August 2017.
-
On the bifurcation set of unique expansions
Authors:
Charlene Kalle,
Derong Kong,
Wenxia Li,
Fan Lü
Abstract:
Given a positive integer $M$, for $q\in(1, M+1]$ let ${\mathcal{U}}_q$ be the set of $x\in[0, M/(q-1)]$ having a unique $q$-expansion with the digit set $\{0, 1,\ldots, M\}$, and let $\mathbf{U}_q$ be the set of corresponding $q$-expansions. Recently, Komornik et al.~(Adv. Math., 2017) showed that the topological entropy function $H: q \mapsto h_{top}(\mathbf{U}_q)$ is a Devil's staircase in $(1, M+1]$. Let $\mathcal{B}$ be the bifurcation set of $H$ defined by
\[
\mathcal{B}=\{q\in(1, M+1]: H(p)\ne H(q)\quad\textrm{for any}\quad p\ne q\}.
\]
In this paper we analyze the fractal properties of $\mathcal{B}$, and show that for any $q\in \mathcal{B}$,
\[
\lim_{δ\rightarrow 0} \dim_H(\mathcal{B}\cap(q-δ, q+δ))=\dim_H\mathcal{U}_q,
\] where $\dim_H$ denotes the Hausdorff dimension. Moreover, when $q\in\mathcal{B}$ the univoque set $\mathcal{U}_q$ is dimensionally homogeneous, i.e., $\dim_H(\mathcal{U}_q\cap V)=\dim_H\mathcal{U}_q$ for any open set $V$ that intersects $\mathcal{U}_q$.
As an application we obtain a dimensional spectrum result for the set $\mathcal{U}$ containing all bases $q\in(1, M+1]$ such that $1$ admits a unique $q$-expansion. In particular, we prove that for any $t>1$ we have \[
\dim_H(\mathcal{U}\cap(1, t])=\max_{ q\le t}\dim_H\mathcal{U}_q. \] We also consider the variations of the sets $\mathcal{U}=\mathcal{U}(M)$ when $M$ changes.
Submitted 11 July, 2018; v1 submitted 23 December, 2016;
originally announced December 2016.
-
On the functional equation $f^n(z)+g^n(z)=e^{αz+β}$
Authors:
Qi Han,
Feng Lü
Abstract:
We describe meromorphic solutions to the equations $f^n(z)+\left(f'\right)^n(z)=e^{αz+β}$ and $f^n(z)+f^n(z+c)=e^{αz+β}$ ($c\neq0$) over the complex plane $\mathbf{C}$ for integers $n\geq1$.
Submitted 20 December, 2016;
originally announced December 2016.
-
Univoque bases and Hausdorff dimension
Authors:
Derong Kong,
Wenxia Li,
Fan Lü,
Martijn de Vries
Abstract:
Given a positive integer $M$ and a real number $q >1$, a \emph{$q$-expansion} of a real number $x$ is a sequence $(c_i)=c_1c_2\cdots$ with $(c_i) \in \{0,\ldots,M\}^\infty$ such that \[x=\sum_{i=1}^{\infty} c_iq^{-i}.\]
It is well known that if $q \in (1,M+1]$, then each $x \in I_q:=\left[0,M/(q-1)\right]$ has a $q$-expansion. Let $\mathcal{U}=\mathcal{U}(M)$ be the set of \emph{univoque bases} $q>1$ for which $1$ has a unique $q$-expansion.
The main object of this paper is to provide new characterizations of $\mathcal{U}$ and to show that the Hausdorff dimension of the set of numbers $x \in I_q$ with a unique $q$-expansion changes the most if $q$ "crosses" a univoque base.
Denote by $\mathcal{B}_2=\mathcal{B}_2(M)$ the set of $q \in (1,M+1]$ such that there exist numbers having precisely two distinct $q$-expansions. As a by-product of our results, we obtain an answer to a question of Sidorov (2009) and prove that \[\dim_H(\mathcal{B}_2\cap(q',q'+δ))>0\quad\textrm{for any}\quad δ>0,\] where $q'=q'(M)$ is the Komornik-Loreti constant.
Submitted 2 April, 2017; v1 submitted 12 June, 2016;
originally announced June 2016.
-
Comparison of continuous and discrete-time data-based modeling for hypoelliptic systems
Authors:
Fei Lu,
Kevin K. Lin,
Alexandre J. Chorin
Abstract:
We compare two approaches to the predictive modeling of dynamical systems from partial observations at discrete times. The first is continuous in time, where one uses data to infer a model in the form of stochastic differential equations, which are then discretized for numerical solution. The second is discrete in time, where one directly infers a discrete-time model in the form of a nonlinear autoregression moving average model. The comparison is performed in a special case where the observations are known to have been obtained from a hypoelliptic stochastic differential equation. We show that the discrete-time approach has better predictive skills, especially when the data are relatively sparse in time. We discuss open questions as well as the broader significance of the results.
Submitted 7 December, 2016; v1 submitted 8 May, 2016;
originally announced May 2016.
-
Data-based stochastic model reduction for the Kuramoto--Sivashinsky equation
Authors:
Fei Lu,
Kevin Lin,
Alexandre J. Chorin
Abstract:
The problem of constructing data-based, predictive, reduced models for the Kuramoto-Sivashinsky equation is considered, under circumstances where one has observation data only for a small subset of the dynamical variables. Accurate prediction is achieved by developing a discrete-time stochastic reduced system, based on a NARMAX (Nonlinear Autoregressive Moving Average with eXogenous input) representation. The practical issue, with the NARMAX representation as with any other, is to identify an efficient structure, i.e., one with a small number of terms and coefficients. This is accomplished here by estimating coefficients for an approximate inertial form. The broader significance of the results is discussed.
Submitted 9 August, 2016; v1 submitted 30 September, 2015;
originally announced September 2015.
-
Typical points of univoque sets
Authors:
Derong Kong,
Fan Lü
Abstract:
Given a positive integer $M$ and a real number $q>1$, we consider the univoque set $\mathcal{U}_q$ of reals which have a unique $q$-expansion over the alphabet $\{0,1,\ldots,M\}$. In this paper we show that for any $x\in\mathcal{U}_q$ and all sufficiently small $\varepsilon>0$ the Hausdorff dimension $\dim_H\mathcal{U}_q\cap(x-\varepsilon, x+\varepsilon)$ equals either $\dim_H\mathcal{U}_q$ or zero.
Moreover, we give a complete description of the typical points $x\in\mathcal{U}_q$ which satisfy \[ \dim_H\mathcal{U}_q\cap(x-\varepsilon, x+\varepsilon)=\dim_H\mathcal{U}_q\quad\textrm{for any}\quad \varepsilon>0, \] and prove that the set of typical points of $\mathcal{U}_q$ has full Hausdorff dimension. In particular, we show that if $\mathcal{U}_q$ is a Cantor set, then all points of $\mathcal{U}_q$ are typical points. This strengthens a result of de Vries and Komornik (Adv. Math., 2009).
Submitted 4 November, 2015; v1 submitted 5 July, 2015;
originally announced July 2015.
-
Meromorphic functions share three values with their difference operators
Authors:
Feng Lü,
Weiran Lü
Abstract:
In this work, we focus on a conjecture due to Z.X. Chen and H.X. Yi [1] concerning the uniqueness problem for meromorphic functions that share three distinct values with their difference operators. We prove that the conjecture holds for meromorphic functions of finite order. Meanwhile, a result of J. Zhang and L.W. Liao [10] is generalized from entire functions to meromorphic functions.
Submitted 13 April, 2015;
originally announced April 2015.
-
A discrete approach to stochastic parametrization and dimensional reduction in nonlinear dynamics
Authors:
Alexandre J. Chorin,
Fei Lu
Abstract:
Many physical systems are described by nonlinear differential equations that are too complicated to solve in full. A natural way to proceed is to divide the variables into those that are of direct interest and those that are not, formulate solvable approximate equations for the variables of greater interest, and use data and statistical methods to account for the impact of the other variables. In the present paper the problem is considered in a fully discrete-time setting, which simplifies both the analysis of the data and the numerical algorithms. The resulting time series are identified by a NARMAX (nonlinear autoregression moving average with exogenous input) representation familiar from engineering practice. The connections with the Mori-Zwanzig formalism of statistical physics are discussed, as well as an application to the Lorenz 96 system.
Submitted 30 March, 2015;
originally announced March 2015.
-
Uniform disconnectedness and Quasi-Assouad Dimension
Authors:
Fan Lü,
Li-Feng Xi
Abstract:
Uniform disconnectedness is an important invariant property under bi-Lipschitz mappings, and the Assouad dimension $\dim _{A}X<1$ implies the uniform disconnectedness of $X$. Using quasi-Lipschitz mappings, we introduce the quasi-Assouad dimension $\dim _{qA}$ such that $\dim _{qA}X<1$ implies the quasi uniform disconnectedness of $X$. We obtain $\overline{\dim } _{B}X\leq \dim _{qA}X\leq \dim _{A}X$ and compute the quasi-Assouad dimension of Moran sets.
Submitted 6 September, 2014;
originally announced September 2014.
-
A note on a famous theorem of Pang and Zalcman
Authors:
Feng Lü,
Junfeng Xu,
Hongxun Yi
Abstract:
In this paper, by studying the famous theorem of Pang and Zalcman, we find a normal family and obtain a result, which is an improvement of Pang and Zalcman's theorem in some sense. Meanwhile, several examples are provided to show that our result's conditions are necessary.
Submitted 27 August, 2014;
originally announced August 2014.