-
High-order Adaptive Rank Integrators for Multi-scale Linear Kinetic Transport Equations in the Hierarchical Tucker Format
Authors:
William A. Sands,
Wei Guo,
Jing-Mei Qiu,
Tao Xiong
Abstract:
In this paper, we present a new adaptive rank approximation technique for computing solutions to the high-dimensional linear kinetic transport equation. The approach we propose is based on a macro-micro decomposition of the kinetic model in which the angular domain is discretized with a tensor product quadrature rule under the discrete ordinates method. To address the challenges associated with the curse of dimensionality, the proposed low-rank method is cast in the framework of the hierarchical Tucker decomposition. The adaptive rank integrators we propose are built upon high-order discretizations for both time and space. In particular, this work considers implicit-explicit discretizations for time and finite-difference weighted-essentially non-oscillatory discretizations for space. The high-order singular value decomposition is used to perform low-rank truncation of the high-dimensional time-dependent distribution function. The methods are applied to several benchmark problems, where we compare the solution quality and measure compression achieved by the adaptive rank methods against their corresponding full-grid methods. We also demonstrate the benefits of high-order discretizations in the proposed low-rank framework.
Submitted 27 June, 2024;
originally announced June 2024.
-
On Naive Mean-Field Approximation for high-dimensional canonical GLMs
Authors:
Sumit Mukherjee,
Jiaze Qiu,
Subhabrata Sen
Abstract:
We study the validity of the Naive Mean Field (NMF) approximation for canonical GLMs with product priors. This setting is challenging due to the non-conjugacy of the likelihood and the prior. Using the theory of non-linear large deviations (Austin 2019, Chatterjee, Dembo 2016, Eldan 2018), we derive sufficient conditions for the tightness of the NMF approximation to the log-normalizing constant of the posterior distribution. As a second contribution, we establish that under minor conditions on the design, any NMF optimizer is a product distribution where each component is a quadratic tilt of the prior. In turn, this suggests novel iterative algorithms for fitting the NMF optimizer to the target posterior. Finally, we establish that if the NMF optimization problem has a "well-separated maximizer", then this optimizer governs the probabilistic properties of the posterior. Specifically, we derive credible intervals with average coverage guarantees, and characterize the prediction performance on an out-of-sample datapoint in terms of this dominant optimizer.
Submitted 21 June, 2024;
originally announced June 2024.
-
A Generalized Version of Chung's Lemma and its Applications
Authors:
Li Jiang,
Xiao Li,
Andre Milzarek,
Junwen Qiu
Abstract:
Chung's lemma is a classical tool for establishing asymptotic convergence rates of (stochastic) optimization methods under strong convexity-type assumptions and appropriate polynomial diminishing step sizes. In this work, we develop a generalized version of Chung's lemma, which provides a simple non-asymptotic convergence framework for a more general family of step size rules. We demonstrate broad applicability of the proposed generalized Chung's lemma by deriving tight non-asymptotic convergence rates for a large variety of stochastic methods. In particular, we obtain partially new non-asymptotic complexity results for stochastic optimization methods, such as stochastic gradient descent and random reshuffling, under a general $(θ,μ)$-Polyak-Lojasiewicz (PL) condition and for various step sizes strategies, including polynomial, constant, exponential, and cosine step sizes rules. Notably, as a by-product of our analysis, we observe that exponential step sizes can adapt to the objective function's geometry, achieving the optimal convergence rate without requiring exact knowledge of the underlying landscape. Our results demonstrate that the developed variant of Chung's lemma offers a versatile, systematic, and streamlined approach to establish non-asymptotic convergence rates under general step size rules.
Submitted 9 June, 2024;
originally announced June 2024.
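The four step size families named in the abstract above (polynomial, constant, exponential, cosine) can be sketched as simple schedules. This is only an illustration under stated assumptions: the function names, parameter names, and the exact formulas are common textbook forms chosen here, not the definitions used in the paper.

```python
import math

# Hypothetical helper names and parameterizations, for illustration only.

def polynomial_step(k, alpha=1.0, gamma=1.0, p=1.0):
    """Polynomially diminishing step: alpha / (k + gamma)^p."""
    return alpha / (k + gamma) ** p

def constant_step(k, alpha=0.1):
    """Constant step: the same value at every iteration k."""
    return alpha

def exponential_step(k, alpha=1.0, decay=0.99):
    """Exponentially decaying step: alpha * decay^k, with 0 < decay < 1."""
    return alpha * decay ** k

def cosine_step(k, total_iters, alpha=1.0):
    """Cosine step: decreases from alpha at k = 0 to 0 at k = total_iters."""
    return 0.5 * alpha * (1.0 + math.cos(math.pi * k / total_iters))
```

For example, `cosine_step(0, 100)` starts at the full step `alpha`, while `polynomial_step(k)` with `p=1.0` gives the classical `1/k`-type decay.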
-
A KL-based Analysis Framework with Applications to Non-Descent Optimization Methods
Authors:
Junwen Qiu,
Bohao Ma,
Xiao Li,
Andre Milzarek
Abstract:
We propose a novel analysis framework for non-descent-type optimization methodologies in nonconvex scenarios based on the Kurdyka-Lojasiewicz property. Our framework allows covering a broad class of algorithms, including those commonly employed in stochastic and distributed optimization. Specifically, it enables the analysis of first-order methods that lack a sufficient descent property and do not require access to full (deterministic) gradient information. We leverage this framework to establish, for the first time, iterate convergence and the corresponding rates for the decentralized gradient method and federated averaging under mild assumptions. Furthermore, based on the new analysis techniques, we show the convergence of the random reshuffling and stochastic gradient descent method without necessitating typical a priori bounded iterates assumptions.
Submitted 4 June, 2024;
originally announced June 2024.
-
Non-splitting Eulerian-Lagrangian WENO schemes for two-dimensional nonlinear convection-diffusion equations
Authors:
Nanyi Zheng,
Xiaofeng Cai,
Jing-Mei Qiu,
Jianxian Qiu
Abstract:
In this paper, we develop high-order, conservative, non-splitting Eulerian-Lagrangian (EL) Runge-Kutta (RK) finite volume (FV) weighted essentially non-oscillatory (WENO) schemes for convection-diffusion equations. The proposed EL-RK-FV-WENO scheme defines modified characteristic lines and evolves the solution along them, significantly relaxing the time-step constraint for the convection term. The main algorithm design challenge arises from the complexity of constructing accurate and robust reconstructions on dynamically varying Lagrangian meshes. This reconstruction process is needed for flux evaluations on time-dependent upstream quadrilaterals and time integrations along moving characteristics. To address this, we propose a strategy that utilizes a WENO reconstruction on a fixed Eulerian mesh for spatial reconstruction, and updates intermediate solutions on the Eulerian background mesh for implicit-explicit RK temporal integration. This strategy leverages efficient reconstruction and remapping algorithms to manage the complexities of polynomial reconstructions on time-dependent quadrilaterals, while ensuring local mass conservation. The proposed scheme ensures mass conservation due to the flux-form semi-discretization and the mass-conservative reconstruction on both background and upstream cells. Extensive numerical tests have been performed to verify the effectiveness of the proposed scheme.
Submitted 3 June, 2024;
originally announced June 2024.
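As background for the WENO reconstruction on a fixed mesh mentioned in the abstract above, here is a minimal sketch of the classical one-dimensional fifth-order WENO reconstruction (Jiang-Shu weights) from five cell averages. It is an assumption-laden illustration of the general technique, not the two-dimensional EL-RK-FV-WENO scheme of the paper; the function name and the `eps` default are choices made here.

```python
def weno5_reconstruct(v, eps=1e-6):
    """Classical WENO5 value at the right interface x_{i+1/2}.

    v holds five cell averages (v[i-2], v[i-1], v[i], v[i+1], v[i+2]).
    Illustrative 1D sketch only; eps avoids division by zero.
    """
    vm2, vm1, v0, vp1, vp2 = v

    # Third-order candidate reconstructions on the three sub-stencils.
    p0 = (2.0 * vm2 - 7.0 * vm1 + 11.0 * v0) / 6.0
    p1 = (-vm1 + 5.0 * v0 + 2.0 * vp1) / 6.0
    p2 = (2.0 * v0 + 5.0 * vp1 - vp2) / 6.0

    # Smoothness indicators: large where a sub-stencil crosses a jump.
    b0 = 13.0 / 12.0 * (vm2 - 2.0 * vm1 + v0) ** 2 + 0.25 * (vm2 - 4.0 * vm1 + 3.0 * v0) ** 2
    b1 = 13.0 / 12.0 * (vm1 - 2.0 * v0 + vp1) ** 2 + 0.25 * (vm1 - vp1) ** 2
    b2 = 13.0 / 12.0 * (v0 - 2.0 * vp1 + vp2) ** 2 + 0.25 * (3.0 * v0 - 4.0 * vp1 + vp2) ** 2

    # Nonlinear weights built from the linear weights (0.1, 0.6, 0.3).
    a0 = 0.1 / (eps + b0) ** 2
    a1 = 0.6 / (eps + b1) ** 2
    a2 = 0.3 / (eps + b2) ** 2
    total = a0 + a1 + a2

    return (a0 * p0 + a1 * p1 + a2 * p2) / total
```

On smooth data the nonlinear weights stay close to the linear ones; near a jump, the weight shifts toward the sub-stencil that does not cross the discontinuity, which is what suppresses spurious oscillations.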
-
On a problem of Pavlović involving harmonic quasiconformal mappings
Authors:
Zhi-Gang Wang,
Xiao-Yuan Wang,
Antti Rasila,
Jia-Le Qiu
Abstract:
We obtain a sharp result on order of certain affine and linear invariant families of harmonic quasiconformal mappings with bounded Schwarzian norm. This problem is motivated by the work of Chuaqui, Hernández and Martín [Math. Ann. 367: 1099--1122, 2017]. Firstly, for $K\ge1$, we construct a harmonic $K$-quasiconformal counterpart of the classical Koebe function and use it to formulate the corresponding conjectures. Then we consider Hardy spaces $H^p$ of harmonic quasiconformal mappings by applying results for quasiconformal mappings obtained by Astala and Koskela [Pure Appl. Math. Q. 7: 19--50, 2011]. In particular, we determine the optimal order of the family of harmonic quasiconformal mappings with bounded Schwarzian norm to belong to a harmonic Hardy space. This partially solves an open problem posed by Pavlović in 2014. Finally, we derive pre-Schwarzian and Schwarzian norm estimates of certain harmonic mappings.
Submitted 30 May, 2024;
originally announced May 2024.
-
A Mathematical Theory of Integer Quantum Hall Effect in Photonics
Authors:
Jiayu Qiu,
Hai Zhang
Abstract:
This paper investigates interface modes in a square lattice of photonic crystal composed of gyromagnetic particles with $C_{4v}$ point group symmetry. The study shows that Dirac or linear degenerate points cannot occur at the three high symmetry points in the Brillouin zone where two Bloch bands touch. Instead, a touch point at the M-point has a quadratic degeneracy in the generic case. It is further proved that when a magnetic field is applied to the two sides of an interface in opposite directions, two interface modes that are supported along that interface can be bifurcated from the quadratic degenerate point. The results provide a mathematical foundation for the first experiment realization of the integer quantum Hall effect in the context of photonics.
Submitted 27 May, 2024; v1 submitted 27 May, 2024;
originally announced May 2024.
-
Convergence of SGD with momentum in the nonconvex case: A time window-based analysis
Authors:
Junwen Qiu,
Bohao Ma,
Andre Milzarek
Abstract:
We propose a novel time window-based analysis technique to investigate the convergence properties of the stochastic gradient descent method with momentum (SGDM) in nonconvex settings. Despite its popularity, the convergence behavior of SGDM remains less understood in nonconvex scenarios. This is primarily due to the absence of a sufficient descent property and challenges in simultaneously controlling the momentum and stochastic errors in an almost sure sense. To address these challenges, we investigate the behavior of SGDM over specific time windows, rather than examining the descent of consecutive iterates as in traditional studies. This time window-based approach simplifies the convergence analysis and enables us to establish the first iterate convergence result for SGDM under the Kurdyka-Lojasiewicz (KL) property. We further provide local convergence rates which depend on the underlying KL exponent and the utilized step size schemes.
Submitted 23 June, 2024; v1 submitted 27 May, 2024;
originally announced May 2024.
-
A high-order Eulerian-Lagrangian Runge-Kutta finite volume (EL-RK-FV) method for scalar nonlinear conservation laws
Authors:
Jiajie Chen,
Joseph Nakao,
Jing-Mei Qiu,
Yang Yang
Abstract:
We present a class of high-order Eulerian-Lagrangian Runge-Kutta finite volume methods that can numerically solve Burgers' equation with shock formations, which could be extended to general scalar conservation laws. Eulerian-Lagrangian (EL) and semi-Lagrangian (SL) methods have recently seen increased development and have become a staple for allowing large time-stepping sizes. Yet, maintaining relatively large time-stepping sizes post shock formation remains quite challenging. Our proposed scheme integrates the partial differential equation on a space-time region partitioned by linear approximations to the characteristics determined by the Rankine-Hugoniot jump condition. We trace the characteristics forward in time and present a merging procedure for the mesh cells to handle intersecting characteristics due to shocks. Following this partitioning, we write the equation in a time-differential form and evolve with Runge-Kutta methods in a method-of-lines fashion. High-resolution methods such as ENO and WENO-AO schemes are used for spatial reconstruction. Extension to higher dimensions is done via dimensional splitting. Numerical experiments demonstrate our scheme's high-order accuracy and ability to sharply capture post-shock solutions with large time-stepping sizes.
Submitted 29 May, 2024; v1 submitted 16 May, 2024;
originally announced May 2024.
-
Interface Modes in Honeycomb Topological Photonic Structures with Broken Reflection Symmetry
Authors:
Wei Li,
Junshan Lin,
Jiayu Qiu,
Hai Zhang
Abstract:
In this work, we present a mathematical theory for Dirac points and interface modes in honeycomb topological photonic structures consisting of impenetrable obstacles. Starting from a honeycomb lattice of obstacles attaining $120^\circ$-rotation symmetry and horizontal reflection symmetry, we apply the boundary integral equation method to show the existence of Dirac points for the first two bands at the vertices of the Brillouin zone. We then study interface modes in a joint honeycomb photonic structure, which consists of two periodic lattices obtained by perturbing the honeycomb one with Dirac points differently. The perturbations break the reflection symmetry of the system, as a result, they annihilate the Dirac points and generate two structures with different topological phases, which mimics the quantum valley Hall effect in topological insulators. We investigate the interface modes that decay exponentially away from the interface of the joint structure in several configurations with different interface geometries, including the zigzag interface, the armchair interface, and the rational interfaces. Using the layer potential technique and asymptotic analysis, we first characterize the band-gap opening for the two perturbed periodic structures and derive the asymptotic expansions of the Bloch modes near the band gap surfaces. By formulating the eigenvalue problem for each joint honeycomb structure using boundary integral equations over the interface and analyzing the characteristic values of the associated boundary integral operators, we prove the existence of interface modes when the perturbation is small.
Submitted 6 May, 2024; v1 submitted 6 May, 2024;
originally announced May 2024.
-
Random Reshuffling with Momentum for Nonconvex Problems: Iteration Complexity and Last Iterate Convergence
Authors:
Junwen Qiu,
Andre Milzarek
Abstract:
Random reshuffling with momentum (RRM) corresponds to the SGD optimizer with momentum option enabled, as found in popular machine learning libraries like PyTorch and TensorFlow. Despite its widespread use in practical applications, the understanding of its convergence properties in nonconvex scenarios remains limited. Under a Lipschitz smoothness assumption, this paper provides one of the first iteration complexities for RRM. Specifically, we prove that RRM achieves the iteration complexity $O(n^{-1/3}((1-β^n)T)^{-2/3})$ where $n$ denotes the number of component functions $f(\cdot;i)$ and $β\in [0,1)$ is the momentum parameter. Furthermore, every accumulation point of a sequence of iterates $\{x^k\}_k$ generated by RRM is shown to be a stationary point of the problem. In addition, under the Kurdyka-Lojasiewicz inequality - a local geometric property - the iterates $\{x^k\}_k$ provably converge to a unique stationary point $x^*$ of the objective function. Importantly, in our analysis, this last iterate convergence is obtained without requiring convexity nor a priori boundedness of the iterates. Finally, for polynomial step size schemes, convergence rates of the form $\|x^k - x^*\| = O(k^{-p})$, $\|\nabla f(x^k)\|^2 = O(k^{-q})$, and $|f(x^k) - f(x^*)| = O(k^{-q})$, $p \in (0,1]$, $q \in (0,2]$ are derived.
Submitted 29 April, 2024;
originally announced April 2024.
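The abstract above describes random reshuffling with momentum (RRM) as SGD with momentum where each epoch visits the component functions in a freshly shuffled order. A minimal sketch, assuming a toy scalar problem with components $f(x;i) = \frac{1}{2}(x - a_i)^2$ (whose minimizer is the mean of the $a_i$), a heavy-ball style momentum buffer, and a step size that decays per epoch; the function name `rrm` and all parameter values are hypothetical choices for illustration, not taken from the paper.

```python
import random

def rrm(data, epochs=200, alpha0=0.05, beta=0.9, seed=0):
    """Random reshuffling with momentum on f(x; i) = 0.5 * (x - data[i])^2.

    Illustrative sketch: alpha0, beta, and the 1/(t+1) epoch-wise decay
    are assumptions, not the step size rules analyzed in the paper.
    """
    rng = random.Random(seed)
    x, m = 0.0, 0.0          # iterate and momentum buffer
    n = len(data)
    for t in range(epochs):
        alpha = alpha0 / (t + 1)      # step size held fixed within an epoch
        perm = list(range(n))
        rng.shuffle(perm)             # new permutation each epoch
        for i in perm:                # one pass, each component used once
            grad = x - data[i]        # gradient of the i-th component
            m = beta * m + grad       # momentum accumulation
            x -= alpha * m            # parameter update
    return x
```

Calling `rrm([1.0, 2.0, 3.0, 4.0])` should return a value close to the mean 2.5 of the data. The difference from plain SGD with momentum is only in the sampling: indices are drawn without replacement within each epoch.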
-
Provably Convergent and Robust Newton-Raphson Method: A New Dawn in Primitive Variable Recovery for Relativistic MHD
Authors:
Chaoyi Cai,
Jianxian Qiu,
Kailiang Wu
Abstract:
A long-standing and formidable challenge faced by all conservative schemes for relativistic magnetohydrodynamics (RMHD) is the recovery of primitive variables from conservative ones. This process involves solving highly nonlinear equations subject to physical constraints. An ideal solver should be "robust, accurate, and fast -- it is at the heart of all conservative RMHD schemes," as emphasized in [S.C. Noble et al., ApJ, 641:626-637, 2006]. Despite over three decades of research, seeking efficient solvers that can provably guarantee stability and convergence remains an open problem.
This paper presents the first theoretical analysis for designing a robust, physical-constraint-preserving (PCP), and provably (quadratically) convergent Newton-Raphson (NR) method for primitive variable recovery in RMHD. Our key innovation is a unified approach for the initial guess, devised based on sophisticated analysis. It ensures that the NR iteration consistently converges and adheres to physical constraints. Given the extreme nonlinearity and complexity of the iterative function, the theoretical analysis is highly nontrivial and technical. We discover a pivotal inequality for delineating the convexity and concavity of the iterative function and establish theories to guarantee the PCP property and convergence. We also develop theories to determine a computable initial guess within a theoretical "safe" interval. Intriguingly, we find that the unique positive root of a cubic polynomial always falls within this interval. Our PCP NR method is versatile and can be seamlessly integrated into any RMHD scheme that requires the recovery of primitive variables, potentially leading to a broad impact in this field. As an application, we incorporate it into a discontinuous Galerkin method, resulting in fully PCP schemes. Several numerical experiments demonstrate the efficiency and robustness of the PCP NR method.
Submitted 8 April, 2024;
originally announced April 2024.
-
Krylov-based Adaptive-Rank Implicit Time Integrators for Stiff Problems with Application to Nonlinear Fokker-Planck Kinetic Models
Authors:
Hamad El Kahza,
William Taitano,
Jing-Mei Qiu,
Luis Chacón
Abstract:
We propose high-order adaptive-rank implicit integrators for stiff time-dependent PDEs, leveraging extended Krylov subspaces to efficiently and adaptively populate low-rank solution bases. This allows for the accurate representation of solutions with significantly reduced computational costs. We further introduce an efficient mechanism for residual evaluation and an adaptive rank-seeking strategy that optimizes low-rank settings based on a comparison between the residual size and the local truncation errors of the time-stepping discretization. We demonstrate our approach with the challenging Lenard-Bernstein Fokker-Planck (LBFP) nonlinear equation, which describes collisional processes in a fully ionized plasma. The preservation of the equilibrium state is achieved through the Chang-Cooper discretization, and strict conservation of mass, momentum and energy via a Locally Macroscopic Conservative (LoMaC) procedure. The development of implicit adaptive-rank integrators, demonstrated here up to third-order temporal accuracy via diagonally implicit Runge-Kutta schemes, showcases superior performance in terms of accuracy, computational efficiency, equilibrium preservation, and conservation of macroscopic moments. This study offers a starting point for developing scalable, efficient, and accurate methods for high-dimensional time-dependent problems.
Submitted 3 April, 2024;
originally announced April 2024.
-
A moment-based Hermite WENO scheme with unified stencils for hyperbolic conservation laws
Authors:
Chuan Fan,
Jianxian Qiu,
Zhuang Zhao
Abstract:
In this paper, a fifth-order moment-based Hermite weighted essentially non-oscillatory scheme with unified stencils (termed HWENO-U) is proposed for hyperbolic conservation laws. The main idea of the HWENO-U scheme is to modify the first-order moment by a HWENO limiter only in the time discretizations using the same information of spatial reconstructions, in which the limiter not only overcomes spurious oscillations well, but also ensures the stability of the fully-discrete scheme. For the HWENO reconstructions, a new scale-invariant nonlinear weight is designed by incorporating only the integral average values of the solution, which keeps all properties of the original one while being more robust for simulating challenging problems with sharp scale variations. Compared with previous HWENO schemes, the advantages of the HWENO-U scheme are: (1) a simpler implementation process involving only a single HWENO reconstruction applied throughout the entire procedure without any modifications to the governing equations; (2) increased efficiency by utilizing the same candidate stencils, reconstructed polynomials, and linear and nonlinear weights in both the HWENO limiter and spatial reconstructions; (3) reduced problem-specific dependencies and improved rationality, as the nonlinear weights are identical for the function $u$ and its non-zero multiple $ζu$. Besides, the proposed scheme retains the advantages of previous HWENO schemes, including compact reconstructed stencils and the utilization of artificial linear weights. Extensive benchmarks are carried out to validate the accuracy, efficiency, resolution, and robustness of the proposed scheme.
Submitted 19 February, 2024; v1 submitted 5 February, 2024;
originally announced February 2024.
-
Plücker Coordinates and the Rosenfeld Planes
Authors:
Jian Qiu
Abstract:
The exceptional compact hermitian symmetric space EIII is the quotient $E_6/Spin(10)\times_{\mathbb{Z}_4}U(1)$. We introduce the Plücker coordinates which give an embedding of EIII into $\mathbb{C}P^{26}$ as a projective subvariety. The subvariety is cut out by 27 Plücker relations. We show that, using Clifford algebra, one can solve this over-determined system of relations, giving local coordinate charts to the space.
Our motivation is to understand EIII as the complex projective octonion plane $(\mathbb{C}\otimes\mathbb{O})P^2$, whose construction is somewhat scattered across the literature. We will see that the EIII has an atlas whose transition functions have clear octonion interpretations, apart from those covering a sub-variety $X_{\infty}$ of dimension 10. This subvariety is itself a hermitian symmetric space known as DIII, with no apparent octonion interpretation. We give detailed analysis of the geometry in the neighbourhood of $X_{\infty}$.
We further decompose $X={\rm EIII}$ into $F_4$-orbits: $X=Y_0\cup Y_{\infty}$, where $Y_0\sim(\mathbb{O}P^2)_{\mathbb{C}}$ is an open $F_4$-orbit and is the complexification of $\mathbb{O}P^2$, whereas $Y_{\infty}$ has co-dimension 1, thus EIII could be more appropriately denoted as $\overline{(\mathbb{O}P^2)_{\mathbb{C}}}$. This decomposition appears in the classification of equivariant completion of homogeneous algebraic varieties by Ahiezer \cite{Ahiezer}.
Submitted 15 January, 2024;
originally announced January 2024.
-
On eigenvalues of sample covariance matrices based on high dimensional compositional data
Authors:
Qianqian Jiang,
Jiaxin Qiu,
Zeng Li
Abstract:
This paper studies the asymptotic spectral properties of the sample covariance matrix for high dimensional compositional data, including the limiting spectral distribution, the limit of extreme eigenvalues, and the central limit theorem for linear spectral statistics. All asymptotic results are derived under the high-dimensional regime where the data dimension increases to infinity proportionally with the sample size. The findings reveal that the limiting spectral distribution is the well-known Marchenko-Pastur law. The largest (or smallest non-zero) eigenvalue converges almost surely to the left (or right) endpoint of the limiting spectral distribution, respectively. Moreover, the linear spectral statistics demonstrate a Gaussian limit. Simulation experiments demonstrate the accuracy of theoretical results.
Submitted 21 December, 2023;
originally announced December 2023.
-
Viscosity Solutions of a class of Second Order Hamilton-Jacobi-Bellman Equations in the Wasserstein Space
Authors:
Hang Cheung,
Ho Man Tai,
Jinniao Qiu
Abstract:
This paper is devoted to solving a class of second order Hamilton-Jacobi-Bellman (HJB) equations in the Wasserstein space, associated with mean field control problems involving common noise. We provide the well-posedness of viscosity solutions to the HJB equation in the sense of Crandall-Lions' definition, under general assumptions on the coefficients. Our approach adopts the smooth metric developed by Bayraktar, Ekren, and Zhang [Proc. Amer. Math. Soc. (2023)] as our gauge function for the purpose of smooth variational principle used in the proof of comparison theorem. Subsequently, we derive further estimates and regularity of the metric, including a novel second order derivative estimate with respect to the measure variable, in order to ensure the uniqueness and existence.
Submitted 21 February, 2024; v1 submitted 16 December, 2023;
originally announced December 2023.
-
A New Random Reshuffling Method for Nonsmooth Nonconvex Finite-sum Optimization
Authors:
Junwen Qiu,
Xiao Li,
Andre Milzarek
Abstract:
Random reshuffling techniques are prevalent in large-scale applications, such as training neural networks. While the convergence and acceleration effects of random reshuffling-type methods are fairly well understood in the smooth setting, far fewer studies seem to be available in the nonsmooth case. In this work, we design a new normal map-based proximal random reshuffling (norm-PRR) method for nonsmooth nonconvex finite-sum problems. We show that norm-PRR achieves the iteration complexity $O(n^{-1/3}T^{-2/3})$ where $n$ denotes the number of component functions $f(\cdot,i)$ and $T$ counts the total number of iterations. This improves the currently known complexity bounds for this class of problems by a factor of $n^{-1/3}$. In addition, we prove that norm-PRR converges linearly under the (global) Polyak-Lojasiewicz condition and in the interpolation setting. We further complement these non-asymptotic results and provide an in-depth analysis of the asymptotic properties of norm-PRR. Specifically, under the (local) Kurdyka-Lojasiewicz inequality, the whole sequence of iterates generated by norm-PRR is shown to converge to a single stationary point. Moreover, we derive last iterate convergence rates that can match those in the smooth, strongly convex setting. Finally, numerical experiments are performed on nonconvex classification tasks to illustrate the efficiency of the proposed approach.
Submitted 30 April, 2024; v1 submitted 2 December, 2023;
originally announced December 2023.
-
Reduced Augmentation Implicit Low-rank (RAIL) integrators for advection-diffusion and Fokker-Planck models
Authors:
Joseph Nakao,
Jing-Mei Qiu,
Lukas Einkemmer
Abstract:
This paper introduces a novel computational approach termed the Reduced Augmentation Implicit Low-rank (RAIL) method by investigating two predominant research directions in low-rank solutions to time-dependent partial differential equations (PDEs): dynamical low-rank (DLR), and step and truncation (SAT) tensor methods. The RAIL method, along with the development of the SAT approach, is designed to enhance the efficiency of traditional full-rank implicit solvers from method-of-lines discretizations of time-dependent PDEs, while maintaining accuracy and stability. We consider spectral methods for spatial discretization, and diagonally implicit Runge-Kutta (DIRK) and implicit-explicit (IMEX) RK methods for time discretization. The efficiency gain is achieved by investigating low-rank structures within solutions at each RK stage using a singular value decomposition (SVD). In particular, we develop a reduced augmentation procedure to predict the basis functions to construct projection subspaces. This procedure balances algorithm accuracy and efficiency by incorporating as many bases as possible from previous RK stages and predictions, and by optimizing the basis representation through SVD truncation. As such, one can form implicit schemes for updating basis functions in a dimension-by-dimension manner, similar in spirit to the K-L step in the DLR framework. We also apply a globally mass conservative post-processing step at the end of each RK stage. We validate the RAIL method through numerical simulations of advection-diffusion problems and a Fokker-Planck model, showcasing its ability to efficiently handle time-dependent PDEs while maintaining global mass conservation. Our approach generalizes and bridges the DLR and SAT approaches, offering a comprehensive framework for efficiently and accurately solving time-dependent PDEs with implicit treatment.
Submitted 25 November, 2023;
originally announced November 2023.
-
Saturated theorem along cubes for a measure and applications
Authors:
Jiahao Qiu,
Jiaqi Yu
Abstract:
We show that for a minimal system $(X,T)$, the set of saturated points along cubes with respect to its maximal $\infty$-step pro-nilfactor $X_\infty$ has a full measure. As an application, it is shown that if a minimal system $(X,T)$ has no non-trivial $(k+1)$-tuples with arbitrarily long finite IP-independence sets, then it has only at most $k$ ergodic measures and is an almost $k'$ to one extension of $X_\infty$ for some $k'\leqslant k$. Particularly, for $k=1$ we prove that $(X,T)$ is uniquely ergodic (even regular with respect to $X_\infty$), which answers a conjecture stated in [3].
Submitted 23 November, 2023;
originally announced November 2023.
-
A consensus-based algorithm for non-convex multiplayer games
Authors:
Enis Chenchene,
Hui Huang,
Jinniao Qiu
Abstract:
In this paper, we present a novel consensus-based zeroth-order algorithm tailored for non-convex multiplayer games. The proposed method leverages a metaheuristic approach using concepts from swarm intelligence to reliably identify global Nash equilibria. We utilize a group of interacting particles, each agreeing on a specific consensus point, asymptotically converging to the corresponding optimal strategy. This paradigm permits a passage to the mean-field limit, allowing us to establish convergence guarantees under appropriate assumptions regarding initialization and objective functions. Finally, we conduct a series of numerical experiments to unveil the dependency of the proposed method on its parameters and apply it to solve a nonlinear Cournot oligopoly game involving multiple goods.
Submitted 14 November, 2023;
originally announced November 2023.
-
A Viscosity Solution Theory of Stochastic Hamilton-Jacobi-Bellman equations in the Wasserstein Space
Authors:
Hang Cheung,
Jinniao Qiu,
Alexandru Badescu
Abstract:
This paper is devoted to a viscosity solution theory of the stochastic Hamilton-Jacobi-Bellman equation in the Wasserstein spaces for the mean-field type control problem which allows for random coefficients and may thus be non-Markovian. The value function of the control problem is proven to be the unique viscosity solution. The major challenge lies in the combination of the lack of local compactness of the Wasserstein spaces and the non-Markovian setting with random coefficients. Various techniques are used, including Ito processes parameterized by random measures, the conditional law invariance of the value function, a novel tailor-made compact subset of measure-valued processes, and finite dimensional approximations via stochastic n-player differential games with common noises.
Submitted 22 October, 2023;
originally announced October 2023.
-
Sub-optimality of the Naive Mean Field approximation for proportional high-dimensional Linear Regression
Authors:
Jiaze Qiu
Abstract:
The Naïve Mean Field (NMF) approximation is widely employed in modern Machine Learning due to the huge computational gains it bestows on the statistician. Despite its popularity in practice, theoretical guarantees for high-dimensional problems are only available under strong structural assumptions (e.g., sparsity). Moreover, existing theory often does not explain empirical observations noted in the existing literature.
In this paper, we take a step towards addressing these problems by deriving sharp asymptotic characterizations for the NMF approximation in high-dimensional linear regression. Our results apply to a wide class of natural priors and allow for model mismatch (i.e., the underlying statistical model can be different from the fitted model). We work under an \textit{iid} Gaussian design and the proportional asymptotic regime, where the number of features and the number of observations grow at a proportional rate. As a consequence of our asymptotic characterization, we establish two concrete corollaries: (a) we establish the inaccuracy of the NMF approximation for the log-normalizing constant in this regime, and (b) we provide theoretical results backing the empirical observation that the NMF approximation can be overconfident in terms of uncertainty quantification.
Our results utilize recent advances in the theory of Gaussian comparison inequalities. To the best of our knowledge, this is the first application of these ideas to the analysis of Bayesian variational inference problems. Our theoretical results are corroborated by numerical experiments. Lastly, we believe our results can be generalized to non-Gaussian designs and provide empirical evidence to support it.
Submitted 15 October, 2023;
originally announced October 2023.
-
Optimal control of infinite-dimensional differential systems with randomness and path-dependence and stochastic path-dependent Hamilton-Jacobi equations
Authors:
Jinniao Qiu,
Yang Yang
Abstract:
This paper is devoted to the stochastic optimal control problem of infinite-dimensional differential systems allowing for both path-dependence and measurable randomness. As opposed to the deterministic path-dependent cases studied by Bayraktar and Keller [J. Funct. Anal. 275 (2018), 2096--2161], the value function turns out to be a random field on the path space and it is characterized by a stochastic path-dependent Hamilton-Jacobi (SPHJ) equation. A notion of viscosity solution is proposed and the value function is proved to be the unique viscosity solution to the associated SPHJ equation.
Submitted 17 July, 2023;
originally announced July 2023.
-
Provably convergent Newton-Raphson methods for recovering primitive variables with applications to physical-constraint-preserving Hermite WENO schemes for relativistic hydrodynamics
Authors:
Chaoyi Cai,
Jianxian Qiu,
Kailiang Wu
Abstract:
The relativistic hydrodynamics (RHD) equations have three crucial intrinsic physical constraints on the primitive variables: positivity of pressure and density, and subluminal fluid velocity. However, numerical simulations can violate these constraints, leading to nonphysical results or even simulation failure. Designing genuinely physical-constraint-preserving (PCP) schemes is very difficult, as the primitive variables cannot be explicitly reformulated using conservative variables due to relativistic effects. In this paper, we propose three efficient Newton--Raphson (NR) methods for robustly recovering primitive variables from conservative variables. Importantly, we rigorously prove that these NR methods are always convergent and PCP, meaning they preserve the physical constraints throughout the NR iterations. The discovery of these robust NR methods and their PCP convergence analyses are highly nontrivial and technical. As an application, we apply the proposed NR methods to design PCP finite volume Hermite weighted essentially non-oscillatory (HWENO) schemes for solving the RHD equations. Our PCP HWENO schemes incorporate high-order HWENO reconstruction, a PCP limiter, and strong-stability-preserving time discretization. We rigorously prove the PCP property of the fully discrete schemes using convex decomposition techniques. Moreover, we suggest the characteristic decomposition with rescaled eigenvectors and scale-invariant nonlinear weights to enhance the performance of the HWENO schemes in simulating large-scale RHD problems. Several demanding numerical tests are conducted to demonstrate the robustness, accuracy, and high resolution of the proposed PCP HWENO schemes and to validate the efficiency of our NR methods.
Submitted 24 May, 2023;
originally announced May 2023.
-
Convergence of a Normal Map-based Prox-SGD Method under the KL Inequality
Authors:
Andre Milzarek,
Junwen Qiu
Abstract:
In this paper, we present a novel stochastic normal map-based algorithm ($\mathsf{norM}\text{-}\mathsf{SGD}$) for nonconvex composite-type optimization problems and discuss its convergence properties. Using a time window-based strategy, we first analyze the global convergence behavior of $\mathsf{norM}\text{-}\mathsf{SGD}$ and it is shown that every accumulation point of the generated sequence of iterates $\{\boldsymbol{x}^k\}_k$ corresponds to a stationary point almost surely and in an expectation sense. The obtained results hold under standard assumptions and extend the more limited convergence guarantees of the basic proximal stochastic gradient method. In addition, based on the well-known Kurdyka-Łojasiewicz (KL) analysis framework, we provide novel point-wise convergence results for the iterates $\{\boldsymbol{x}^k\}_k$ and derive convergence rates that depend on the underlying KL exponent $\boldsymbol{\theta}$ and the step size dynamics $\{\alpha_k\}_k$. Specifically, for the popular step size scheme $\alpha_k=\mathcal{O}(1/k^\gamma)$, $\gamma\in (\frac23,1]$, (almost sure) rates of the form $\|\boldsymbol{x}^k-\boldsymbol{x}^*\| = \mathcal{O}(1/k^p)$, $p \in (0,\frac12)$, can be established. The obtained rates are faster than related and existing convergence rates for $\mathsf{SGD}$ and improve on the non-asymptotic complexity bounds for $\mathsf{norM}\text{-}\mathsf{SGD}$.
Submitted 9 May, 2023;
originally announced May 2023.
-
Mathematical theory for the interface mode in a waveguide bifurcated from a Dirac point
Authors:
Jiayu Qiu,
Junshan Lin,
Peng Xie,
Hai Zhang
Abstract:
In this paper, we prove the existence of a bound state in a waveguide that consists of two semi-infinite periodic structures separated by an interface. The two periodic structures are perturbed from the same periodic medium with a Dirac point and they possess a common band gap enclosing the Dirac point. The bound state, which is called interface mode here, decays exponentially away from the interface with a frequency located in the common band gap and can be viewed as a bifurcation from the Dirac point. Using the layer potential technique and asymptotic analysis, we first characterize the band gap opening for the two perturbed periodic media and derive the asymptotics of the Bloch modes near the band gap edges. By formulating the eigenvalue problem for the waveguide with two semi-infinite structures using a boundary integral equation over the interface and analyzing the characteristic values of the associated boundary integral operator, we prove the existence of the interface mode for the waveguide when the perturbation of the periodic medium is small.
Submitted 21 April, 2023;
originally announced April 2023.
-
A compact simple HWENO scheme with ADER time discretization for hyperbolic conservation laws I: structured meshes
Authors:
Dongmi Luo,
Shiyi Li,
Jianxian Qiu,
Jun Zhu,
Yibing Chen
Abstract:
In this paper, a compact and high order ADER (Arbitrary high order using DERivatives) scheme using the simple HWENO method (ADER-SHWENO) is proposed for hyperbolic conservation laws. The newly developed method employs the Lax-Wendroff procedure to convert time derivatives to spatial derivatives, which provides the time evolution of the variables at the cell interfaces. This information is required for the simple HWENO reconstructions, which take advantage of both the simple WENO and the classic HWENO methods. Compared with the original Runge-Kutta HWENO method (RK-HWENO), the new method has two advantages. First, the RK-HWENO method must solve additional equations for the reconstructions, which the new method avoids. Second, the SHWENO reconstruction is performed once with one stencil, unlike the classic HWENO methods, in which the function and its derivative values are reconstructed with two different stencils. Thus the new method is more efficient than the RK-HWENO method. Moreover, the new method is more compact than the existing ADER-WENO method. In addition, the new method makes the best use of the information in the ADER method, so the time evolution of the cell averages of the derivatives is simpler than that developed in the work [Li et al., 447 (2021), 110661]. Numerical tests indicate that the new method achieves high order accuracy for smooth solutions in both space and time and remains non-oscillatory at discontinuities.
Submitted 19 April, 2023;
originally announced April 2023.
-
Quantisation via Branes and Minimal Resolution
Authors:
Jian Qiu
Abstract:
The `brane quantisation' procedure was developed by Gukov and Witten \cite{Gukov:2008ve}.
We implement this idea by combining it with the tilting theory and the minimal resolutions. This way, we can realistically compute the deformation quantisation on the space of observables acting on the Hilbert space. We apply this procedure to a certain quantisation problem in the context of generalised Kähler structure on $\mathbb{P}^2$. Our approach differs from and complements that of Bischoff and Gualtieri \cite{Bischoff:2021ixy}.
We also benefitted from an important technical tool: a combinatorial criterion for the Maurer-Cartan equation, developed in \cite{BarmeierWang} by Barmeier and Wang.
Submitted 8 May, 2023; v1 submitted 13 April, 2023;
originally announced April 2023.
-
Swarming models with specular boundary condition and environmental noise
Authors:
Razvan C. Fetecau,
Hui Huang,
Jinniao Qiu
Abstract:
We investigate a general class of models for swarming/self-collective behaviour in domains with boundaries. The model is expressed as a stochastic system of interacting particles subject to both reflecting boundary condition and common environmental noise. We rigorously derive its corresponding macroscopic mean-field equation, which is a new type of stochastic partial differential equation due to the presence of common noise. The approach relies on a compactness argument, in which we first establish the tightness of the empirical measures associated with the particle system and then demonstrate that the time marginal of the limit measure is a solution to the mean-field equation.
Submitted 4 April, 2023;
originally announced April 2023.
-
Stability analysis of the Eulerian-Lagrangian finite volume methods for nonlinear hyperbolic equations in one space dimension
Authors:
Yang Yang,
Jiajie Chen,
Jing-Mei Qiu
Abstract:
In this paper, we construct a novel Eulerian-Lagrangian finite volume (ELFV) method for nonlinear scalar hyperbolic equations in one space dimension. It is well known that the exact solutions to such problems may contain shocks even though the initial conditions are smooth, and direct numerical methods may suffer from restricted time step sizes. To relieve the restriction, we propose an ELFV method, where the space-time domain is separated by partition lines originating from the cell interfaces, whose slopes are obtained following the Rankine-Hugoniot jump condition. Unfortunately, to avoid the intersection of the partition lines, the time step sizes are still limited. To close this gap, we detect effective troubled cells (ETCs) and carefully design the influence region of each ETC, within which the partitioned space-time regions are merged together to form a new one. Then, with the new partition of the space-time domain, we theoretically prove that the proposed first-order scheme with Euler forward time discretization is total-variation-diminishing and maximum-principle-preserving with time step constraints at least twice as large as those of the classical first-order Eulerian method for Burgers' equation. Numerical experiments verify the optimality of the designed time step sizes.
Submitted 14 February, 2023;
originally announced February 2023.
-
Consensus-Based Optimization for Saddle Point Problems
Authors:
Hui Huang,
Jinniao Qiu,
Konstantin Riedl
Abstract:
In this paper, we propose consensus-based optimization for saddle point problems (CBO-SP), a novel multi-particle metaheuristic derivative-free optimization method capable of provably finding global Nash equilibria. Following the idea of swarm intelligence, the method employs a group of interacting particles, which perform a minimization over one variable and a maximization over the other. This paradigm permits a passage to the mean-field limit, which makes the method amenable to theoretical analysis and allows us to obtain rigorous convergence guarantees under reasonable assumptions about the initialization and the objective function, which most notably include nonconvex-nonconcave objectives.
Submitted 12 September, 2023; v1 submitted 23 December, 2022;
originally announced December 2022.
-
A Local Macroscopic Conservative (LoMaC) low rank tensor method with the discontinuous Galerkin method for the Vlasov dynamics
Authors:
Wei Guo,
Jannatul Ferdous Ema,
Jing-Mei Qiu
Abstract:
In this paper, we propose a novel Local Macroscopic Conservative (LoMaC) low rank tensor method with discontinuous Galerkin (DG) discretization for the physical and phase spaces for simulating the Vlasov-Poisson (VP) system. The LoMaC property refers to the exact local conservation of macroscopic mass, momentum and energy at the discrete level. The recently developed LoMaC low rank tensor algorithm (arXiv:2207.00518) simultaneously evolves the macroscopic conservation laws of mass, momentum and energy using the kinetic flux vector splitting; then the LoMaC property is realized by projecting the low rank kinetic solution onto a subspace that shares the same macroscopic observables.
This paper is a generalization of our previous work, but with DG discretization to take advantage of its compactness and flexibility in handling boundary conditions and its superior accuracy in the long term. The algorithm is developed in a similar fashion as that for a finite difference scheme, by observing that the DG method can be viewed equivalently in a nodal fashion. With the nodal DG method, assuming a tensorized computational grid, one will be able to (1) derive differentiation matrices for different nodal points based on a DG upwind discretization of transport terms, and (2) define a weighted inner product space based on the nodal DG grid points. The algorithm can be extended to the high dimensional problems by hierarchical Tucker decomposition of solution tensors and a corresponding conservative projection algorithm. In a similar spirit, the algorithm can be extended to DG methods on nodal points of an unstructured mesh, or to other types of discretization, e.g. the spectral method in velocity direction. Extensive numerical results are performed to showcase the efficacy of the method.
Submitted 13 October, 2022;
originally announced October 2022.
-
Fourth-order conservative non-splitting semi-Lagrangian Hermite WENO schemes for kinetic and fluid simulations
Authors:
Nanyi Zheng,
Xiaofeng Cai,
Jing-Mei Qiu,
Jianxian Qiu
Abstract:
We present fourth-order conservative non-splitting semi-Lagrangian (SL) Hermite weighted essentially non-oscillatory (HWENO) schemes for linear transport equations with applications to nonlinear problems including the Vlasov-Poisson system, the guiding center Vlasov model, and the incompressible Euler equations in the vorticity-stream function formulation. The proposed SL HWENO schemes combine a weak formulation of the characteristic Galerkin method with two newly constructed HWENO reconstruction methods. Fourth-order accuracy is accomplished in both space and time under a non-splitting setting. Mass conservation naturally holds due to the weak formulation of the characteristic Galerkin method and the design of the HWENO reconstructions. We apply a positivity-preserving limiter to maintain the positivity of numerical solutions when needed. Although the proposed SL framework allows us to take large time steps for improving computational efficiency, it also brings challenges to the spatial reconstruction technique; we construct two kinds of novel HWENO reconstructions to fit the needs of the proposed SL framework. Abundant benchmark tests are performed to verify the effectiveness of the proposed SL HWENO schemes.
Submitted 7 August, 2022;
originally announced August 2022.
-
A mass conservative Eulerian-Lagrangian Runge-Kutta discontinuous Galerkin method for wave equations with large time stepping
Authors:
Xue Hong,
Jing-Mei Qiu
Abstract:
We propose an Eulerian-Lagrangian (EL) Runge-Kutta (RK) discontinuous Galerkin (DG) method for wave equations. The method is designed based on the ELDG method for transport problems [J. Comput. Phys. 446: 110632, 2021.], which tracks solution along approximations to characteristics in the DG framework, allowing extra large time stepping sizes with stability. The wave equation can be written as a first order hyperbolic system. Considering each characteristic family, a straightforward application of ELDG will be to transform to the characteristic variables, evolve them on associated characteristic related space-time regions, and transform them back to the original variables. However, the mass conservation could not be guaranteed in a general setting. In this paper, we formulate a mass conservative semi-discrete ELDG method by decomposing each variable into two parts, each of them associated with a different characteristic family. As a result, four different quantities are evolved in EL fashion and recombined to update the solution. The fully discrete scheme is formulated by using method-of-lines RK methods, with intermediate RK solutions updated on the background mesh. Numerical results on 1D and 2D wave equations are presented to demonstrate the performance of the proposed ELDG method. These include the high order spatial and temporal accuracy, stability with extra large time stepping size, and mass conservative property.
Submitted 27 July, 2022;
originally announced July 2022.
-
Cooperative Trajectory Control for Synchronizing the Movement of Two Connected and Autonomous Vehicles Separated in a Mixed Traffic Flow
Authors:
Jiahua Qiu,
Lili Du
Abstract:
When connected and autonomous vehicles (CAVs) are widely used in the future, we can foresee many essential applications, such as platoon formation and autonomous police patrolling, which need two CAVs, originally separated in a mixed traffic flow involving CAVs and human-driven vehicles (HDVs), to quickly approach each other and then keep a stable car-following mode. The entire process should not jeopardize surrounding traffic safety and efficiency. The existing literature has not studied this CAV synchronization control well, and this study seeks to partially fill this gap.
Submitted 26 July, 2022;
originally announced July 2022.
-
A Local Macroscopic Conservative (LoMaC) low rank tensor method for the Vlasov dynamics
Authors:
Wei Guo,
Jing-Mei Qiu
Abstract:
In this paper, we propose a novel Local Macroscopic Conservative (LoMaC) low rank tensor method for simulating the Vlasov-Poisson (VP) system. The LoMaC property refers to the exact local conservation of macroscopic mass, momentum and energy at the discrete level. This is a follow-up work of our previous development of a conservative low rank tensor approach for Vlasov dynamics (arXiv:2201.10397). In that work, we applied a low rank tensor method with a conservative singular value decomposition (SVD) to the high dimensional VP system to mitigate the curse of dimensionality, while maintaining the local conservation of mass and momentum. However, energy conservation is not guaranteed, which is a critical property to avoid unphysical plasma self-heating or cooling. The new ingredient in the LoMaC low rank tensor algorithm is that we simultaneously evolve the macroscopic conservation laws of mass, momentum and energy using a flux-difference form with kinetic flux vector splitting; then the LoMaC property is realized by projecting the low rank kinetic solution onto a subspace that shares the same macroscopic observables by a conservative orthogonal projection. The algorithm is extended to high dimensional problems by a hierarchical Tucker decomposition of solution tensors and a corresponding conservative projection algorithm. Extensive numerical tests on the VP system are showcased for the algorithm's efficacy.
Submitted 1 July, 2022;
originally announced July 2022.
-
A well-balanced moving mesh discontinuous Galerkin method for the Ripa model on triangular meshes
Authors:
Weizhang Huang,
Ruo Li,
Jianxian Qiu,
Min Zhang
Abstract:
A well-balanced moving mesh discontinuous Galerkin (DG) method is proposed for the numerical solution of the Ripa model -- a generalization of the shallow water equations that accounts for effects of water temperature variations. Thermodynamic processes are important particularly in the upper layers of the ocean where the variations of sea surface temperature play a fundamental role in climate change. The well-balance property which requires numerical schemes to preserve the lake-at-rest steady state is crucial to the simulation of perturbation waves over that steady state such as waves on a lake or tsunami waves in the deep ocean. To ensure the well-balance, positivity-preserving, and high-order properties, a DG-interpolation scheme (with or without scaling positivity-preserving limiter) and special treatments pertaining to the Ripa model are employed in the transfer of both the flow variables and bottom topography from the old mesh to the new one and in the TVB limiting process. Mesh adaptivity is realized using an MMPDE moving mesh approach and a metric tensor based on an equilibrium variable and water depth. A motivation is to adapt the mesh according to both the perturbations of the lake-at-rest steady state and the water depth distribution (bottom structure). Numerical examples in one and two dimensions are presented to demonstrate the well-balance, high-order accuracy, and positivity-preserving properties of the method and its ability to capture small perturbations of the lake-at-rest steady state.
Submitted 28 May, 2022;
originally announced May 2022.
-
Hypothesis Testing in Sequentially Sampled Data: AdapRT to Maximize Power Beyond iid Sampling
Authors:
Dae Woong Ham,
Jiaze Qiu
Abstract:
Testing whether a variable of interest affects the outcome is one of the most fundamental problems in statistics and is often the main scientific question of interest. To tackle this problem, the conditional randomization test (CRT) is widely used to test the independence of variable(s) of interest (X) with an outcome (Y) holding other variable(s) (Z) fixed. The CRT uses randomization or design-based inference that relies solely on the iid sampling of (X,Z) to produce exact finite-sample p-values that are constructed using any test statistic. We propose a new method, the adaptive randomization test (ART), that tackles the independence problem while allowing the data to be adaptively sampled. We first showcase the ART in a particular multi-arm bandit problem known as the normal-mean model. Under this setting, we theoretically characterize the powers of both the iid sampling procedure and the adaptive sampling procedure and empirically find that the ART can uniformly outperform the CRT that pulls all arms independently with equal probability. We also surprisingly find that the ART can be more powerful than even the CRT that uses an oracle iid sampling procedure when the signal is relatively strong. We believe that the proposed adaptive procedure is successful because it takes arms that may initially look like "fake" signals due to random chance and stabilizes them closer to "null" signals. We additionally apply the ART to a popular factorial survey design setting known as conjoint analysis. We find similar results through simulations and a recent application concerning the role of gender discrimination in political candidate evaluation.
Submitted 27 August, 2022; v1 submitted 5 May, 2022;
originally announced May 2022.
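As a reading aid for the abstract above, the following is a minimal sketch of the baseline conditional randomization test (CRT) it refers to, not of the adaptive ART procedure itself. It assumes the conditional law of X given Z is known and can be sampled; the toy model `X | Z ~ N(0.5 Z, 1)`, the test statistic, and all function names are illustrative assumptions rather than anything taken from the paper.

```python
import random


def crt_p_value(x, y, z, sample_x_given_z, statistic, num_resamples=199, seed=0):
    """CRT p-value: compare the observed statistic with statistics computed on
    copies of X redrawn from the (assumed known) conditional law of X given Z."""
    rng = random.Random(seed)
    observed = statistic(x, y, z)
    exceed = 0
    for _ in range(num_resamples):
        x_resampled = [sample_x_given_z(z_i, rng) for z_i in z]
        if statistic(x_resampled, y, z) >= observed:
            exceed += 1
    # The "+1" terms count the observed data as one of the exchangeable draws.
    return (1 + exceed) / (1 + num_resamples)


def abs_cross_moment(x, y, z):
    """Hypothetical test statistic: absolute empirical cross moment of X and Y."""
    return abs(sum(x_i * y_i for x_i, y_i in zip(x, y))) / len(x)


def draw_x_given_z(z_i, rng):
    """Assumed toy model: X | Z = z is normal with mean 0.5 * z and unit variance."""
    return rng.gauss(0.5 * z_i, 1.0)


def make_toy_data(effect, n=400, seed=1):
    """Y = effect * X + Z + noise, so effect = 0 means Y is independent of X given Z."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [draw_x_given_z(z_i, rng) for z_i in z]
    y = [effect * x_i + z_i + rng.gauss(0.0, 1.0) for x_i, z_i in zip(x, z)]
    return x, y, z


if __name__ == "__main__":
    for effect in (0.0, 3.0):
        x, y, z = make_toy_data(effect)
        print(effect, crt_p_value(x, y, z, draw_x_given_z, abs_cross_moment))
```

Because the resampled copies keep the dependence of X on Z, the test stays valid even though Y also depends on Z in this toy model.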
-
An Eulerian-Lagrangian Runge-Kutta finite volume (EL-RK-FV) method for solving convection and convection-diffusion equations
Authors:
Joseph Nakao,
Jiajie Chen,
Jingmei Qiu
Abstract:
We propose a new Eulerian-Lagrangian Runge-Kutta finite volume method for numerically solving convection and convection-diffusion equations. Eulerian-Lagrangian and semi-Lagrangian methods have grown in popularity mostly due to their ability to allow large time steps. Our proposed scheme is formulated by integrating the PDE on a space-time region partitioned by approximations of the characteristics determined from the Rankine-Hugoniot jump condition; and then rewriting the time-integral form into a time differential form to allow application of Runge-Kutta (RK) methods via the method-of-lines approach. The scheme can be viewed as a generalization of the standard Runge-Kutta finite volume (RK-FV) scheme for which the space-time region is partitioned by approximate characteristics with zero velocity. The high-order spatial reconstruction is achieved using the recently developed weighted essentially non-oscillatory schemes with adaptive order (WENO-AO); and the high-order temporal accuracy is achieved by explicit RK methods for convection equations and implicit-explicit (IMEX) RK methods for convection-diffusion equations. Our algorithm extends to higher dimensions via dimensional splitting. Numerical experiments demonstrate our algorithm's robustness, high-order accuracy, and ability to handle extra large time steps.
Submitted 7 April, 2022;
originally announced April 2022.
-
The TAP free energy for high-dimensional linear regression
Authors:
Jiaze Qiu,
Subhabrata Sen
Abstract:
We derive a variational representation for the log-normalizing constant of the posterior distribution in Bayesian linear regression with a uniform spherical prior and an i.i.d. Gaussian design. We work under the "proportional" asymptotic regime, where the number of observations and the number of features grow at a proportional rate. This rigorously establishes the Thouless-Anderson-Palmer (TAP) approximation arising from spin glass theory, and proves a conjecture of Krzakala et al. (2014) in the special case of the spherical prior.
Submitted 14 March, 2022;
originally announced March 2022.
-
Polynomial orbits in totally minimal systems
Authors:
Jiahao Qiu
Abstract:
Inspired by the recent work of Glasner, Huang, Shao, Weiss and Ye, we prove that the maximal $\infty$-step pro-nilfactor $X_\infty$ of a minimal system $(X,T)$ is the topological characteristic factor along polynomials in a certain sense. Namely, we show that by an almost one to one modification of $π:X\to X_\infty$, the induced open extension $π^*:X^*\to X_\infty^*$ has the following property: for any $d\in \mathbb{N}$, any open subsets $V_0,V_1,\ldots,V_d$ of $X^*$ with $\bigcap_{i=0}^d π^*(V_i)\neq \emptyset$ and any distinct non-constant integer polynomials $p_i$ with $p_i(0)=0$ for $i=1,\ldots,d$, there exists some $n\in \mathbb{Z}$ such that $V_0\cap T^{-p_1(n)}V_1\cap \ldots \cap T^{-p_d(n)}V_d \neq \emptyset$, where an integer polynomial is a polynomial with rational coefficients taking integer values on the integers.
As an application, the following result is obtained: for a totally minimal system $(X,T)$ and integer polynomials $p_1,\ldots,p_d$, if no non-trivial integer combination of $p_1,\ldots,p_d$ is constant, then there is a dense $G_δ$ subset $Ω$ of $X$ such that the set \[ \{(T^{p_1(n)}x,\ldots, T^{p_d(n)}x):n\in \mathbb{Z}\} \] is dense in $X^d$ for every $x\in Ω$.
Submitted 17 February, 2022;
originally announced February 2022.
-
On the Global Convergence of Particle Swarm Optimization Methods
Authors:
Hui Huang,
Jinniao Qiu,
Konstantin Riedl
Abstract:
In this paper we provide a rigorous convergence analysis for the renowned particle swarm optimization method by using tools from stochastic calculus and the analysis of partial differential equations. Based on a time-continuous formulation of the particle dynamics as a system of stochastic differential equations, we establish convergence to a global minimizer of a possibly nonconvex and nonsmooth objective function in two steps. First, we prove consensus formation of an associated mean-field dynamics by analyzing the time-evolution of the variance of the particle distribution. We then show that this consensus is close to a global minimizer by employing the asymptotic Laplace principle and a tractability condition on the energy landscape of the objective function. These results allow for the usage of memory mechanisms, and hold for a rich class of objectives provided certain conditions of well-preparation of the hyperparameters and the initial datum. In a second step, at least for the case without memory effects, we provide a quantitative result about the mean-field approximation of particle swarm optimization, which specifies the convergence of the interacting particle system to the associated mean-field limit. Combining these two results allows for global convergence guarantees of the numerical particle swarm optimization method with provable polynomial complexity. To demonstrate the applicability of the method we propose an efficient and parallelizable implementation, which is tested in particular on a competitive and well-understood high-dimensional benchmark problem in machine learning.
Submitted 8 June, 2022; v1 submitted 28 January, 2022;
originally announced January 2022.
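For orientation, here is a minimal sketch of a standard discrete-time particle swarm optimization loop with personal-best memory. It is a textbook variant, not the time-continuous stochastic differential equation formulation analyzed in the paper, and the inertia and acceleration coefficients, the search box, and the `sphere` test objective are assumptions chosen only for illustration.

```python
import random


def sphere(point):
    """Illustrative objective with global minimizer at the origin."""
    return sum(c * c for c in point)


def particle_swarm_minimize(objective, dim, num_particles=30, num_iters=200,
                            inertia=0.729, c_personal=1.49445, c_global=1.49445,
                            bound=5.0, seed=0):
    """Textbook PSO: each particle is pulled toward its own best position
    (the memory term) and toward the best position found by the whole swarm."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(num_particles)]
    vel = [[0.0] * dim for _ in range(num_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(num_particles), key=best_val.__getitem__)
    global_pos, global_val = best_pos[g][:], best_val[g]

    for _ in range(num_iters):
        for i in range(num_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (global_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < global_val:
                    global_val, global_pos = val, pos[i][:]
    return global_pos, global_val


if __name__ == "__main__":
    print(particle_swarm_minimize(sphere, dim=2))
```

Each particle is independent apart from the shared global best, which is why loops of this kind parallelize easily across particles.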
-
Moment-based multi-resolution HWENO scheme for hyperbolic conservation laws
Authors:
Jiayin Li,
Chi-Wang Shu,
Jianxian Qiu
Abstract:
In this paper, a high-order moment-based multi-resolution Hermite weighted essentially non-oscillatory (HWENO) scheme is designed for hyperbolic conservation laws. The main idea of this scheme is derived from our previous work [J. Comput. Phys., 446 (2021) 110653], in which the integral averages of the function and its first order derivative are used to reconstruct both the function and its first order derivative values at the boundaries. However, in this paper, only the function values at the Gauss-Lobatto points in the one or two dimensional case need to be reconstructed by using the information of the zeroth and first order moments. In addition, an extra modification procedure is used to modify those first order moments in the troubled-cells, which leads to an improvement of stability and an enhancement of resolution near discontinuities. To obtain the same order of accuracy, the size of the stencil required by this moment-based multi-resolution HWENO scheme is still the same as the general HWENO scheme and is more compact than the general WENO scheme. Moreover, the linear weights can also be any positive numbers as long as their sum equals one, and the CFL number can still be 0.6 in both the one and two dimensional cases. Extensive numerical examples are given to demonstrate the stability and resolution of the moment-based multi-resolution HWENO scheme.
Submitted 28 January, 2022;
originally announced January 2022.
-
A conservative low rank tensor method for the Vlasov dynamics
Authors:
Wei Guo,
Jing-Mei Qiu
Abstract:
In this paper, we propose a conservative low rank tensor method to approximate nonlinear Vlasov solutions. The low rank approach is based on our earlier work (arXiv:2106.08834). It takes advantage of the fact that the differential operators in the Vlasov equation are tensor friendly, based on which we propose to dynamically and adaptively build up a low rank solution basis by adding new basis functions from discretization of the differential equation, and removing basis functions through a singular value decomposition (SVD)-type truncation procedure. For the discretization, we adopt a high order finite difference spatial discretization together with a second order strong stability preserving multi-step time discretization.
While the SVD truncation will remove the redundancy in representing the high dimensional Vlasov solution, it will destroy the conservation properties of the associated full conservative scheme. In this paper, we develop a conservative truncation procedure with conservation of mass, momentum and kinetic energy densities. The conservative truncation is achieved by an orthogonal projection onto a subspace spanned by $1$, $v$ and $v^2$ in the velocity space associated with a weighted inner product. Then the algorithm performs a weighted SVD truncation of the remainder, which involves a scaling, followed by the standard SVD truncation and rescaling back. The algorithm is further developed in high dimensions with hierarchical Tucker tensor decomposition of high dimensional Vlasov solutions, overcoming the curse of dimensionality. An extensive set of nonlinear Vlasov examples is performed to show the effectiveness and conservation property of the proposed conservative low rank approach. Comparison is performed against the non-conservative low rank tensor approach on the conservation history of mass, momentum and energy.
Submitted 25 January, 2022;
originally announced January 2022.
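The projection step described in this abstract can be illustrated on a one-dimensional velocity grid. The sketch below projects a sampled profile onto the span of $1$, $v$ and $v^2$ under a discrete weighted inner product and leaves a remainder whose weighted moments vanish. The uniform quadrature weights, the grid, and the example profile are assumptions, since the abstract does not specify the weight, and the weighted SVD truncation of the remainder is not shown.

```python
import math


def weighted_inner(f, g, weights):
    """Discrete weighted inner product <f, g> = sum_j f_j * g_j * w_j."""
    return sum(f_j * g_j * w_j for f_j, g_j, w_j in zip(f, g, weights))


def project_onto_moments(f, v, weights):
    """Orthogonal projection of f onto span{1, v, v^2} in the weighted inner
    product, using Gram-Schmidt to build an orthonormal basis of that span."""
    basis = []
    for power in range(3):
        q = [v_j ** power for v_j in v]
        for b in basis:
            coeff = weighted_inner(q, b, weights)
            q = [q_j - coeff * b_j for q_j, b_j in zip(q, b)]
        norm = math.sqrt(weighted_inner(q, q, weights))
        basis.append([q_j / norm for q_j in q])
    projection = [0.0] * len(f)
    for b in basis:
        coeff = weighted_inner(f, b, weights)
        projection = [p_j + coeff * b_j for p_j, b_j in zip(projection, b)]
    return projection


def moments(f, v, weights):
    """Weighted moments of f against 1, v and v^2 (mass, momentum, energy-like)."""
    return [weighted_inner(f, [v_j ** k for v_j in v], weights) for k in range(3)]


def uniform_grid(lower, upper, num_points):
    """Midpoint grid with equal quadrature weights (an assumed choice of weight)."""
    dv = (upper - lower) / num_points
    return [lower + (j + 0.5) * dv for j in range(num_points)], [dv] * num_points


def example_profile(v):
    """Hypothetical velocity profile: shifted Gaussian plus a small oscillation."""
    return [math.exp(-(v_j - 0.5) ** 2) + 0.1 * math.sin(3.0 * v_j) for v_j in v]


if __name__ == "__main__":
    v, w = uniform_grid(-4.0, 4.0, 64)
    f = example_profile(v)
    p = project_onto_moments(f, v, w)
    remainder = [f_j - p_j for f_j, p_j in zip(f, p)]
    print(moments(f, v, w), moments(p, v, w), moments(remainder, v, w))
```

Since the remainder is orthogonal to $1$, $v$ and $v^2$ in this inner product, the projected part alone carries the three moments of the original profile.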
-
Distributed Random Reshuffling over Networks
Authors:
Kun Huang,
Xiao Li,
Andre Milzarek,
Shi Pu,
Junwen Qiu
Abstract:
In this paper, we consider distributed optimization problems where $n$ agents, each possessing a local cost function, collaboratively minimize the average of the local cost functions over a connected network. To solve the problem, we propose a distributed random reshuffling (D-RR) algorithm that invokes the random reshuffling (RR) update in each agent. We show that D-RR inherits favorable characteristics of RR for both smooth strongly convex and smooth nonconvex objective functions. In particular, for smooth strongly convex objective functions, D-RR achieves $\mathcal{O}(1/T^2)$ rate of convergence (where $T$ counts epoch number) in terms of the squared distance between the iterate and the global minimizer. When the objective function is assumed to be smooth nonconvex, we show that D-RR drives the squared norm of gradient to $0$ at a rate of $\mathcal{O}(1/T^{2/3})$. These convergence results match those of centralized RR (up to constant factors) and outperform the distributed stochastic gradient descent (DSGD) algorithm if we run a relatively large number of epochs. Finally, we conduct a set of numerical experiments to illustrate the efficiency of the proposed D-RR method on both strongly convex and nonconvex distributed optimization problems.
Submitted 23 March, 2023; v1 submitted 30 December, 2021;
originally announced December 2021.
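A rough sketch of how a distributed random reshuffling iteration might look is given below, under several assumptions that are not stated in the abstract: scalar decision variables, quadratic local costs $f_{i,k}(x) = (x - a_{i,k})^2/2$, independent permutations per agent, a local gradient step followed by averaging with a doubly stochastic mixing matrix, and an epoch-wise step size $1/(t+2)$. The exact update order and step-size rule in the paper may differ.

```python
import random


def distributed_random_reshuffling(local_data, mixing, num_epochs=300, seed=0):
    """Each agent sweeps its own samples in a fresh random order every epoch,
    takes a local gradient step, then averages with its network neighbors."""
    rng = random.Random(seed)
    num_agents = len(local_data)
    num_samples = len(local_data[0])
    x = [0.0] * num_agents
    for epoch in range(num_epochs):
        step = 1.0 / (epoch + 2)  # assumed diminishing step size
        perms = [rng.sample(range(num_samples), num_samples)
                 for _ in range(num_agents)]
        for k in range(num_samples):
            # Local step: gradient of (x - a)^2 / 2 is (x - a).
            local = [x[i] - step * (x[i] - local_data[i][perms[i][k]])
                     for i in range(num_agents)]
            # Communication step: weighted average over the network.
            x = [sum(mixing[i][j] * local[j] for j in range(num_agents))
                 for i in range(num_agents)]
    return x


if __name__ == "__main__":
    data = [[0.0, 1.0, 2.0], [1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
    ring = [[0.50, 0.25, 0.00, 0.25],
            [0.25, 0.50, 0.25, 0.00],
            [0.00, 0.25, 0.50, 0.25],
            [0.25, 0.00, 0.25, 0.50]]
    print(distributed_random_reshuffling(data, ring))  # minimizer of the average cost is 2.5
```

In this toy problem the minimizer of the averaged cost is the mean of all samples, so every agent's iterate should end up near 2.5 even though no agent sees the other agents' data.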
-
Nijenhuis tensor and invariant polynomials
Authors:
Francesco Bonechi,
Jian Qiu,
Marco Tarlini,
Emanuele Viviani
Abstract:
We discuss the diagonalization problem of the Nijenhuis tensor in a class of Poisson-Nijenhuis structures defined on compact hermitian symmetric spaces. We study its action on the ring of invariant polynomials of a Thimm chain of subalgebras. The existence of $φ$-minimal representations defines a suitable basis of invariant polynomials that completely solves the diagonalization problem. We prove that such representations exist in the classical cases AIII, BDI, DIII and CI, and do not exist in the exceptional cases EIII and EVII. We discuss a second general construction that in these two cases computes partially the spectrum and hints at a different behavior with respect to the classical cases.
Submitted 18 November, 2021;
originally announced November 2021.
-
SketchNE: Embedding Billion-Scale Networks Accurately in One Hour
Authors:
Yuyang Xie,
Yuxiao Dong,
Jiezhong Qiu,
Wenjian Yu,
Xu Feng,
Jie Tang
Abstract:
We study large-scale network embedding with the goal of generating high-quality embeddings for networks with more than 1 billion vertices and 100 billion edges. Recent attempts, LightNE and NetSMF, propose to sparsify and factorize the (dense) NetMF matrix for embedding large networks, where NetMF is a theoretically-grounded network embedding method. However, there is a trade-off between their embeddings' quality and scalability due to their expensive memory requirements, making embeddings less effective under real-world memory constraints. Therefore, we present the SketchNE model, a scalable, effective, and memory-efficient network embedding solution developed for a single machine with CPU only. The main idea of SketchNE is to avoid the explicit construction and factorization of the NetMF matrix either sparsely or densely when producing the embeddings through the proposed sparse-sign randomized single-pass SVD algorithm. We conduct extensive experiments on nine datasets of various sizes for vertex classification and link prediction, demonstrating the consistent outperformance of SketchNE over state-of-the-art baselines in terms of both effectiveness and efficiency. SketchNE takes only 1.0 hour to embed the Hyperlink2012 network with 3.5 billion vertices and 225 billion edges on a CPU-only single machine with embedding superiority (e.g., a 282% relative HITS@10 gain over LightNE).
Submitted 1 February, 2024; v1 submitted 25 October, 2021;
originally announced October 2021.
-
Convergence of Random Reshuffling Under The Kurdyka-Łojasiewicz Inequality
Authors:
Xiao Li,
Andre Milzarek,
Junwen Qiu
Abstract:
We study the random reshuffling (RR) method for smooth nonconvex optimization problems with a finite-sum structure. Though this method is widely utilized in practice, for example in the training of neural networks, its convergence behavior is only understood in several limited settings. In this paper, under the well-known Kurdyka-Łojasiewicz (KL) inequality, we establish strong limit-point convergence results for RR with appropriate diminishing step sizes, namely, the whole sequence of iterates generated by RR is convergent and converges to a single stationary point in an almost sure sense. In addition, we derive the corresponding rate of convergence, depending on the KL exponent and the suitably selected diminishing step sizes. When the KL exponent lies in $[0,\frac12]$, the convergence is at a rate of $\mathcal{O}(t^{-1})$ with $t$ counting the iteration number. When the KL exponent belongs to $(\frac12,1)$, our derived convergence rate is of the form $\mathcal{O}(t^{-q})$ with $q\in (0,1)$ depending on the KL exponent. The standard KL inequality-based convergence analysis framework only applies to algorithms with a certain descent property. We conduct a novel convergence analysis for the non-descent RR method with diminishing step sizes based on the KL inequality, which generalizes the standard KL framework. We summarize our main steps and core ideas in an informal analysis framework, which is of independent interest. As a direct application of this framework, we also establish similar strong limit-point convergence results for the reshuffled proximal point method.
Submitted 25 January, 2023; v1 submitted 10 October, 2021;
originally announced October 2021.
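To make the random reshuffling iteration concrete, the sketch below runs RR with epoch-wise diminishing step sizes on a small nonconvex finite sum. The component functions $f_a(x) = (x^2 - a)^2/4$, the step-size schedule $0.1/(t+1)^{0.75}$, and the starting point are assumptions made for illustration; they are not the paper's experiments and the code makes no claim about the convergence rates stated above.

```python
import random


def component_grad(x, a):
    """Gradient of the nonconvex component f_a(x) = (x**2 - a)**2 / 4."""
    return x * (x * x - a)


def random_reshuffling(targets, x0, num_epochs=2000, base_step=0.1,
                       decay=0.75, seed=0):
    """Random reshuffling: every epoch visits each component exactly once,
    in a freshly shuffled order, with a step size that shrinks across epochs."""
    rng = random.Random(seed)
    x = x0
    for epoch in range(num_epochs):
        step = base_step / (epoch + 1) ** decay  # assumed diminishing schedule
        order = list(range(len(targets)))
        rng.shuffle(order)
        for i in order:
            x -= step * component_grad(x, targets[i])
    return x


if __name__ == "__main__":
    # The averaged objective has gradient x * (x**2 - 2),
    # so its stationary points are 0 and +/- sqrt(2).
    print(random_reshuffling([1.0, 2.0, 3.0], x0=0.5))
```

Individual steps need not decrease the averaged objective, which is the non-descent behavior the abstract refers to; in this toy run the iterate still settles near the stationary point $\sqrt{2}$ as the step size shrinks.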
-
Asymptotic normality for eigenvalue statistics of a general sample covariance matrix when $p/n \to \infty$ and applications
Authors:
Jiaxin Qiu,
Zeng Li,
Jianfeng Yao
Abstract:
The asymptotic normality for a large family of eigenvalue statistics of a general sample covariance matrix is derived under the ultra-high dimensional setting, that is, when the dimension to sample size ratio $p/n \to \infty$. Based on this CLT result, we first adapt the covariance matrix test problem to the new ultra-high dimensional context. Then as a second application, we develop a new test for the separable covariance structure of a matrix-valued white noise. Simulation experiments are conducted for the investigation of finite-sample properties of the general asymptotic normality of eigenvalue statistics, as well as the second test for separable covariance structure of matrix-valued white noise.
Submitted 14 September, 2021;
originally announced September 2021.