Search | arXiv e-print repository

arXiv:2406.19617 [pdf, ps, other]

Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity

Authors: Qian Yu, Yining Wang, Baihe Huang, Qi Lei, Jason D. Lee

Abstract: Optimization of convex functions under stochastic zeroth-order feedback has been a major and challenging question in online learning. In this work, we consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm is only accessible to noisy evaluations of the objective function it queries. We provide the first tight characterization for the rate of the minimax simple regret by developing matching upper and lower bounds. We propose an algorithm that features a combination of a bootstrapping stage and a mirror-descent stage. Our main technical innovation consists of a sharp characterization for the spherical-sampling gradient estimator under higher-order smoothness conditions, which allows the algorithm to optimally balance the bias-variance tradeoff, and a new iterative method for the bootstrapping stage, which maintains the performance for unbounded Hessian.

Submitted 27 June, 2024; originally announced June 2024.

arXiv:2406.17933 [pdf, other]

Modeling and simulations of high-density two-phase flows using projection-based Cahn-Hilliard Navier-Stokes equations

Authors: Ali Rabeh, Makrand A. Khanwale, Jonghyun Lee, Baskar Ganapathysubramanian

Abstract: Accurately modeling the dynamics of high-density ratio ($\mathcal{O}(10^5)$) two-phase flows is important for many applications in material science and manufacturing. In this work, we consider numerical simulations of molten metal undergoing microgravity oscillations. Accurate simulation of the oscillation dynamics allows us to characterize the interplay between the two fluids' surface tension and density ratio, which is an important consideration for terrestrial manufacturing applications. We present a projection-based computational framework for solving a thermodynamically-consistent Cahn-Hilliard Navier-Stokes equations for two-phase flows under these large density ratios. A modified version of the pressure-decoupled solver based on the Helmholtz-Hodge decomposition presented in Khanwale et al. [$\textit{A projection-based, semi-implicit time-stepping approach for the Cahn-Hilliard Navier-Stokes equations on adaptive octree meshes.}$, Journal of Computational Physics 475 (2023): 111874] is used. We present a comprehensive convergence study to investigate the effect of mesh resolution, time-step, and interfacial thickness on droplet-shape oscillations. We deploy our framework to predict the oscillation behavior of three physical systems exhibiting very large density ratios ($10^4-10^5:1$) that have previously never been performed.

Submitted 1 July, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.15361 [pdf, other]

Minimal grid diagrams of the prime alternating knots with 13 crossings

Authors: Hwa Jeong Lee, Alexander Stoimenow, Gyo Taek Jin

Abstract: A knot is a closed loop in space without self-intersection. Two knots are equivalent if there is a self homeomorphism of space bringing one onto the other. An arc presentation is an embedding of a knot in the union of finitely many half planes with a common boundary line such that each half plane contains a simple arc of the knot. The minimal number of such half planes among all arc presentations of a given knot is called the arc index of the knot. A knot is usually presented as a planar diagram with finitely many crossings of two strands where one of the strands goes over the other. A grid diagram is a planar diagram which is a non-simple rectilinear polygon such that vertical edges always cross over horizontal edges at all crossings. It is easily seen that an arc presentation gives rise to a grid diagram and vice versa. It is known that the arc index of an alternating knot is two plus its minimal crossing number. There are 4878 prime alternating knots with minimal crossing number 13. We obtained minimal arc presentations of them in the form of grid diagrams having 15 vertical segments. This is a continuation of the works on prime alternating knots of 11 crossings and 12 crossings.

Submitted 31 March, 2024; originally announced June 2024.

Comments: 76 pages, 4 figures, 4878 grid diagrams

MSC Class: 57K10

arXiv:2406.13309 [pdf, other]

The Powell Conjecture in genus four

Authors: Sangbum Cho, Yuya Koda, Jung Hoon Lee

Abstract: The Powell Conjecture states that four specific elements suffice to generate the Goeritz group of the Heegaard splitting of the $3$-sphere. We show that this conjecture is true when the genus of the splitting is four.

Submitted 19 June, 2024; originally announced June 2024.

Comments: 17 pages, 12 figures

MSC Class: 57K30

arXiv:2406.12761 [pdf, other]

Obstructing two-torsion in the rational knot concordance group

Authors: Jaewon Lee

Abstract: It is well known that there are many 2-torsion elements in the classical knot concordance group. On the other hand, it is not known if there is any torsion element in the rational knot concordance group $\mathcal{C}_\mathbb{Q}$. Cha defined the algebraic rational concordance group $\mathcal{AC}_\mathbb{Q}$, an analogue of the classical algebraic concordance group, and showed that $\mathcal{AC}_\mathbb{Q}\cong\mathbb{Z}^\infty\oplus\mathbb{Z}_2^\infty\oplus\mathbb{Z}_4^\infty$. The knots that represent 2-torsions in $\mathcal{AC}_\mathbb{Q}$ potentially have order $2$ in $\mathcal{C}_\mathbb{Q}$. In this paper, we provide an obstruction for knots of order $2$ in $\mathcal{AC}_\mathbb{Q}$ from being of finite order in $\mathcal{C}_\mathbb{Q}$. Moreover, we give a family consisting of such knots that generates an infinite rank subgroup of $\mathcal{C}_\mathbb{Q}$. We also note that Cha proved that in higher dimensions, the algebraic rational concordance order is the same as the rational knot concordance order. Our obstruction is based on the localized von Neumann $ρ$-invariant.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 22 pages, 5 figures

MSC Class: 57K10

arXiv:2406.11378 [pdf, ps, other]

Non-freeness of parabolic two-generator groups

Authors: Philip Choi, Kyeonghee Jo, Hyuk Kim, Junho Lee

Abstract: A complex number $λ$ is said to be non-free if the subgroup of $SL(2,\mathbb{C})$ generated by $$X=\begin{pmatrix} 1& 1\\ 0 & 1 \end{pmatrix} \,\, \text{and}\,\,\,Y_λ=\begin{pmatrix} 1& 0\\ λ& 1 \end{pmatrix}$$ is not a free group of rank 2. In this case the number $λ$ is called a relation number, and it has been a long standing problem to determine the relation numbers. In this paper, we characterize the relation numbers by establishing the equivalence between $λ$ being a relation number and $u:=\sqrt{- λ}$ being a root of a `generalized Chebyshev polynomial'. The generalized Chebyshev polynomials of degree $k$ are given by a sequence of $k$ integers $(n_1, n_2,\cdots, n_k)$ using the usual recursive formula, and thereby can be studied systematically using continuants and continued fractions. Such formulation, then, enables us to prove that, the question whether a given number $λ$ is a relation number of $u$-degree $k$ can be answered by checking only finitely many generalized Chebyshev polynomials. Based on these theorems, we design an algorithm deciding any given number is a relation number with minimal degree $k$. With its computer implementation we provide a few sample examples, with a particular emphasis on the well known conjecture that every rational number in the interval $(-4, 4)$ is a relation number.

Submitted 17 June, 2024; originally announced June 2024.

Comments: 43 pages, 2 figures

MSC Class: 20E05; 11B39; 11J70; 30F35; 30F40

arXiv:2406.10920 [pdf, other]

Hamilton-Jacobi Based Policy-Iteration via Deep Operator Learning

Authors: Jae Yong Lee, Yeoneung Kim

Abstract: The framework of deep operator network (DeepONet) has been widely exploited thanks to its capability of solving high dimensional partial differential equations. In this paper, we incorporate DeepONet with a recently developed policy iteration scheme to numerically solve optimal control problems and the corresponding Hamilton--Jacobi--Bellman (HJB) equations. A notable feature of our approach is that once the neural network is trained, the solution to the optimal control problem and HJB equations with different terminal functions can be inferred quickly thanks to the unique feature of operator learning. Furthermore, a quantitative analysis of the accuracy of the algorithm is carried out via comparison principles of viscosity solutions. The effectiveness of the method is verified with various examples, including 10-dimensional linear quadratic regulator problems (LQRs).

Submitted 16 June, 2024; originally announced June 2024.

Comments: 24 pages, 5 figures

MSC Class: 68T20; 68U07; 35F21; 49L12; 49L25

arXiv:2406.09619 [pdf, ps, other]

A Characterization of backward bounded solutions

Authors: Minkyu Kwak, Jihoon Lee, Bataa Lkhagvasuren

Abstract: We prove that the collection $\mathcal M_{-\infty}$ of backward bounded solutions for a semilinear evolution equation is the graph of an upper hemicontinuous set-valued function from the low Fourier modes to the higher Fourier modes, which is invariant and contains the global attractor. We also show that there exists a limit $\mathcal M_{\infty}$ of finite dimensional Lipschitz manifolds $\mathcal M_t$ generated by the time $t$-maps ($t>0$) from the flat manifold $\mathcal M_0$ with the Hausdorff distance and we find $\mathcal M_{\infty} \subset \mathcal M_{-\infty}$. No spectral gap conditions are assumed.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.08466 [pdf, other]

Scaling Laws in Linear Regression: Compute, Parameters, and Data

Authors: Licong Lin, Jingfeng Wu, Sham M. Kakade, Peter L. Bartlett, Jason D. Lee

Abstract: Empirically, large-scale deep learning models often satisfy a neural scaling law: the test error of the trained model improves polynomially as the model size and data size grow. However, conventional wisdom suggests the test error consists of approximation, bias, and variance errors, where the variance error increases with model size. This disagrees with the general form of neural scaling laws, which predict that increasing model size monotonically improves performance. We study the theory of scaling laws in an infinite dimensional linear regression setup. Specifically, we consider a model with $M$ parameters as a linear function of sketched covariates. The model is trained by one-pass stochastic gradient descent (SGD) using $N$ data. Assuming the optimal parameter satisfies a Gaussian prior and the data covariance matrix has a power-law spectrum of degree $a>1$, we show that the reducible part of the test error is $Θ(M^{-(a-1)} + N^{-(a-1)/a})$. The variance error, which increases with $M$, is dominated by the other errors due to the implicit regularization of SGD, thus disappearing from the bound. Our theory is consistent with the empirical neural scaling laws and verified by numerical simulation.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.06855 [pdf, other]

Design and Scheduling of an AI-based Queueing System

Authors: Jiung Lee, Hongseok Namkoong, Yibo Zeng

Abstract: To leverage prediction models to make optimal scheduling decisions in service systems, we must understand how predictive errors impact congestion due to externalities on the delay of other jobs. Motivated by applications where prediction models interact with human servers (e.g., content moderation), we consider a large queueing system comprising of many single server queues where the class of a job is estimated using a prediction model. By characterizing the impact of mispredictions on congestion cost in heavy traffic, we design an index-based policy that incorporates the predicted class information in a near-optimal manner. Our theoretical results guide the design of predictive models by providing a simple model selection procedure with downstream queueing performance as a central concern, and offer novel insights on how to design queueing systems with AI-based triage. We illustrate our framework on a content moderation task based on real online comments, where we construct toxicity classifiers by finetuning large language models.

Submitted 10 June, 2024; originally announced June 2024.

arXiv:2405.19713 [pdf, other]

Summing divergent matrix series

Authors: Rongbiao Wang, JungHo Lee, Lek-Heng Lim

Abstract: We extend several celebrated methods in classical analysis for summing series of complex numbers to series of complex matrices. These include the summation methods of Abel, Borel, Cesáro, Euler, Lambert, Nörlund, and Mittag-Leffler, which are frequently used to sum scalar series that are divergent in the conventional sense. One feature of our matrix extensions is that they are fully noncommutative generalizations of their scalar counterparts -- not only is the scalar series replaced by a matrix series, positive weights are replaced by positive definite matrix weights, order on $\mathbb{R}$ replaced by Loewner order, exponential function replaced by matrix exponential function, etc. We will establish the regularity of our matrix summation methods, i.e., when applied to a matrix series convergent in the conventional sense, we obtain the same value for the sum. Our second goal is to provide numerical algorithms that work in conjunction with these summation methods. We discuss how the block and mixed-block summation algorithms, the Kahan compensated summation algorithm, may be applied to matrix sums with similar roundoff error bounds. These summation methods and algorithms apply not only to power or Taylor series of matrices but to any general matrix series including matrix Fourier and Dirichlet series.
We will demonstrate the utility of these summation methods: establishing a Fejér's theorem and alleviating the Gibbs phenomenon for matrix Fourier series; extending the domains of matrix functions and accurately evaluating them; enhancing the matrix Padé approximation and Schur--Parlett algorithms; and more.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 39 pages, 11 figures

MSC Class: 15A16; 40D05; 40G10; 47A56; 65B10; 65F60

arXiv:2405.16894 [pdf, ps, other]

An Unconstrained Formulation of Some Constrained Partial Differential Equations and its Application to Finite Neuron Methods

Authors: Jiwei Jia, Young Ju Lee, Ruitong Shan

Abstract: In this paper, we present a new framework how a PDE with constraints can be formulated into a sequence of PDEs with no constraints, whose solutions are convergent to the solution of the PDE with constraints. This framework is then used to build a novel finite neuron method to solve the 2nd order elliptic equations with the Dirichlet boundary condition. Our algorithm is the first algorithm, proven to lead to shallow neural network solutions with an optimal H1 norm error. We show that a widely used penalized PDE, which imposes the Dirichlet boundary condition weakly can be interpreted as the first element of the sequence of PDEs within our framework. Furthermore, numerically, we show that it may not lead to the solution with the optimal H1 norm error bound in general. On the other hand, we theoretically demonstrate that the second and later elements of a sequence of PDEs can lead to an adequate solution with the optimal H1 norm error bound. A number of sample tests are performed to confirm the effectiveness of the proposed algorithm and the relevant theory.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2405.16855 [pdf, ps, other]

Maximal operators given by Fourier multipliers with dilation of fractional dimensions

Authors: Jin Bong Lee, Jinsol Seo

Abstract: In this paper, we investigate $L^p$ bounds of maximal Fourier multiplier operators with dilation of fractional dimensions. For the Fourier multipliers, we suggest a criterion related to dimensions of dilation sets which guarantees $L^p$ bounds of the maximal operators for each $p$. Our criterion covers Mikhlin-type multipliers, multipliers with limited decay, and multipliers with slow decay.

Submitted 27 May, 2024; originally announced May 2024.

Comments: 15 pages

MSC Class: 42B25; 42B15; 42B35; 42B37

arXiv:2405.16828 [pdf, other]

Kernel-based optimally weighted conformal prediction intervals

Authors: Jonghyeok Lee, Chen Xu, Yao Xie

Abstract: Conformal prediction has been a popular distribution-free framework for uncertainty quantification. In this paper, we present a novel conformal prediction method for time-series, which we call Kernel-based Optimally Weighted Conformal Prediction Intervals (KOWCPI). Specifically, KOWCPI adapts the classic Reweighted Nadaraya-Watson (RNW) estimator for quantile regression on dependent data and learns optimal data-adaptive weights. Theoretically, we tackle the challenge of establishing a conditional coverage guarantee for non-exchangeable data under strong mixing conditions on the non-conformity scores. We demonstrate the superior performance of KOWCPI on real time-series against state-of-the-art methods, where KOWCPI achieves narrower confidence intervals without losing coverage.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2405.13455 [pdf, ps, other]

Carleson measures for weighted Bergman--Zygmund spaces

Authors: Hong Rae Cho, Hyungwoon Koo, Young Joo Lee, Atte Pennanen, Jouni Rättyä, Fanglei Wu

Abstract: For $0<p<\infty$, $Ψ:[0,\infty)\to(0,\infty)$ and a finite positive Borel measure $μ$ on the unit disc $\mathbb{D}$, the Lebesgue--Zygmund space $L^p_{μ,Ψ}$ consists of all measurable functions $f$ such that $\lVert f \rVert_{L_{μ, Ψ}^{p}}^p =\int_{\mathbb{D}}|f|^pΨ(|f|)\,dμ< \infty$. For an integrable radial function $ω$ on $\mathbb{D}$, the corresponding weighted Bergman-Zygmund space $A_{ω, Ψ}^{p}$ is the set of all analytic functions in $L_{μ, Ψ}^{p}$ with $dμ=ω\,dA$. The purpose of the paper is to characterize bounded (and compact) embeddings $A_{ω,Ψ}^{p}\subset L_{μ, Φ}^{q}$, when $0<p\le q<\infty$, the functions $Ψ$ and $Φ$ are essential monotonic, and $Ψ,Φ,ω$ satisfy certain doubling properties. The tools developed on the way to the main results are applied to characterize bounded and compact integral operators acting from $A^p_{ω,Ψ}$ to $A^q_{ν,Φ}$, provided $ν$ admits the same doubling property as $ω$.

Submitted 22 May, 2024; originally announced May 2024.

arXiv:2405.10520 [pdf, ps, other]

Witten deformation and divergence-free symmetric Killing 2-tensors

Authors: Kwangho Choi, Junho Lee

Abstract: Using a Morse function and a Witten deformation argument, we obtain an upper bound for the dimension of the space of divergence-free symmetric Killing $p$-tensors on a closed Riemannian manifold, and calculate it explicitly for $p=2$.

Submitted 16 May, 2024; originally announced May 2024.

arXiv:2405.05504 [pdf, ps, other]

The standard generators of the tetrahedron algebra and their look-alikes

Authors: Jae-Ho Lee

Abstract: The tetrahedron algebra $\boxtimes$ is an infinite-dimensional Lie algebra defined by generators $\{x_{ij} \mid i, j \in \{0, 1, 2, 3\}, i \neq j\}$ and some relations, including the Dolan-Grady relations. These twelve generators are called standard. We introduce a type of element in $\boxtimes$ that "looks like" a standard generator. For mutually distinct $h, i, j, k \in \{0, 1, 2, 3\}$, consider the standard generator $x_{ij}$ of $\boxtimes$. An element $ξ\in \boxtimes$ is called $x_{ij}$-like whenever both (i) $ξ$ commutes with $x_{ij}$; (ii) $ξ$ and $x_{hk}$ satisfy a Dolan-Grady relation. Pick mutually distinct $i,j,k \in \{0,1,2,3\}$. In our main result, we find an attractive basis for $\boxtimes$ with the property that every basis element is either $x_{ij}$-like or $x_{jk}$-like or $x_{ki}$-like. We discuss this basis from multiple points of view.

Submitted 8 May, 2024; originally announced May 2024.

Comments: 30 pages

MSC Class: 17B65; 17B05

arXiv:2405.03931 [pdf, ps, other]

Incorporating changeable attitudes toward vaccination into an SIR infectious disease model

Authors: Yi Jiang, Kristin M. Kurianski, Jane H. Lee, Yanping Ma, Daniel Cicala, Glenn Ledder

Abstract: We develop a mechanistic model that classifies individuals both in terms of epidemiological status (SIR) and vaccination attitude (willing or unwilling), with the goal of discovering how disease spread is influenced by changing opinions about vaccination. Analysis of the model identifies existence and stability criteria for both disease-free and endemic disease equilibria. The analytical results, supported by numerical simulations, show that attitude changes induced by disease prevalence can destabilize endemic disease equilibria, resulting in limit cycles.

Submitted 6 May, 2024; originally announced May 2024.

Comments: 30 pages, 3 tables, 10 figures

MSC Class: 37N25 (Primary) 92D30 (Secondary)

arXiv:2405.03183 [pdf, other]

Impact of EIP-4844 on Ethereum: Consensus Security, Ethereum Usage, Rollup Transaction Dynamics, and Blob Gas Fee Markets

Authors: Seongwan Park, Bosul Mun, Seungyun Lee, Woojin Jeong, Jaewook Lee, Hyeonsang Eom, Huisu Jang

Abstract: On March 13, 2024, Ethereum implemented EIP-4844, designed to enhance its role as a data availability layer. While this upgrade reduces data posting costs for rollups, it also raises concerns about its impact on the consensus layer due to increased propagation sizes. Moreover, the broader effects on the overall Ethereum ecosystem remain largely unexplored. In this paper, we conduct an empirical analysis of the impact of EIP-4844 on consensus security, Ethereum usage, rollup transaction dynamics, and the blob gas fee mechanism. We explore changes in synchronization times, provide quantitative assessments of rollup and user behaviors, and deepen the understanding of the blob gas fee mechanism, highlighting both enhancements and areas of concern post-upgrade.

Submitted 6 May, 2024; originally announced May 2024.

arXiv:2405.00889 [pdf, ps, other]

Uniqueness of $p$-local truncated Brown-Peterson spectra

Authors: David Jongwon Lee

Abstract: When $p$ is an odd prime, we prove that the $\mathbb F_p$-cohomology of $\mathrm{BP}\langle n\rangle$ as a module over the Steenrod algebra determines the $p$-local spectrum $\mathrm{BP}\langle n\rangle$. In particular, we prove that the $p$-local spectrum $\mathrm{BP}\langle n\rangle$ only depends on its $p$-completion $\mathrm{BP}\langle n\rangle_p^\wedge$. As a corollary, this proves that the $p$-local homotopy type of $\mathrm{BP}\langle n\rangle$ does not depend on the ideal by which we take the quotient of $\mathrm{BP}$. In the course of the argument, we show that there is a vanishing line for odd degree classes in the Adams spectral sequence for endomorphisms of $\mathrm{BP}\langle n\rangle$. We also prove that there are enough endomorphisms of $\mathrm{BP}\langle n\rangle$ in a suitable sense. When $p=2$, we obtain the results for $n\leq 3$.

Submitted 1 May, 2024; originally announced May 2024.

Comments: 27 pages, comments welcome

arXiv:2404.17868 [pdf, other]

Error analysis for finite element operator learning methods for solving parametric second-order elliptic PDEs

Authors: Youngjoon Hong, Seungchan Ko, Jaeyong Lee

Abstract: In this paper, we provide a theoretical analysis of a type of operator learning method without data reliance based on the classical finite element approximation, which is called the finite element operator network (FEONet). We first establish the convergence of this method for general second-order linear elliptic PDEs with respect to the parameters for neural network approximation. In this regard, we address the role of the condition number of the finite element matrix in the convergence of the method. Secondly, we derive an explicit error estimate for the self-adjoint case. For this, we investigate some regularity properties of the solution in certain function classes for a neural network approximation, verifying the sufficient condition for the solution to have the desired regularity. Finally, we will also conduct some numerical experiments that support the theoretical findings, confirming the role of the condition number of the finite element matrix in the overall convergence.

Submitted 27 April, 2024; originally announced April 2024.

arXiv:2404.17562 [pdf, other]

Boosting e-BH via conditional calibration

Authors: Junu Lee, Zhimei Ren

Abstract: The e-BH procedure is an e-value-based multiple testing procedure that provably controls the false discovery rate (FDR) under any dependence structure between the e-values. Despite this appealing theoretical FDR control guarantee, the e-BH procedure often suffers from low power in practice. In this paper, we propose a general framework that boosts the power of e-BH without sacrificing its FDR control under arbitrary dependence. This is achieved by the technique of conditional calibration, where we take as input the e-values and calibrate them to be a set of "boosted e-values" that are guaranteed to be no less -- and are often more -- powerful than the original ones. Our general framework is explicitly instantiated in three classes of multiple testing problems: (1) testing under parametric models, (2) conditional independence testing under the model-X setting, and (3) model-free conformalized selection. Extensive numerical experiments show that our proposed method significantly improves the power of e-BH while continuing to control the FDR. We also demonstrate the effectiveness of our method through an application to an observational study dataset for identifying individuals whose counterfactuals satisfy certain properties.

Submitted 26 April, 2024; originally announced April 2024.
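
For orientation, the baseline e-BH procedure named in the abstract above can be sketched in a few lines. This is a minimal sketch of the standard rule as commonly stated (reject the k* hypotheses with the largest e-values, where k* is the largest k such that the k-th largest e-value is at least m/(alpha k)); it is not the boosted, conditionally calibrated method the paper proposes, and the function name `ebh` and the sample inputs are illustrative assumptions.

```python
def ebh(e_values, alpha):
    """Baseline e-BH: return the sorted indices of rejected hypotheses.

    Assumption: this is the plain e-BH rule, not the paper's boosted variant.
    """
    m = len(e_values)
    # Indices ordered from the largest e-value to the smallest.
    order = sorted(range(m), key=lambda i: e_values[i], reverse=True)
    k_star = 0
    for k, idx in enumerate(order, start=1):
        # Condition e_(k) >= m / (alpha * k), written without dividing by k.
        if e_values[idx] * k >= m / alpha:
            k_star = k
    # Reject the k_star hypotheses with the largest e-values.
    return sorted(order[:k_star])


if __name__ == "__main__":
    # Hypothetical e-values for four hypotheses, tested at level alpha = 0.1.
    print(ebh([50, 1, 30, 0.5], alpha=0.1))
```

With the hypothetical inputs above, the thresholds m/(alpha k) are 40, 20, 13.3 and 10, so only the two largest e-values (50 and 30) pass, which illustrates the low-power behaviour the abstract sets out to improve.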

arXiv:2404.17467 [pdf, ps, other]

Around the positive graph conjecture

Authors: David Conlon, Joonkyung Lee, Leo Versteegen

Abstract: A graph $H$ is said to be positive if the homomorphism density $t_H(G)$ is non-negative for all weighted graphs $G$. The positive graph conjecture proposes a characterisation of such graphs, saying that a graph is positive if and only if it is symmetric, in the sense that it is formed by gluing two copies of some subgraph along an independent set. We prove several results relating to this conjecture. First, we make progress towards the conjecture itself by showing that any connected positive graph must have a vertex of even degree. We then make use of this result to identify some new counterexamples to the analogue of Sidorenko's conjecture for hypergraphs. In particular, we show that, for $r$ odd, every $r$-uniform tight cycle is a counterexample, generalising a recent result of Conlon, Lee and Sidorenko that dealt with the case $r=3$. Finally, we relate the positive graph conjecture to the emerging study of graph codes by showing that any positive graph has vanishing graph code density, thereby improving a result of Alon who proved the same result for symmetric graphs. Our proofs make use of a variety of tools and techniques, including the properties of independence polynomials, hypergraph quasirandomness and discrete Fourier analysis.

Submitted 26 April, 2024; originally announced April 2024.

Comments: 14 pages

arXiv:2404.07599 [pdf, other]

Linear structures of norm-attaining Lipschitz functions and their complements

Authors: Geunsu Choi, Mingu Jung, Han Ju Lee, Oscar Roldan

Abstract: We solve two main questions on linear structures of (non-)norm-attaining Lipschitz functions. First, we show that for every infinite metric space $M$, the set consisting of Lipschitz functions on $M$ which do not strongly attain their norm and the zero contains an isometric copy of $\ell_\infty$, and moreover, those functions can be chosen not to attain their norm as functionals on the Lipschitz-free space over $M$. Second, we prove that for every infinite metric space $M$, neither the set of strongly norm-attaining Lipschitz functions on $M$ nor the union of its complement with zero is ever a linear space. Furthermore, we observe that the set consisting of Lipschitz functions which cannot be approximated by strongly norm-attaining ones and the zero element contains $\ell_\infty$ isometrically in all the known cases. Some natural observations and spaceability results are also investigated for Lipschitz functions that attain their norm in one way but do not in another, for several norm-attainment notions considered in the literature.

Submitted 11 April, 2024; originally announced April 2024.

Comments: 33 pages, 3 figures

arXiv:2404.07010 [pdf, ps, other]

Gaining or losing perspective for convex multivariate functions on box domains

Authors: Luze Xu, Jon Lee

Abstract: MINLO (mixed-integer nonlinear optimization) formulations of the disjunction between the origin and a polytope via a binary indicator variable are broadly used in nonlinear combinatorial optimization for modeling a fixed cost associated with carrying out a group of activities and a convex cost function associated with the levels of the activities. The perspective relaxation of such models is often used to solve to global optimality in a branch-and-bound context, but it typically requires suitable conic solvers and is not compatible with general-purpose NLP software in the presence of other classes of constraints. This motivates the investigation of when simpler but weaker relaxations may be adequate. Comparing the volume (i.e., Lebesgue measure) of the relaxations as a measure of tightness, we lift some of the results related to the simplex case to the box case. In order to compare the volumes of different relaxations in the box case, it is necessary to find an appropriate concave upper bound that preserves the convexity and is minimal, which is more difficult than in the simplex case. To address the challenge beyond the simplex case, the triangulation approach is used.

Submitted 10 April, 2024; originally announced April 2024.

Comments: To appear in Mathematical Programming, Series B

arXiv:2404.03105 [pdf, other]

Methodology for Interpretable Reinforcement Learning for Optimizing Mechanical Ventilation

Authors: Joo Seung Lee, Malini Mahendra, Anil Aswani

Abstract: Mechanical ventilation is a critical life-support intervention that uses a machine to deliver controlled air and oxygen to a patient's lungs, assisting or replacing spontaneous breathing. While several data-driven approaches have been proposed to optimize ventilator control strategies, they often lack interpretability and agreement with general domain knowledge. This paper proposes a methodology for interpretable reinforcement learning (RL) using decision trees for mechanical ventilation control. Using a causal, nonparametric model-based off-policy evaluation, we evaluate the policies in their ability to gain increases in SpO2 while avoiding aggressive ventilator settings which are known to cause ventilator induced lung injuries and other complications. Numerical experiments using MIMIC-III data on real patients' intensive care unit stays demonstrate that the decision tree policy outperforms the behavior cloning policy and is comparable to a state-of-the-art RL policy. Future work concerns better aligning the cost function with medical objectives to generate deeper clinical insights.

Submitted 3 April, 2024; originally announced April 2024.

arXiv:2404.01390 [pdf, other]

Convex relaxation for the generalized maximum-entropy sampling problem

Authors: Gabriel Ponte, Marcia Fampa, Jon Lee

Abstract: The generalized maximum-entropy sampling problem (GMESP) is to select an order-$s$ principal submatrix from an order-$n$ covariance matrix, to maximize the product of its $t$ greatest eigenvalues, $0<t\leq s <n$. It is a problem that specializes to two fundamental problems in statistical design theory: (i) maximum-entropy sampling problem (MESP); (ii) binary D-optimality (D-Opt). In the general case, it is motivated by a selection problem in the context of PCA (principal component analysis). We introduce the first convex-optimization based relaxation for GMESP, study its behavior, compare it to an earlier spectral bound, and demonstrate its use in a branch-and-bound scheme. We find that such an approach is practical when $s-t$ is very small.

Submitted 1 April, 2024; originally announced April 2024.

arXiv:2404.01134 [pdf, ps, other]

Towards a classification of $1$-homogeneous distance-regular graphs with positive intersection number $a_1$

Authors: Jack H. Koolen, Mamoon Abdullah, Brhane Gebremichel, Jae-Ho Lee

Abstract: Let $Γ$ be a graph with diameter at least two. Then $Γ$ is said to be $1$-homogeneous (in the sense of Nomura) whenever for every pair of adjacent vertices $x$ and $y$ in $Γ$, the distance partition of the vertex set of $Γ$ with respect to both $x$ and $y$ is equitable, and the parameters corresponding to equitable partitions are independent of the choice of $x$ and $y$. Assume $Γ$ is $1$-homogeneous distance-regular with intersection number $a_1>0$ and $D\geqslant 5$. Define $b=b_1/(θ_1+1)$, where $b_1$ is the intersection number and $θ_1$ is the second largest eigenvalue of $Γ$. We show that if intersection number $c_2\geqslant 2$, then $b\geqslant 1$ and one of the following (i)--(vi) holds: (i) $Γ$ is a regular near $2D$-gon, (ii) $Γ$ is a Johnson graph $J(2D,D)$, (iii) $Γ$ is a halved $\ell$-cube where $\ell \in \{2D,2D+1\}$, (iv) $Γ$ is a folded Johnson graph $\bar{J}(4D,2D)$, (v) $Γ$ is a folded halved $(4D)$-cube, (vi) the valency of $Γ$ is bounded by a function of $b$. Using this result, we characterize $1$-homogeneous graphs with classical parameters and $a_1>0$, as well as tight distance-regular graphs.

Submitted 1 April, 2024; originally announced April 2024.

Comments: 19 pages

MSC Class: 05E30; 05C50

arXiv:2403.18707 [pdf, other]

Connections between Reachability and Time Optimality

Authors: Juho Bae, Ji Hoon Bai, Byung-Yoon Lee, Jun-Yong Lee, Chang-Hun Lee

Abstract: This paper presents the concept of an equivalence relation between the set of optimal control problems. By leveraging this concept, we show that the boundary of the reachability set can be constructed by the solutions of time optimal problems. Alongside, a more general equivalence theorem is presented. The findings facilitate the use of solution structures from a certain class of optimal control problems to address problems in corresponding equivalent classes. As a byproduct, we state and prove the construction methods of the reachability sets of three-dimensional curves with prescribed curvature bound. The findings are twofold: Firstly, we prove that any boundary point of the reachability set, with the terminal direction taken into account, can be accessed via curves of H, CSC, CCC, or their respective subsegments, where H denotes a helicoidal arc, C a circular arc with maximum curvature, and S a straight segment. Secondly, we show that any boundary point of the reachability set, without considering the terminal direction, can be accessed by curves of CC, CS, or their respective subsegments. These findings extend the developments presented in literature regarding planar curves, or Dubins car dynamics, into spatial curves in $\mathbb{R}^3$. For higher dimensions, we confirm that the problem of identifying the reachability set of curvature bounded paths subsumes the well-known Markov-Dubins problem.
These advancements in understanding the reachability of curvature bounded paths in $\mathbb{R}^3$ hold significant practical implications, particularly in the contexts of mission planning problems and time optimal guidance.

Submitted 27 March, 2024; originally announced March 2024.

Comments: Submitted to Automatica

arXiv:2403.15973 [pdf, ps, other]

Isoperimetric profile function comparisons with Integral Ricci curvature bounds

Authors: Jihye Lee, Fabio Ricci

Abstract: We prove comparison results for the Isoperimetric profile function in the setting of manifolds with integral bounds on the Ricci curvature. We extend previous work of Ni and Wang and Bayle and Rosales under the usual pointwise bounds for the Ricci curvature.

Submitted 23 March, 2024; originally announced March 2024.

arXiv:2403.13189 [pdf, ps, other]

The Johnson-Mercier elasticity element in any dimensions

Authors: Jay Gopalakrishnan, Johnny Guzman, Jeonghun J. Lee

Abstract: Mixed methods for linear elasticity with strongly symmetric stresses of lowest order are studied in this paper. On each simplex, the stress space has piecewise linear components with respect to its Alfeld split (which connects the vertices to barycenter), generalizing the Johnson-Mercier two-dimensional element to higher dimensions. Further reductions in the stress space in the three-dimensional case (to 24 degrees of freedom per tetrahedron) are possible when the displacement space is reduced to local rigid displacements. Proofs of optimal error estimates of numerical solutions and improved error estimates via postprocessing and the duality argument are presented.

Submitted 19 March, 2024; originally announced March 2024.

Comments: 31 pages

MSC Class: 65N12; 65N15; 65N30

arXiv:2403.08930 [pdf, other]

How Much Can Reconfigurable Intelligent Surfaces Augment Sky Visibility: A Stochastic Geometry Approach

Authors: Junse Lee, Francois Baccelli

Abstract: This paper uses the theory of point processes and stochastic geometry to quantify the sky visibility experienced by users located in an urban environment. The general idea is to represent the buildings of this environment as a stationary marked point process, where the points represent the building locations and the marks their heights. The point process framework is first used to characterize the distribution of the blockage angle, which limits the visibility of a typical user into the sky due to the obstruction by buildings. In the context of communications, this distribution is useful when users try to connect to the nodes of an aerial or non-terrestrial network in a Line-of-Sight way. Within this context, the point process framework can also be used to investigate the gain of connectivity obtained thanks to Reconfigurable Intelligent Surfaces. Assuming that such surfaces are installed on the top of buildings to extend the user's sky visibility, this point process approach allows one to quantify the gain in visibility and hence the gain in connectivity obtained by the typical user. The distributional properties of visibility-related metrics are cross-validated by comparison to simulation results.

Submitted 13 March, 2024; originally announced March 2024.

Comments: 13 pages, 13 figures, 2 tables, submitted to IEEE

arXiv:2403.05107 [pdf, ps, other]

A converse of dynamical Mordell--Lang conjecture in positive characteristic

Authors: Jungin Lee, Gyeonghyeon Nam

Abstract: In this paper, we prove the converse of the dynamical Mordell--Lang conjecture in positive characteristic: For every subset $S \subseteq \mathbb{N}_0$ which is a union of finitely many arithmetic progressions along with finitely many $p$-sets of the form $\left \{ \sum_{j=1}^{m} c_j p^{k_jn_j} : n_j \in \mathbb{N}_0 \right \}$ ($c_j \in \mathbb{Q}$, $k_j \in \mathbb{N}_0$), there exist a split torus $X = \mathbb{G}_m^k$ defined over $K=\overline{\mathbb{F}_p}(t)$, an endomorphism $Φ$ of $X$, $α\in X(K)$ and a closed subvariety $V \subseteq X$ such that $\left \{ n \in \mathbb{N}_0 : Φ^n(α) \in V(K) \right \} = S$.

Submitted 8 March, 2024; originally announced March 2024.

Comments: 6 pages

arXiv:2403.03183 [pdf, other]

How Well Can Transformers Emulate In-context Newton's Method?

Authors: Angeliki Giannou, Liu Yang, Tianhao Wang, Dimitris Papailiopoulos, Jason D. Lee

Abstract: Transformer-based models have demonstrated remarkable in-context learning capabilities, prompting extensive research into their underlying mechanisms. Recent studies have suggested that Transformers can implement first-order optimization algorithms for in-context learning and even second order ones for the case of linear regression. In this work, we study whether Transformers can perform higher order optimization methods, beyond the case of linear regression. We establish that linear attention Transformers with ReLU layers can approximate second order optimization algorithms for the task of logistic regression and achieve $ε$ error with a number of additional layers that is only logarithmic in the error. As a by-product we demonstrate the ability of even linear attention-only Transformers in implementing a single step of Newton's iteration for matrix inversion with merely two layers. These results suggest the ability of the Transformer architecture to implement complex algorithms, beyond gradient descent.

Submitted 5 March, 2024; originally announced March 2024.
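
As background for the abstract above, the classical Newton iteration for matrix inversion (often called Newton-Schulz) updates X <- X (2I - A X). The sketch below is plain Python on nested lists and says nothing about how the paper realizes a step inside a Transformer; the helper names, the starting guess X0 = A^T / (||A||_1 ||A||_inf), and the step count are assumptions chosen for illustration.

```python
def matmul(A, B):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def newton_inverse(A, steps=30):
    """Approximate the inverse of a square nonsingular matrix A.

    Assumption: the scaled-transpose start X0 = A^T / (||A||_1 * ||A||_inf)
    is a standard choice that keeps the iteration convergent.
    """
    n = len(A)
    norm_1 = max(sum(abs(A[i][j]) for i in range(n)) for j in range(n))
    norm_inf = max(sum(abs(v) for v in row) for row in A)
    X = [[A[j][i] / (norm_1 * norm_inf) for j in range(n)] for i in range(n)]
    for _ in range(steps):
        AX = matmul(A, X)
        # R = 2I - A X
        R = [[(2.0 if i == j else 0.0) - AX[i][j] for j in range(n)]
             for i in range(n)]
        X = matmul(X, R)  # one Newton step: X <- X (2I - A X)
    return X


if __name__ == "__main__":
    # Hypothetical 2x2 example; the exact inverse is [[0.3, -0.1], [-0.2, 0.4]].
    for row in newton_inverse([[4.0, 1.0], [2.0, 3.0]]):
        print(row)
```

Each step squares the residual I - A X, so the error decays quadratically once the residual norm is below one.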

arXiv:2402.16613 [pdf, other]

Structure-Preserving Operator Learning: Modeling the Collision Operator of Kinetic Equations

Authors: Jae Yong Lee, Steffen Schotthöfer, Tianbai Xiao, Sebastian Krumscheid, Martin Frank

Abstract: This work explores the application of deep operator learning principles to a problem in statistical physics. Specifically, we consider the linear kinetic equation, consisting of a differential advection operator and an integral collision operator, which is a powerful yet expensive mathematical model for interacting particle systems with ample applications, e.g., in radiation transport. We investigate the capabilities of the Deep Operator network (DeepONet) approach to modelling the high dimensional collision operator of the linear kinetic equation. This integral operator has crucial analytical structures that a surrogate model, e.g., a DeepONet, needs to preserve to enable meaningful physical simulation. We propose several DeepONet modifications to encapsulate essential structural properties of this integral operator in a DeepONet model. To be precise, we adapt the architecture of the trunk-net so the DeepONet has the same collision invariants as the theoretical kinetic collision operator, thus preserving conserved quantities, e.g., mass, of the modeled many-particle system. Further, we propose an entropy-inspired data-sampling method tailored to train the modified DeepONet surrogates without requiring excessively expensive simulation-based data generation.

Submitted 26 February, 2024; originally announced February 2024.

Comments: 12 pages, 8 figures

arXiv:2402.15188 [pdf, other]

Parameter-Free Algorithms for Performative Regret Minimization under Decision-Dependent Distributions

Authors: Sungwoo Park, Junyeop Kwon, Byeongnoh Kim, Suhyun Chae, Jeeyong Lee, Dabeen Lee

Abstract: This paper studies performative risk minimization, a formulation of stochastic optimization under decision-dependent distributions. We consider the general case where the performative risk can be non-convex, for which we develop efficient parameter-free optimistic optimization-based methods. Our algorithms significantly improve upon the existing Lipschitz bandit-based method in many aspects. In particular, our framework does not require knowledge about the sensitivity parameter of the distribution map and the Lipschitz constant of the loss function. This makes our framework practically favorable, together with the efficient optimistic optimization-based tree-search mechanism. We provide experimental results that demonstrate the numerical superiority of our algorithms over the existing method and other black-box optimistic optimization methods.

Submitted 23 February, 2024; originally announced February 2024.

arXiv:2402.14295 [pdf, ps, other]

Fluctuations of the free energy of the spherical Sherrington-Kirkpatrick model with heavy-tailed interaction

Authors: Taegyun Kim, Ji Oon Lee

Abstract: We consider the 2-spin spherical Sherrington--Kirkpatrick model where the interactions between the spins are given as random variables with heavy-tailed distribution. We prove that the free energy exhibits a sharp phase transition, depending on the location of the largest eigenvalue of the interaction matrix. We also prove the order of the limiting free energy and the limiting distribution of the fluctuation of the free energy for both regimes.

Submitted 22 February, 2024; originally announced February 2024.

Comments: 32 pages

arXiv:2402.11867 [pdf, other]

LoRA Training in the NTK Regime has No Spurious Local Minima

Authors: Uijeong Jang, Jason D. Lee, Ernest K. Ryu

Abstract: Low-rank adaptation (LoRA) has become the standard approach for parameter-efficient fine-tuning of large language models (LLM), but our theoretical understanding of LoRA has been limited. In this work, we theoretically analyze LoRA fine-tuning in the neural tangent kernel (NTK) regime with $N$ data points, showing: (i) full fine-tuning (without LoRA) admits a low-rank solution of rank $r\lesssim \sqrt{N}$; (ii) using LoRA with rank $r\gtrsim \sqrt{N}$ eliminates spurious local minima, allowing gradient descent to find the low-rank solutions; (iii) the low-rank solution found using LoRA generalizes well.

Submitted 28 May, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

Comments: 23 pages

arXiv:2402.11619 [pdf, ps, other]

Relativized Galois groups of first order theories over a hyperimaginary

Authors: Hyoyoon Lee, Junguk Lee

Abstract: We study relativized Lascar groups, which are formed by relativizing Lascar groups to the solution set of a partial type $Σ$. We introduce the notion of a Lascar tuple for $Σ$ and by considering the space of types over a Lascar tuple for $Σ$, the topology for a relativized Lascar group is (re-)defined and some fundamental facts about the Galois groups of first-order theories are generalized to the relativized context. In particular, we prove that any closed subgroup of a relativized Lascar group corresponds to a stabilizer of a bounded hyperimaginary having at least one representative in the solution set of the given partial type $Σ$. Using this, we find the correspondence between subgroups of the relativized Lascar group and the relativized strong types. We also compare the relativized notion with the restricted one, and provide a condition under which the two notions coincide.

Submitted 18 February, 2024; originally announced February 2024.

Comments: 14 pages

MSC Class: 03C60 (Primary) 54H11 (Secondary)

arXiv:2402.10475 [pdf, other]

Fundamental Benefit of Alternating Updates in Minimax Optimization

Authors: Jaewook Lee, Hanseul Cho, Chulhee Yun

Abstract: The Gradient Descent-Ascent (GDA) algorithm, designed to solve minimax optimization problems, takes the descent and ascent steps either simultaneously (Sim-GDA) or alternately (Alt-GDA). While Alt-GDA is commonly observed to converge faster, the performance gap between the two is not yet well understood theoretically, especially in terms of global convergence rates. To address this theory-practice gap, we present fine-grained convergence analyses of both algorithms for strongly-convex-strongly-concave and Lipschitz-gradient objectives. Our new iteration complexity upper bound of Alt-GDA is strictly smaller than the lower bound of Sim-GDA; i.e., Alt-GDA is provably faster. Moreover, we propose Alternating-Extrapolation GDA (Alex-GDA), a general algorithmic framework that subsumes Sim-GDA and Alt-GDA, for which the main idea is to alternately take gradients from extrapolations of the iterates. We show that Alex-GDA satisfies a smaller iteration complexity bound, identical to that of the Extra-gradient method, while requiring fewer gradient computations. We also prove that Alex-GDA enjoys linear convergence for bilinear problems, for which both Sim-GDA and Alt-GDA fail to converge at all.

Submitted 16 February, 2024; originally announced February 2024.

Comments: 77 pages, 2 figures
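
The difference between the simultaneous and alternating updates described in the abstract above can be seen on the simplest bilinear problem, min over x and max over y of f(x, y) = x*y. The sketch below is an illustration under assumed values (step size 0.1, start at (1, 1), 1000 steps) and does not implement the paper's Alex-GDA, whose exact update is not given in the abstract.

```python
import math


def sim_gda(x, y, eta, steps):
    """Simultaneous GDA on f(x, y) = x * y: both players use the old iterate."""
    for _ in range(steps):
        gx, gy = y, x                   # df/dx = y, df/dy = x
        x, y = x - eta * gx, y + eta * gy
    return x, y


def alt_gda(x, y, eta, steps):
    """Alternating GDA on f(x, y) = x * y: y reacts to the updated x."""
    for _ in range(steps):
        x = x - eta * y                 # descent step on x
        y = y + eta * x                 # ascent step uses the new x
    return x, y


if __name__ == "__main__":
    # Assumed illustrative settings; the unique saddle point is (0, 0).
    for name, method in (("Sim-GDA", sim_gda), ("Alt-GDA", alt_gda)):
        x, y = method(1.0, 1.0, eta=0.1, steps=1000)
        print(name, "distance to saddle:", math.hypot(x, y))
```

On this problem each Sim-GDA step multiplies the distance to the saddle by sqrt(1 + eta^2), so the iterates spiral outward, while Alt-GDA conserves x^2 - eta*x*y + y^2 and therefore stays on a bounded orbit without approaching the saddle. Both behaviours are consistent with the abstract's remark that neither method converges on bilinear problems.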

arXiv:2402.08187 [pdf, other]

Learning time-dependent PDE via graph neural networks and deep operator network for robust accuracy on irregular grids

Authors: Sung Woong Cho, Jae Yong Lee, Hyung Ju Hwang

Abstract: Scientific computing using deep learning has seen significant advancements in recent years. There has been growing interest in models that learn the operator from the parameters of a partial differential equation (PDE) to the corresponding solutions. Deep Operator Network (DeepONet) and Fourier Neural operator, among other models, have been designed with structures suitable for handling functions as inputs and outputs, enabling real-time predictions as surrogate models for solution operators. There has also been significant progress in the research on surrogate models based on graph neural networks (GNNs), specifically targeting the dynamics in time-dependent PDEs. In this paper, we propose GraphDeepONet, an autoregressive model based on GNNs, to effectively adapt DeepONet, which is well-known for successful operator learning. GraphDeepONet exhibits robust accuracy in predicting solutions compared to existing GNN-based PDE solver models. It maintains consistent performance even on irregular grids, leveraging the advantages inherited from DeepONet and enabling predictions on arbitrary grids. Additionally, unlike traditional DeepONet and its variants, GraphDeepONet enables time extrapolation for time-dependent PDE solutions. We also provide theoretical analysis of the universal approximation capability of GraphDeepONet in approximating continuous operators across arbitrary time intervals.

Submitted 12 February, 2024; originally announced February 2024.

Comments: 25 pages, 11 figures

MSC Class: 65D17; 68U07

arXiv:2402.07695 [pdf, other]

Metastability and time scales for parabolic equations with drift 2: the general time scale

Authors: Claudio Landim, Jungkyoung Lee, Insuk Seo

Abstract: Consider the elliptic operator given by \[ \mathscr{L}_εf=b\cdot\nabla f+εΔf \] for some smooth vector field $b:\mathbb{R}^d\to\mathbb{R}^d$ and $ε>0$, and the initial-value problem on $\mathbb{R}^d$ \[ \left\{\begin{aligned}&\partial_t u_ε=\mathscr{L}_εu_ε,\\ &u_ε(0,\,\cdot)=u_0(\cdot), \end{aligned} \right. \] for some bounded continuous function $u_0$. Under the hypothesis that the diffusion on $\mathbb{R}^d$ induced by $\mathscr{L}_ε$ has a Gibbs invariant measure of the form $\exp \{-U(x)/ε\}dx$ for some smooth Morse potential function $U$, we provide the complete characterization of the multi-scale behavior of the solution $u_ε$ in the regime $ε\to0$. More precisely, we find the critical time scales $1\ll θ_ε^{(1)}\ll\cdots\ll θ_ε^{(q)}$ as $ε\to0$, and the kernels $R_t^{(p)}:M_0\times M_0\to\mathbb{R}_+$, where $M_0$ denotes the set of local minima of $U$, such that \[ \lim_{ε\to0}u_ε(tθ_ε^{(p)},\,x)=\sum_{m'\in M_0}R_t^{(p)}(m,\,m')u_0(m'), \] for all $t>0$ and $x$ in the domain of attraction of $m$ for the dynamical system $\dot{x}(t)=b(x(t))$. We then complete the characterization of the solution $u_ε$ by computing the exact asymptotic limit of the solution between time scales $θ_ε^{(p)}$ and $θ_ε^{(p+1)}$ for each $p$, where $θ_ε^{(0)}=1$ and $θ_ε^{(q+1)}=\infty$. Our proof relies on the full tree-structure characterization of the metastable behavior in different time-scales of the diffusion induced by $\mathscr{L}_ε$. This result can be regarded as the precise refinement of Freidlin-Wentzell theory which was not known for more than half a century.

Submitted 12 February, 2024; originally announced February 2024.

Comments: 100 pages, 8 figures

MSC Class: 35K15; 60J60

arXiv:2402.07438 [pdf, ps, other]

The Powell Conjecture for the genus-three Heegaard splitting of the $3$-sphere

Authors: Sangbum Cho, Yuya Koda, Jung Hoon Lee

Abstract: The Powell Conjecture states that four specific elements suffice to generate the Goeritz group of the Heegaard splitting of the $3$-sphere. We present an alternative proof of the Powell Conjecture when the genus of the splitting is $3$, and suggest a strategy for the case of higher genera.

Submitted 12 February, 2024; originally announced February 2024.

Comments: 9 pages, 3 figures

MSC Class: 57K30

arXiv:2402.05380 [pdf, ps, other]

Generators for the cohomology of the moduli space of irregular parabolic Higgs bundles

Authors: Jia Choon Lee, Sukjoo Lee

Abstract: We prove that the pure part of the cohomology ring of the moduli space of irregular $\underlineξ$-parabolic Higgs bundles is generated by the Künneth components of the Chern classes of a universal bundle and the Chern classes of the successive quotients of a universal flag of subbundles. As an application, in the regular full-flag case, we demonstrate a similar result for the cohomology ring of the moduli spaces of parabolic and strongly parabolic Higgs bundles.

Submitted 7 February, 2024; originally announced February 2024.

Comments: 22 pages

arXiv:2402.04512 [pdf, ps, other]

Multiplicative Thom-Sebastiani for Bernstein-Sato polynomials

Authors: Jonghyun Lee

Abstract: We show that if $f\in \mathcal{O}_X(X)$ and $g\in \mathcal{O}_Y(Y)$ are nonzero regular functions on smooth complex algebraic varieties $X$ and $Y$, then the Bernstein-Sato polynomial of the product function $fg \in \mathcal{O}_{X\times Y}(X \times Y)$ is given by $b_{fg}(s)=b_f(s)b_g(s)$. This answers a question of Budur in \cite{Bud12} and of Popa in \cite{Pop21}.

Submitted 14 June, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

Comments: v3; added references to the survey [Bud12] and the paper [SZ24], which gives another proof of our main result

MSC Class: 14F10 (Primary)

arXiv:2402.02717 [pdf, other]

Minimal grid diagrams of the prime knots with crossing number 13 and arc index 13

Authors: Hwa Jeong Lee, Yoonsang Lee, Chanmin Lee, Yeseo Park, Hun Kim, Gyo Taek Jin

Abstract: We give a list of minimal grid diagrams of the 13 crossing prime nonalternating knots which have arc index 13. There are 9,988 prime knots with crossing number 13. Among them 4,878 are alternating and have arc index 15. Among the other nonalternating knots, 49, 399, 1,412 and 3,250 have arc index 10, 11, 12, and 13, respectively. We used the Dowker-Thistlethwaite code of the 3,250 knots provided by the program Knotscape to generate spanning trees of the corresponding knot diagrams to obtain minimal arc presentations in the form of grid diagrams.

Submitted 4 February, 2024; originally announced February 2024.

Comments: 57 pages, 5 figures, 1 table and 3250 grid diagrams

MSC Class: 57K10

arXiv:2402.01066 [pdf, ps, other]

On the Hardness of Short and Sign-Compatible Circuit Walks

Authors: Steffen Borgwardt, Weston Grewe, Sean Kafer, Jon Lee, Laura Sanità

Abstract: The circuits of a polyhedron are a superset of its edge directions. Circuit walks, a sequence of steps along circuits, generalize edge walks and are "short" if they have few steps or small total length. Both interpretations of short are relevant to the theory and application of linear programming. We study the hardness of several problems relating to the construction of short circuit walks. We establish that for a pair of vertices of a $0/1$-network-flow polytope, it is NP-complete to determine the length of a shortest circuit walk, even if we add the requirement that the walk must be sign-compatible. Our results also imply that determining the minimal number of circuits needed for a sign-compatible decomposition is NP-complete. Further, we show that it is NP-complete to determine the smallest total length (for $p$-norms $\lVert \cdot \rVert_p$, $1 < p \leq \infty$) of a circuit walk between a pair of vertices. One method to construct a short circuit walk is to pick up a correct facet at each step, which generalizes a non-revisiting walk. We prove that it is NP-complete to determine if there is a circuit direction that picks up a correct facet; in contrast, this problem can be solved in polynomial time for TU polyhedra.

Submitted 8 March, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

MSC Class: 52B05; 68Q25; 90C60

arXiv:2401.17540 [pdf, other]

Good and Fast Row-Sparse ah-Symmetric Reflexive Generalized Inverses

Authors: Gabriel Ponte, Marcia Fampa, Jon Lee, Luze Xu

Abstract: We present several algorithms aimed at constructing sparse and structured sparse (row-sparse) generalized inverses, with application to the efficient computation of least-squares solutions, for inconsistent systems of linear equations, in the setting of multiple right-hand sides and a rank-deficient constraint matrix. Leveraging our earlier formulations to minimize the 1- and 2,1- norms of generalized inverses that satisfy important properties of the Moore-Penrose pseudoinverse, we develop efficient and scalable ADMM algorithms to address these norm-minimization problems and to limit the number of nonzero rows in the solution. We establish a 2,1-norm approximation result for a local-search procedure that was originally designed for 1-norm minimization, and we compare the ADMM algorithms with the local-search procedure and with general-purpose optimization solvers.

Submitted 25 June, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

arXiv:2401.16716 [pdf, ps, other]

A parameter-free approach for solving SOS-convex semi-algebraic fractional programs

Authors: Chengmiao Yang, Liguo Jiao, Jae Hyoung Lee

Abstract: In this paper, we study a class of nonsmooth fractional programs (FP, for short) with SOS-convex semi-algebraic functions. Under suitable assumptions, we derive a strong duality result between the problem (FP) and its semidefinite programming (SDP) relaxations. Remarkably, we extract an optimal solution of the problem (FP) by solving one and only one associated SDP problem. Numerical examples are also given.

Submitted 29 January, 2024; originally announced January 2024.

Comments: 22 pages

MSC Class: 90C32; 90C22; 90C23

arXiv:2401.14839 [pdf, ps, other]

Well-posedness, magnetic helicity conservation, inviscid limit and asymptotic stability for the generalized Navier-Stokes-Maxwell equations

Authors: Kyungkeun Kang, Jihoon Lee, Dinh Duong Nguyen

Abstract: This paper is devoted to studying the well-posedness, conservation of magnetic helicity, inviscid limit and asymptotic stability of the generalized Navier-Stokes-Maxwell (NSM) equations with the standard Ohm's law in $\mathbb{R}^d$ for $d \in \{2,3\}$. More precisely, the global well-posedness is established in the case of fractional Laplacian velocity $(-Δ)^αv$ with $α= \frac{d}{2}$ for suitable data. In addition, the local well-posedness in the inviscid case is also provided for sufficiently smooth data, which allows us to study the inviscid limit of associated positive viscosity solutions in the case $α= 1$, where an explicit bound on the difference is given. Furthermore, in three dimensions, if the initial data satisfies further suitable conditions, then magnetic helicity is conserved as the electric conductivity goes to infinity. On the other hand, in the case $α= 0$ the stability near a magnetohydrostatic equilibrium with a constant (or equivalently bounded) magnetic field is also obtained, in which nonhomogeneous Sobolev norms of the velocity and electric fields, and for $p \in (2,\infty]$ the $L^p$ norm of the magnetic field, converge to zero as time goes to infinity with an implicit rate. In this velocity damping case, the situation is different both in the case of the two and a half, and three-dimensional (Hall)-magnetohydrodynamics ((H)-MHD) system, where an explicit rate of convergence in infinite time is computed for both the velocity and magnetic fields in nonhomogeneous Sobolev norms. Therefore, it seems that there is a gap between NSM and MHD in terms of the norm convergence of the magnetic field and the rate of decay in time, even though the latter equations can be proved to be a limiting system of the former in the sense of distributions as the speed of light tends to infinity.

Submitted 15 June, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

MSC Class: 35Q35; 35Q60; 76D03; 76W05; 78A25

Showing 1–50 of 904 results for author: Lee, J