Solving Moving Sofa Problem Using Calculus of Variations

Abstract: In 1966, Leo Moser introduced the "moving sofa problem," which seeks to determine the largest area of a shape that can be maneuvered through a 90-degree hallway of unit width. This problem remains open. In this paper, we employ the calculus of variations to solve this problem. Assuming the trajectories and envelopes are convex, the sofa's area is formulated as an integral functional on a set of parametric equations for curves. The final shape is determined by solving the Euler-Lagrange equations. Utilizing numerical methods, we obtain the non-trivial area of 2.2195316, consistent with Gerver's constant, known since 1992. We prove that both Gerver's sofa and Romik's car satisfy the Euler-Lagrange equations, the necessary condition for maximal area. We also explore additional cases and asymmetric conditions, and discuss other variant problems.

Submitted 2 July, 2024; originally announced July 2024.

Comments: 37 pages main body; 15 figures; 8 appendices-Mathematica notebooks

MSC Class: 49Q10; 49K15

arXiv:2406.02643 [pdf, ps, other]

Seymour and Woodall's conjecture holds for graphs with independence number two

Authors: Rong Chen, Zijian Deng

Abstract: Woodall (and Seymour independently) in 2001 proposed a conjecture that every graph $G$ contains every complete bipartite graph on $χ(G)$ vertices as a minor, where $χ(G)$ is the chromatic number of $G$. In this paper, we prove that for each positive integer $\ell$ with $2\ell \leq χ(G)$, each graph $G$ with independence number two contains a $K^{\ell}_{\ell,χ(G)-\ell}$-minor, implying that Seymour and Woodall's conjecture holds for graphs with independence number two, where $K^{\ell}_{\ell,χ(G)-\ell}$ is the graph obtained from $K_{\ell,χ(G)-\ell}$ by making every pair of vertices on the side of the bipartition of size $\ell$ adjacent.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2402.03541 [pdf, other]

HAMLET: Graph Transformer Neural Operator for Partial Differential Equations

Authors: Andrey Bryutkin, Jiahao Huang, Zhongying Deng, Guang Yang, Carola-Bibiane Schönlieb, Angelica Aviles-Rivero

Abstract: We present a novel graph transformer framework, HAMLET, designed to address the challenges in solving partial differential equations (PDEs) using neural networks. The framework uses graph transformers with modular input encoders to directly incorporate differential equation information into the solution process. This modularity enhances parameter correspondence control, making HAMLET adaptable to PDEs of arbitrary geometries and varied input formats. Notably, HAMLET scales effectively with increasing data complexity and noise, showcasing its robustness. HAMLET is not just tailored to a single type of physical simulation, but can be applied across various domains. Moreover, it boosts model resilience and performance, especially in scenarios with limited data. We demonstrate, through extensive experiments, that our framework is capable of outperforming current techniques for PDEs.

Submitted 5 February, 2024; originally announced February 2024.

Comments: 17 pages, 7 figures, 6 tables

arXiv:2401.08330 [pdf, other]

Boosting Gradient Ascent for Continuous DR-submodular Maximization

Authors: Qixin Zhang, Zongqi Wan, Zengde Deng, Zaiyi Chen, Xiaoming Sun, Jialin Zhang, Yu Yang

Abstract: Projected Gradient Ascent (PGA) is the most commonly used optimization scheme in machine learning and operations research areas. Nevertheless, numerous studies and examples have shown that the PGA methods may fail to achieve the tight approximation ratio for continuous DR-submodular maximization problems. To address this challenge, we present a boosting technique in this paper, which can efficiently improve the approximation guarantee of the standard PGA to optimal with only small modifications on the objective function. The fundamental idea of our boosting technique is to exploit non-oblivious search to derive a novel auxiliary function $F$, whose stationary points are excellent approximations to the global maximum of the original DR-submodular objective $f$. Specifically, when $f$ is monotone and $γ$-weakly DR-submodular, we propose an auxiliary function $F$ whose stationary points can provide a better $(1-e^{-γ})$-approximation than the $(γ^2/(1+γ^2))$-approximation guaranteed by the stationary points of $f$ itself. Similarly, for the non-monotone case, we devise another auxiliary function $F$ whose stationary points can achieve an optimal $\frac{1-\min_{\boldsymbol{x}\in\mathcal{C}}\|\boldsymbol{x}\|_{\infty}}{4}$-approximation guarantee where $\mathcal{C}$ is a convex constraint set. In contrast, the stationary points of the original non-monotone DR-submodular function can be arbitrarily bad \citep{chen2023continuous}. Furthermore, we demonstrate the scalability of our boosting technique on four problems. In all of these four problems, our resulting variants of boosting PGA algorithm beat the previous standard PGA in several aspects such as approximation ratio and efficiency. Finally, we corroborate our theoretical findings with numerical experiments, which demonstrate the effectiveness of our boosting PGA methods.

Submitted 16 January, 2024; originally announced January 2024.

Comments: 74 pages, 6 figures and 9 tables. An extended version of Stochastic Continuous Submodular Maximization: Boosting via Non-oblivious Function (ICML 2022)

arXiv:2312.01273 [pdf, other]

An Augmented Lagrangian Primal-Dual Semismooth Newton Method for Multi-Block Composite Optimization

Authors: Zhanwang Deng, Kangkang Deng, Jiang Hu, Zaiwen Wen

Abstract: In this paper, we develop a novel primal-dual semismooth Newton method for solving linearly constrained multi-block convex composite optimization problems. First, a differentiable augmented Lagrangian (AL) function is constructed by utilizing the Moreau envelopes of the nonsmooth functions. It enables us to derive an equivalent saddle point problem and establish the strong AL duality under the Slater's condition. Consequently, a semismooth system of nonlinear equations is formulated to characterize the optimality of the original problem instead of the inclusion-form KKT conditions. We then develop a semismooth Newton method, called ALPDSN, which uses purely second-order steps and a nonmonotone line search based globalization strategy. Through a connection to the inexact first-order steps when the regularization parameter is sufficiently large, the global convergence of ALPDSN is established. Under the regularity conditions, partial smoothness, the local error bound, and the strict complementarity, we show that both the primal and the dual iteration sequences possess a superlinear convergence rate and provide concrete examples where these regularity conditions are met. Numerical results on the image restoration with two regularization terms and the corrected tensor nuclear norm problem are presented to demonstrate the high efficiency and robustness of our ALPDSN.

Submitted 15 May, 2024; v1 submitted 2 December, 2023; originally announced December 2023.

Comments: 27 pages

arXiv:2305.09934 [pdf, ps, other]

doi 10.1007/s10898-023-01290-z

New semidefinite relaxations for a class of complex quadratic programming problems

Authors: Yingzhe Xu, Cheng Lu, Zhibin Deng, Ya-Feng Liu

Abstract: In this paper, we propose some new semidefinite relaxations for a class of nonconvex complex quadratic programming problems, which widely appear in the areas of signal processing and power system. By deriving new valid constraints to the matrix variables in the lifted space, we derive some enhanced semidefinite relaxations of the complex quadratic programming problems. Then, we compare the proposed semidefinite relaxations with existing ones and show that the newly proposed semidefinite relaxations could be strictly tighter than the previous ones. Moreover, the proposed semidefinite relaxations can be applied to more general cases of complex quadratic programming problems, whereas the previous ones are only designed for special cases. Numerical results indicate that the proposed semidefinite relaxations not only provide tighter relaxation bounds but also improve some existing approximation algorithms by finding better sub-optimal solutions.

Submitted 16 May, 2023; originally announced May 2023.

Comments: 24 pages, 1 figure

arXiv:2304.10842 [pdf, other]

Residual-Based Multi-peak Sampling Algorithm in Inverse Problems of Dynamical Systems

Authors: Xiao-Kai An, Lin Du, Zi-Chen Deng, Yu-jia Zhang

Abstract: Stochastic differential equations can describe a wide range of dynamical systems, and obtaining the governing equations of these systems is the premise of studying the nonlinear dynamic behavior of the system. Neural networks are currently the most popular approach in the inverse problem of dynamical systems. In order to obtain accurate dynamical equations, neural networks need a large amount of trajectory data as a training set. To address this shortcoming, we propose a residual-based multi-peak sampling algorithm. We evaluate the training results of each epoch of the neural network, calculate the probability density function $P(x)$ of the residual, perform sampling where $P(x)$ is large, and add samples to the training set to retrain the neural network. In order to prevent the neural network from falling into the trap of overfitting, we discretize the sampling points. We conduct case studies using two classical nonlinear dynamical systems and perform bifurcation and first escape probability analyses of the fitted equations. Results show that our proposed sampling strategy requires only 20$\sim$30\% of the sample points of the original method to reconstruct the stochastic dynamical behavior of the system. Finally, the algorithm is tested by adding interference noise to the data, and the results show that the sampling strategy has better numerical robustness and stability.

Submitted 24 April, 2023; v1 submitted 21 April, 2023; originally announced April 2023.

arXiv:2303.17779 [pdf, other]

Decentralized Weakly Convex Optimization Over the Stiefel Manifold

Authors: **xin Wang, Jiang Hu, Shixiang Chen, Zengde Deng, Anthony Man-Cho So

Abstract: We focus on a class of non-smooth optimization problems over the Stiefel manifold in the decentralized setting, where a connected network of $n$ agents cooperatively minimize a finite-sum objective function with each component being weakly convex in the ambient Euclidean space. Such optimization problems, albeit frequently encountered in applications, are quite challenging due to their non-smoothness and non-convexity. To tackle them, we propose an iterative method called the decentralized Riemannian subgradient method (DRSM). The global convergence and an iteration complexity of $\mathcal{O}(\varepsilon^{-2} \log^2(\varepsilon^{-1}))$ for forcing a natural stationarity measure below $\varepsilon$ are established via the powerful tool of proximal smoothness from variational analysis, which could be of independent interest. Besides, we show the local linear convergence of the DRSM using geometrically diminishing stepsizes when the problem at hand further possesses a sharpness property. Numerical experiments are conducted to corroborate our theoretical findings.

Submitted 30 March, 2023; originally announced March 2023.

Comments: 27 pages, 6 figures, 1 table

arXiv:2303.03254 [pdf, ps, other]

An Online Algorithm for Chance Constrained Resource Allocation

Authors: Yuwei Chen, Zengde Deng, Yinzhi Zhou, Zaiyi Chen, Yujie Chen, Haoyuan Hu

Abstract: This paper studies the online stochastic resource allocation problem (RAP) with chance constraints. The online RAP is a 0-1 integer linear programming problem where the resource consumption coefficients are revealed column by column along with the corresponding revenue coefficients. When a column is revealed, the corresponding decision variables are determined instantaneously without future information. Moreover, in online applications, the resource consumption coefficients are often obtained by prediction. To model their uncertainties, we take the chance constraints into the consideration. To the best of our knowledge, this is the first time chance constraints are introduced in the online RAP problem. Assuming that the uncertain variables have known Gaussian distributions, the stochastic RAP can be transformed into a deterministic but nonlinear problem with integer second-order cone constraints. Next, we linearize this nonlinear problem and analyze the performance of vanilla online primal-dual algorithm for solving the linearized stochastic RAP. Under mild technical assumptions, the optimality gap and constraint violation are both on the order of $\sqrt{n}$. Then, to further improve the performance of the algorithm, several modified online primal-dual algorithms with heuristic corrections are proposed. Finally, extensive numerical experiments on both synthetic and real data demonstrate the applicability and effectiveness of our methods.

Submitted 6 March, 2023; originally announced March 2023.

Comments: 5 pages, 5 figures. Accepted to ICASSP 2023. arXiv admin note: substantial text overlap with arXiv:2203.16818

arXiv:2208.08681 [pdf, other]

Communication-Efficient Decentralized Online Continuous DR-Submodular Maximization

Authors: Qixin Zhang, Zengde Deng, Xiangru Jian, Zaiyi Chen, Haoyuan Hu, Yu Yang

Abstract: Maximizing a monotone submodular function is a fundamental task in machine learning, economics, and statistics. In this paper, we present two communication-efficient decentralized online algorithms for the monotone continuous DR-submodular maximization problem, both of which reduce the number of per-function gradient evaluations and per-round communication complexity from $T^{3/2}$ to $1$. The first one, One-shot Decentralized Meta-Frank-Wolfe (Mono-DMFW), achieves a $(1-1/e)$-regret bound of $O(T^{4/5})$. As far as we know, this is the first one-shot and projection-free decentralized online algorithm for monotone continuous DR-submodular maximization. Next, inspired by the non-oblivious boosting function \citep{zhang2022boosting}, we propose the Decentralized Online Boosting Gradient Ascent (DOBGA) algorithm, which attains a $(1-1/e)$-regret of $O(\sqrt{T})$. To the best of our knowledge, this is the first result to obtain the optimal $O(\sqrt{T})$ against a $(1-1/e)$-approximation with only one gradient inquiry for each local objective function per step. Finally, various experimental results confirm the effectiveness of the proposed methods.

Submitted 18 August, 2022; originally announced August 2022.

Comments: 37 pages, 7 figures, 2 tables

arXiv:2208.07632 [pdf, other]

Online Learning for Non-monotone Submodular Maximization: From Full Information to Bandit Feedback

Authors: Qixin Zhang, Zengde Deng, Zaiyi Chen, Kuangqi Zhou, Haoyuan Hu, Yu Yang

Abstract: In this paper, we revisit the online non-monotone continuous DR-submodular maximization problem over a down-closed convex set, which finds wide real-world applications in the domain of machine learning, economics, and operations research. At first, we present the Meta-MFW algorithm achieving a $1/e$-regret of $O(\sqrt{T})$ at the cost of $T^{3/2}$ stochastic gradient evaluations per round. As far as we know, Meta-MFW is the first algorithm to obtain $1/e$-regret of $O(\sqrt{T})$ for the online non-monotone continuous DR-submodular maximization problem over a down-closed convex set. Furthermore, in sharp contrast with ODC algorithm \citep{thang2021online}, Meta-MFW relies on the simple online linear oracle without discretization, lifting, or rounding operations. Considering the practical restrictions, we then propose the Mono-MFW algorithm, which reduces the per-function stochastic gradient evaluations from $T^{3/2}$ to 1 and achieves a $1/e$-regret bound of $O(T^{4/5})$. Next, we extend Mono-MFW to the bandit setting and propose the Bandit-MFW algorithm which attains a $1/e$-regret bound of $O(T^{8/9})$. To the best of our knowledge, Mono-MFW and Bandit-MFW are the first sublinear-regret algorithms to explore the one-shot and bandit setting for online non-monotone continuous DR-submodular maximization problem over a down-closed convex set, respectively. Finally, we conduct numerical experiments on both synthetic and real-world datasets to verify the effectiveness of our methods.

Submitted 16 August, 2022; originally announced August 2022.

Comments: 31 pages, 6 figures, 3 tables

arXiv:2206.13497 [pdf, other]

Robustness Implies Generalization via Data-Dependent Generalization Bounds

Authors: Kenji Kawaguchi, Zhun Deng, Kyle Luh, Jiaoyang Huang

Abstract: This paper proves that robustness implies generalization via data-dependent generalization bounds. As a result, robustness and generalization are shown to be connected closely in a data-dependent manner. Our bounds improve previous bounds in two directions, to solve an open problem that has seen little development since 2010. The first is to reduce the dependence on the covering number. The second is to remove the dependence on the hypothesis space. We present several examples, including ones for lasso and deep learning, in which our bounds are provably preferable. The experiments on real-world data and theoretical models demonstrate near-exponential improvements in various situations. To achieve these improvements, we do not require additional assumptions on the unknown distribution; instead, we only incorporate an observable and computable property of the training samples. A key technical innovation is an improved concentration bound for multinomial random variables that is of independent interest beyond robustness and generalization.

Submitted 3 August, 2022; v1 submitted 27 June, 2022; originally announced June 2022.

Comments: Accepted by ICML 2022, and selected for ICML long presentation (top 2% of submissions)

arXiv:2204.01486 [pdf, ps, other]

Bayesian approach for limited-aperture inverse acoustic scattering with total variation prior

Authors: Xiao-Mei Yang, Zhi-Liang Deng, Ailin Qian

Abstract: In this work, we apply the Bayesian approach for the acoustic scattering problem to reconstruct the shape of a sound-soft obstacle using the limited-aperture far-field measure data. A novel total variation prior is assigned to the shape parameterization form. This prior is imposed on the Fourier coefficients of the parameterized form of the obstacle. Extensive numerical tests are provided to illustrate the numerical performance.

Submitted 19 March, 2022; originally announced April 2022.

arXiv:2203.16818 [pdf, other]

Online Primal-Dual Algorithms For Stochastic Resource Allocation Problems

Authors: Yuwei Chen, Zengde Deng, Zaiyi Chen, Yinzhi Zhou, Yujie Chen, Haoyuan Hu

Abstract: This paper studies the online stochastic resource allocation problem (RAP) with chance constraints and conditional expectation constraints. The online RAP is an integer linear programming problem where resource consumption coefficients are revealed column by column along with the corresponding revenue coefficients. When a column is revealed, the corresponding decision variables are determined instantaneously without future information. In online applications, the resource consumption coefficients are often obtained by prediction. An application for such scenario rises from the online order fulfilment task. When the timeliness constraints are considered, the coefficients are generated by the prediction for the transportation time from origin to destination. To model their uncertainties, we take the chance constraints and conditional expectation constraints into the consideration. Assuming that the uncertain variables have known Gaussian distributions, the stochastic RAP can be transformed into a deterministic but nonlinear problem with integer second-order cone constraints. Next, we linearize this nonlinear problem and theoretically analyze the performance of vanilla online primal-dual algorithm for solving the linearized stochastic RAP. Under mild technical assumptions, the optimality gap and constraint violation are both on the order of $\sqrt{n}$. Then, to further improve the performance of the algorithm, several modified online primal-dual algorithms with heuristic corrections are proposed. Finally, extensive numerical experiments demonstrate the applicability and effectiveness of our methods.

Submitted 31 March, 2022; originally announced March 2022.

Comments: 22 pages, 6 figures, 2 tables

arXiv:2203.14771 [pdf, other]

Bayesian inverse problems using homotopy

Authors: Xiao-Mei Yang, Zhi-Liang Deng

Abstract: In solving Bayesian inverse problems, it is often desirable to use a common density parameterization to denote the prior and posterior. Typically we seek a density from the same family as the prior which closely approximates the true posterior. As one of the most important classes of distributions in statistics, the exponential family is considered as the parameterization. The optimal parameter values for representing the approximated posterior are achieved by minimizing the deviation between the parameterized density and a homotopy that deforms the prior density into the posterior density. Rather than trying to solve the original problem, it is exactly converted into a corresponding system of explicit ordinary first-order differential equations. Solving this system over a finite 'time' interval yields the desired optimal density parameters. This method is proven to be effective by some numerical examples.

Submitted 28 March, 2022; originally announced March 2022.

arXiv:2201.00703 [pdf, other]

Stochastic Continuous Submodular Maximization: Boosting via Non-oblivious Function

Authors: Qixin Zhang, Zengde Deng, Zaiyi Chen, Haoyuan Hu, Yu Yang

Abstract: In this paper, we revisit Stochastic Continuous Submodular Maximization in both offline and online settings, which can benefit wide applications in machine learning and operations research areas. We present a boosting framework covering gradient ascent and online gradient ascent. The fundamental ingredient of our methods is a novel non-oblivious function $F$ derived from a factor-revealing optimization problem, whose any stationary point provides a $(1-e^{-γ})$-approximation to the global maximum of the $γ$-weakly DR-submodular objective function $f\in C^{1,1}_L(\mathcal{X})$. Under the offline scenario, we propose a boosting gradient ascent method achieving $(1-e^{-γ}-ε^{2})$-approximation after $O(1/ε^2)$ iterations, which improves the $(\frac{γ^2}{1+γ^2})$ approximation ratio of the classical gradient ascent algorithm. In the online setting, for the first time we consider the adversarial delays for stochastic gradient feedback, under which we propose a boosting online gradient algorithm with the same non-oblivious function $F$. Meanwhile, we verify that this boosting online algorithm achieves a regret of $O(\sqrt{D})$ against a $(1-e^{-γ})$-approximation to the best feasible solution in hindsight, where $D$ is the sum of delays of gradient feedback. To the best of our knowledge, this is the first result to obtain $O(\sqrt{T})$ regret against a $(1-e^{-γ})$-approximation with $O(1)$ gradient inquiry at each time step, when no delay exists, i.e., $D=T$. Finally, numerical experiments demonstrate the effectiveness of our boosting methods.

Submitted 10 June, 2022; v1 submitted 3 January, 2022; originally announced January 2022.

Comments: Accepted to ICML 2022. 29 pages, 5 figures, 2 tables

arXiv:2106.14836 [pdf, other]

doi 10.1162/neco_a_01483

Understanding Dynamics of Nonlinear Representation Learning and Its Application

Authors: Kenji Kawaguchi, Linjun Zhang, Zhun Deng

Abstract: Representations of the world environment play a crucial role in artificial intelligence. It is often inefficient to conduct reasoning and inference directly in the space of raw sensory representations, such as pixel values of images. Representation learning allows us to automatically discover suitable representations from raw sensory data. For example, given raw sensory data, a deep neural network learns nonlinear representations at its hidden layers, which are subsequently used for classification (or regression) at its output layer. This happens implicitly during training through minimizing a supervised or unsupervised loss in common practical regimes of deep learning, unlike the neural tangent kernel (NTK) regime. In this paper, we study the dynamics of such implicit nonlinear representation learning, which is beyond the NTK regime. We identify a pair of a new assumption and a novel condition, called the common model structure assumption and the data-architecture alignment condition. Under the common model structure assumption, the data-architecture alignment condition is shown to be sufficient for the global convergence and necessary for the global optimality. Moreover, our theory explains how and when increasing the network size does and does not improve the training behaviors in the practical regime. Our results provide practical guidance for designing a model structure: e.g., the common model structure assumption can be used as a justification for using a particular model structure instead of others. We also derive a new training framework based on the theory. The proposed framework is empirically shown to maintain competitive (practical) test performances while providing global convergence guarantees for deep residual neural networks with convolutions, skip connections, and batch normalization with standard benchmark datasets, including CIFAR-10, CIFAR-100, and SVHN.

Submitted 9 April, 2022; v1 submitted 28 June, 2021; originally announced June 2021.

Journal ref: Neural computation, volume 34, pages 991-1018 (2022)

arXiv:2101.03765 [pdf, other]

A Bayesian level set method for an inverse medium scattering problem in acoustics

Authors: J. Huang, Z. Deng, L. Xu

Abstract: In this work, we are interested in the determination of the shape of the scatterer for two-dimensional time-harmonic inverse medium scattering problems in acoustics. The scatterer is assumed to be a piecewise constant function with a known value inside inhomogeneities, and its shape is represented by level set functions, which we investigate using the Bayesian method. In the Bayesian framework, the solution of the geometric inverse problem is defined as a posterior probability distribution. The well-posedness of the posterior distribution is discussed, and Markov chain Monte Carlo (MCMC) methods are applied to generate samples from the resulting posterior distribution. Numerical experiments are presented to demonstrate the effectiveness of the proposed method.

Submitted 11 January, 2021; originally announced January 2021.

arXiv:2008.12953 [pdf, other]

Sparse High-Order Portfolios via Proximal DCA and SCA

Authors: **xin Wang, Zengde Deng, Taoli Zheng, Anthony Man-Cho So

Abstract: In this paper, we aim at solving the cardinality constrained high-order portfolio optimization, i.e., mean-variance-skewness-kurtosis model with cardinality constraint (MVSKC). Optimization for the MVSKC model is difficult in two respects. One is that the objective function is non-convex; the other is the combinatorial nature of the cardinality constraint, which leads to non-convexity as well as discontinuity. Based on the observation that the cardinality constraint has the difference-of-convex (DC) property, we transform the cardinality constraint into a penalty term and then propose three algorithms, including the proximal difference of convex algorithm (pDCA), pDCA with extrapolation (pDCAe) and the successive convex approximation (SCA), to handle the resulting penalized MVSK (PMVSK) formulation. Moreover, theoretical convergence results of these algorithms are established respectively. Numerical experiments on real datasets demonstrate the superiority of our proposed methods in obtaining high utility and sparse solutions as well as efficiency in terms of time usage.

Submitted 10 June, 2021; v1 submitted 29 August, 2020; originally announced August 2020.

Comments: ICASSP 2021

arXiv:2005.02356 [pdf, other]

doi 10.1109/TSP.2021.3099643

Manifold Proximal Point Algorithms for Dual Principal Component Pursuit and Orthogonal Dictionary Learning

Authors: Shixiang Chen, Zengde Deng, Shiqian Ma, Anthony Man-Cho So

Abstract: We consider the problem of maximizing the $\ell_1$ norm of a linear map over the sphere, which arises in various machine learning applications such as orthogonal dictionary learning (ODL) and robust subspace recovery (RSR). The problem is numerically challenging due to its nonsmooth objective and nonconvex constraint, and its algorithmic aspects have not been well explored. In this paper, we show how the manifold structure of the sphere can be exploited to design fast algorithms for tackling this problem. Specifically, our contribution is threefold. First, we present a manifold proximal point algorithm (ManPPA) for the problem and show that it converges at a sublinear rate. Furthermore, we show that ManPPA can achieve a quadratic convergence rate when applied to the ODL and RSR problems. Second, we propose a stochastic variant of ManPPA called StManPPA, which is well suited for large-scale computation, and establish its sublinear convergence rate. Both ManPPA and StManPPA have provably faster convergence rates than existing subgradient-type methods. Third, using ManPPA as a building block, we propose a new approach to solving a matrix analog of the problem, in which the sphere is replaced by the Stiefel manifold. The results from our extensive numerical experiments on the ODL and RSR problems demonstrate the efficiency and efficacy of our proposed methods.

Submitted 21 July, 2021; v1 submitted 5 May, 2020; originally announced May 2020.

Comments: Accepted in IEEE Transactions on Signal Processing

arXiv:1911.05047 [pdf, other]

Weakly Convex Optimization over Stiefel Manifold Using Riemannian Subgradient-Type Methods

Authors: Xiao Li, Shixiang Chen, Zengde Deng, Qing Qu, Zhihui Zhu, Anthony Man Cho So

Abstract: We consider a class of nonsmooth optimization problems over the Stiefel manifold, in which the objective function is weakly convex in the ambient Euclidean space. Such problems are ubiquitous in engineering applications but still largely unexplored. We present a family of Riemannian subgradient-type methods -- namely Riemannian subgradient, incremental subgradient, and stochastic subgradient methods -- to solve these problems and show that they all have an iteration complexity of ${\cal O}(\varepsilon^{-4})$ for driving a natural stationarity measure below $\varepsilon$. In addition, we establish the local linear convergence of the Riemannian subgradient and incremental subgradient methods when the problem at hand further satisfies a sharpness property and the algorithms are properly initialized and use geometrically diminishing stepsizes. To the best of our knowledge, these are the first convergence guarantees for using Riemannian subgradient-type methods to optimize a class of nonconvex nonsmooth functions over the Stiefel manifold. The fundamental ingredient in the proof of the aforementioned convergence results is a new Riemannian subgradient inequality for restrictions of weakly convex functions on the Stiefel manifold, which could be of independent interest. We also show that our convergence results can be extended to handle a class of compact embedded submanifolds of the Euclidean space. Finally, we discuss the sharpness properties of various formulations of the robust subspace recovery and orthogonal dictionary learning problems and demonstrate the convergence performance of the algorithms on both problems via numerical simulations.

Submitted 24 March, 2021; v1 submitted 12 November, 2019; originally announced November 2019.

Comments: 30 pages. Accepted to SIAM Journal on Optimization

MSC Class: 68Q25; 65K10; 90C90; 90C26; 90C06

arXiv:1907.12187 [pdf, other]

An ensemble Kalman filter approach based on level set parameterization for acoustic source identification using multiple frequency information

Authors: Zhiliang Deng, Xiaomei Yang

Abstract: The spatially dependent unknown acoustic source is reconstructed from noisy multiple-frequency data on a remote closed surface. Assume that the unknown function is supported on a bounded domain. To determine the support, we present a statistical inversion algorithm, which combines the ensemble Kalman filter approach with the level set technique. Several numerical examples show that the proposed method gives good numerical reconstructions.

Submitted 28 July, 2019; originally announced July 2019.

arXiv:1907.08660 [pdf, other]

A parametric Bayesian level set approach for acoustic source identification using multiple frequency information

Authors: Zhiliang Deng, Xiaomei Yang, Jiangfeng Huang

Abstract: The reconstruction of the unknown acoustic source is studied using noisy multiple-frequency data on a remote closed surface. Assume that the unknown source is coded in a spatially dependent piecewise constant function, whose support set is the target to be determined. In this setting, the unknown source can be formalized by a level set function. The function is explored with a Bayesian level set approach. To reduce the infinite-dimensional problem to finite dimension, we parameterize the level set function by the radial basis expansion. The well-posedness of the posterior distribution is proven. The posterior samples are generated according to the Metropolis-Hastings algorithm and the sample mean is used to approximate the unknown. Several shapes are tested to verify the effectiveness of the proposed algorithm. These numerical results show that the proposed algorithm is feasible and competitive with the Matérn random field for the acoustic source problem.

Submitted 19 July, 2019; originally announced July 2019.

arXiv:1907.03955 [pdf, ps, other]

Bayesian approach for inverse obstacle scattering with Poisson data

Authors: Xiaomei Yang, Zhiliang Deng

Abstract: We consider an acoustic obstacle reconstruction problem with Poisson data. Due to the stochastic nature of the data, we tackle this problem in the framework of Bayesian inversion. The unknown obstacle is parameterized in its angular form. The prior for the parameterized unknown plays a key role in the Bayes reconstruction algorithm. The most commonly used prior is the Gaussian. Under the Gaussian prior assumption, we further suppose that the unknown satisfies the total variation prior. With the hybrid prior, the well-posedness of the posterior distribution is discussed. The numerical examples verify the effectiveness of the proposed algorithm.

Submitted 8 July, 2019; originally announced July 2019.

Comments: 14 pages, 9 figures

MSC Class: 2010: 35R20; 65R20

arXiv:1905.12222 [pdf, ps, other]

Limited Aperture Inverse Scattering Problems using Bayesian Approach and Extended Sampling Method

Authors: Zhaoxiang Li, Zhiliang Deng, Jiguang Sun

Abstract: Inverse scattering problems have many important applications. In this paper, given limited aperture data, we propose a Bayesian method for the inverse acoustic scattering to reconstruct the shape of an obstacle. The inverse problem is formulated as a statistical model using Bayes' formula. The well-posedness is proved in the sense of the Hellinger metric. The extended sampling method is modified to provide the initial guess of the target location, which is critical to the fast convergence of the MCMC algorithm. An extensive numerical study is presented to illustrate the performance of the proposed method.

Submitted 29 May, 2019; originally announced May 2019.

arXiv:1903.05006 [pdf, other]

An Efficient Augmented Lagrangian Based Method for Constrained Lasso

Authors: Zengde Deng, Anthony Man-Cho So

Abstract: Variable selection is one of the most important tasks in statistics and machine learning. To incorporate more prior information about the regression coefficients, the constrained Lasso model has been proposed in the literature. In this paper, we present an inexact augmented Lagrangian method to solve the Lasso problem with linear equality constraints. By fully exploiting second-order sparsity of the problem, we are able to greatly reduce the computational cost and obtain highly efficient implementations. Furthermore, numerical results on both synthetic data and real data show that our algorithm is superior to existing first-order methods in terms of both running time and solution accuracy.

Submitted 12 March, 2019; originally announced March 2019.

arXiv:1808.01062 [pdf, ps, other]

Q-Hermite polynomials chaos approximation of likelihood function based on q-Gaussian prior in Bayesian inversion

Authors: Zhiliang Deng, Xiaomei Yang

Abstract: In real applications, the construction of the prior and the acceleration of posterior sampling are usually two key points of a Bayesian inversion algorithm for engineers. In this paper, the q-analogue of the Gaussian distribution, the q-Gaussian distribution, is introduced as the prior of inverse problems. An acceleration algorithm based on spectral likelihood approximation is also discussed. We mainly focus on the convergence of the posterior distribution in the sense of Kullback-Leibler divergence when an approximated likelihood function and a truncated prior distribution are used. Moreover, the convergence in the sense of total variation and Hellinger metric is obtained. Finally, two numerical examples are presented.

Submitted 2 August, 2018; originally announced August 2018.

arXiv:1704.04652 [pdf, ps, other]

doi 10.1109/TCYB.2018.2813431

Optimal Output Consensus of High-Order Multi-Agent Systems with Embedded Technique

Authors: Yutao Tang, Zhenhua Deng, Yiguang Hong

Abstract: In this paper, we study an optimal output consensus problem for a multi-agent network with agents in the form of multi-input multi-output minimum-phase dynamics. Optimal output consensus can be taken as an extended version of the existing output consensus problem for higher-order agents with an optimization requirement, where the output variables of agents are driven to achieve a consensus on the optimal solution of a global cost function. To solve this problem, we first construct an optimal signal generator, and then propose an embedded control scheme by embedding the generator in the feedback loop. We give two kinds of algorithms based on different available information along with both state feedback and output feedback, and prove that these algorithms with the embedded technique can guarantee the solvability of the problem for high-order multi-agent systems under standard assumptions.

Submitted 21 August, 2018; v1 submitted 15 April, 2017; originally announced April 2017.

Comments: 23 pages, 5 figures, accepted by IEEE Transactions on Cybernetics

arXiv:1610.09080 [pdf, ps, other]

Stability analysis of the numerical Method of characteristics applied to a class of energy-preserving systems. Part II: Nonreflecting boundary conditions

Authors: Taras I. Lakoba, Zihao Deng

Abstract: We show that imposition of non-periodic, in place of periodic, boundary conditions (BC) can alter stability of modes in the Method of characteristics (MoC) employing certain ordinary-differential equation (ODE) numerical solvers. Thus, using non-periodic BC may render some of the MoC schemes stable for most practical computations, even though they are unstable for periodic BC. This fact contradicts a statement, found in some literature, that an instability detected by the von Neumann analysis for a given numerical scheme implies an instability of that scheme with arbitrary (i.e., non-periodic) BC. We explain the mechanism behind this contradiction. We also show that, and explain why, for the MoC employing some other ODE solvers, stability of the modes may be unaffected by the BC.

Submitted 27 July, 2017; v1 submitted 28 October, 2016; originally announced October 2016.

Comments: 38 pages, 13 figures

arXiv:1610.09079 [pdf, ps, other]

Stability analysis of the numerical Method of characteristics applied to a class of energy-preserving systems. Part I: Periodic boundary conditions

Authors: Taras I. Lakoba, Zihao Deng

Abstract: We study numerical (in)stability of the Method of characteristics (MoC) applied to a system of non-dissipative hyperbolic partial differential equations (PDEs) with periodic boundary conditions. We consider three different solvers along the characteristics: simple Euler (SE), modified Euler (ME), and Leap-frog (LF). The two former solvers are well known to exhibit a mild, but unconditional, numerical instability for non-dissipative ordinary differential equations (ODEs). They are found to have a similar (or stronger, for the MoC-ME) instability when applied to non-dissipative PDEs. On the other hand, the LF solver is known to be stable when applied to non-dissipative ODEs. However, when applied to non-dissipative PDEs within the MoC framework, it was found to have by far the strongest instability among all three solvers. We also comment on the use of the fourth-order Runge--Kutta solver within the MoC framework.

Submitted 27 July, 2017; v1 submitted 28 October, 2016; originally announced October 2016.

Comments: 20 pages, 5 figures

arXiv:1309.7421 [pdf, ps, other]

An inverse problem of identifying the radiative coefficient in a degenerate parabolic equation

Authors: Zui-Cha Deng, Liu Yang

Abstract: This work investigates an inverse problem of determining the radiative coefficient in a degenerate parabolic equation from the final overspecified data. Unlike other inverse coefficient problems, in which the principal coefficients are assumed to be strictly positive definite, the mathematical model discussed in the paper belongs to the second-order parabolic equations with non-negative characteristic form, namely, there is degeneracy on the lateral boundaries of the domain. The uniqueness of the solution is obtained by the contraction mapping principle. Based on the optimal control framework, the problem is transformed into an optimization problem and the existence of the minimizer is established. After the necessary conditions which must be satisfied by the minimizer are deduced, the uniqueness and stability of the minimizer are proved. By minor modification of the cost functional and some \emph{a-priori} regularity conditions imposed on the forward operator, the convergence of the minimizer for the noisy input data is obtained in the paper. The results obtained in the paper are interesting and useful, and can be extended to more general degenerate parabolic equations.

Submitted 27 September, 2013; originally announced September 2013.

Comments: 36 pages

MSC Class: 35R30; 49J20

arXiv:1210.1284 [pdf, ps, other]

Factorization from an order-theoretic view 1&2

Authors: Zike Deng

Abstract: Drawing inspiration from Emmy Noether's set-theoretic foundations for algebra and Charles Ehresmann's topology without points, we adopt a new order-theoretic approach to ideal theory. For this we emphasize the order of divisibility in factorization and use it as a medium for relating algebra to topology. 1. Replacing principal ideals and their intersections by equivalence classes and their collections respectively, we transform integral divisorial ideals into B-ideals in order to provide an order-theoretic frame for treating decomposition dispensing with addition. The idea of a B-ideal is connected closely with generalized algebraicity, which originated from semantics for programming languages. 2. Since B-ideals constitute a complete lattice, we can utilize the fact that decomposition, which means that each element can be decomposed into the join of all elements way-below it, is equivalent to complete distributivity. B-ideals with decomposition theorems in themselves do not depend on algebraic structures and can be applied to any poset. 3. Closed-set lattice is cotopology based on multiplication and independent of a particular prime in the sense of pointless topology by Ehresmann. It differs from Zariski topology in using prime-powers rather than primes so that multiplicity in algebra acquires geometric meaning. 4. Factorial group is also a free module with multiplication instead of addition. Hence poset-theoretic constructions have corresponding algebraic analogues. They are introduced based on Noether's set-theoretic approach but quotient is within like a subset rather than outside.

Submitted 3 October, 2012; originally announced October 2012.

Comments: part1 includes 12 pages. part2 includes 13 pages

MSC Class: Primary 06A11; Secondary 13F15

Showing 1–32 of 32 results for author: Deng, Z