Search | arXiv e-print repository

Online Non-Stationary Stochastic Quasar-Convex Optimization

Abstract: Recent research has shown that quasar-convexity can be found in applications such as identification of linear dynamical systems and generalized linear models. Such observations have in turn spurred exciting developments in the design and analysis of algorithms that exploit quasar-convexity. In this work, we study online stochastic quasar-convex optimization problems in a dynamic environment. We establish regret bounds of online gradient descent in terms of cumulative path variation and cumulative gradient variance for losses satisfying quasar-convexity and strong quasar-convexity. We then apply the results to generalized linear models (GLMs) when the underlying parameter is time-varying. We establish regret bounds of online gradient descent when applied to GLMs with leaky ReLU activation function, logistic activation function, and ReLU activation function. Numerical results are presented to corroborate our findings.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2309.09411 [pdf, other]

Distributionally Time-Varying Online Stochastic Optimization under Polyak-Łojasiewicz Condition with Application in Conditional Value-at-Risk Statistical Learning

Authors: Yuen-Man Pun, Farhad Farokhi, Iman Shames

Abstract: In this work, we consider a sequence of stochastic optimization problems following a time-varying distribution via the lens of online optimization. Assuming that the loss function satisfies the Polyak-Łojasiewicz condition, we apply online stochastic gradient descent and establish its dynamic regret bound that is composed of cumulative distribution drifts and cumulative gradient biases caused by stochasticity. The distribution metric we adopt here is Wasserstein distance, which is well-defined without the absolute continuity assumption or with a time-varying support set. We also establish a regret bound of online stochastic proximal gradient descent when the objective function is regularized. Moreover, we show that the above framework can be applied to the Conditional Value-at-Risk (CVaR) learning problem. Particularly, we improve an existing proof on the discovery of the PL condition of the CVaR problem, resulting in a regret bound of online stochastic gradient descent.

Submitted 17 September, 2023; originally announced September 2023.

arXiv:2307.00210 [pdf, ps, other]

Projected Tensor Power Method for Hypergraph Community Recovery

Authors: Jinxin Wang, Yuen-Man Pun, Xiaolu Wang, Peng Wang, Anthony Man-Cho So

Abstract: This paper investigates the problem of exact community recovery in the symmetric $d$-uniform $(d \geq 2)$ hypergraph stochastic block model ($d$-HSBM). In this model, a $d$-uniform hypergraph with $n$ nodes is generated by first partitioning the $n$ nodes into $K\geq 2$ equal-sized disjoint communities and then generating hyperedges with a probability that depends on the community memberships of $d$ nodes. Despite the non-convex and discrete nature of the maximum likelihood estimation problem, we develop a simple yet efficient iterative method, called the \emph{projected tensor power method}, to tackle it. As long as the initialization satisfies a partial recovery condition in the logarithmic degree regime of the problem, we show that our proposed method can exactly recover the hidden community structure down to the information-theoretic limit with high probability. Moreover, our proposed method exhibits a competitive time complexity of $\mathcal{O}(n\log^2n/\log\log n)$ when the aforementioned initialization condition is met. We also conduct numerical experiments to validate our theoretical findings.

Submitted 30 June, 2023; originally announced July 2023.

Journal ref: Proceedings of the 40th International Conference on Machine Learning, Honolulu, Hawaii, USA. PMLR 202, 2023

arXiv:2112.11045 [pdf, other]

doi 10.1109/TSP.2021.3137953

Local Strong Convexity of Source Localization and Error Bound for Target Tracking under Time-of-Arrival Measurements

Authors: Yuen-Man Pun, Anthony Man-Cho So

Abstract: In this paper, we consider a time-varying optimization approach to the problem of tracking a moving target using noisy time-of-arrival (TOA) measurements. Specifically, we formulate the problem as that of sequential TOA-based source localization and apply online gradient descent (OGD) to it to generate the position estimates of the target. To analyze the tracking performance of OGD, we first revisit the classic least-squares formulation of the (static) TOA-based source localization problem and elucidate its estimation and geometric properties. In particular, under standard assumptions on the TOA measurement model, we establish a bound on the distance between an optimal solution to the least-squares formulation and the true target position. Using this bound, we show that the loss function in the formulation, albeit non-convex in general, is locally strongly convex at its global minima. To the best of our knowledge, these results are new and can be of independent interest. By combining them with existing techniques from online strongly convex optimization, we then establish the first non-trivial bound on the cumulative target tracking error of OGD. Our numerical results corroborate the theoretical findings and show that OGD can effectively track the target at different noise levels.

Submitted 21 December, 2021; originally announced December 2021.

Comments: Accepted for publication in IEEE Transactions on Signal Processing

arXiv:2105.05458 [pdf, other]

doi 10.1109/TSP.2022.3229950

Distributionally Robust Graph Learning from Smooth Signals under Moment Uncertainty

Authors: Xiaolu Wang, Yuen-Man Pun, Anthony Man-Cho So

Abstract: We consider the problem of learning a graph from a finite set of noisy graph signal observations, the goal of which is to find a smooth representation of the graph signal. Such a problem is motivated by the desire to infer relational structure in large datasets and has been extensively studied in recent years. Most existing approaches focus on learning a graph on which the observed signals are smooth. However, the learned graph is prone to overfitting, as it does not take the unobserved signals into account. To address this issue, we propose a novel graph learning model based on the distributionally robust optimization methodology, which aims to identify a graph that not only provides a smooth representation of but is also robust against uncertainties in the observed signals. On the statistics side, we establish out-of-sample performance guarantees for our proposed model. On the optimization side, we show that under a mild assumption on the graph signal distribution, our proposed model admits a smooth non-convex optimization formulation. We then develop a projected gradient method to tackle this formulation and establish its convergence guarantees. Our formulation provides a new perspective on regularization in the graph learning setting. Moreover, extensive numerical experiments on both synthetic and real-world data show that our model has comparable yet more robust performance across different populations of observed signals than existing non-robust models according to various metrics.

Submitted 24 November, 2021; v1 submitted 12 May, 2021; originally announced May 2021.

arXiv:1606.02291 [pdf]

On Deposition of the Product of Demazure Atoms and Demazure Characters

Authors: Anna Ying Pun

Abstract: This paper studies the properties of Demazure atoms and characters using linear operators and tableau combinatorics. It proves the atom-positivity property of the product of a dominating monomial and an atom, which was an open problem. Furthermore, it provides a combinatorial proof of the key-positivity property of the product of a dominating monomial and a key using skyline fillings, gives an algebraic proof of the key-positivity property of the product of a Schur function and a key using linear operators, and verifies the first open case of the conjecture on key-positivity of the product of two keys using linear operators and polytopes.

Submitted 7 June, 2016; originally announced June 2016.

Showing 1–6 of 6 results for author: Pun, Y