
Showing 1–4 of 4 results for author: Tsang, I W

  1. arXiv:2405.17761 [pdf, other]

    cs.LG math.OC

    Double Variance Reduction: A Smoothing Trick for Composite Optimization Problems without First-Order Gradient

    Authors: Hao Di, Haishan Ye, Yueling Zhang, Xiangyu Chang, Guang Dai, Ivor W. Tsang

    Abstract: Variance reduction techniques are designed to decrease the sampling variance, thereby accelerating convergence rates of first-order (FO) and zeroth-order (ZO) optimization methods. However, in composite optimization problems, ZO methods encounter an additional variance called the coordinate-wise variance, which stems from the random gradient estimation. To reduce this variance, prior works require…

    Submitted 27 May, 2024; originally announced May 2024.

  2. arXiv:2308.10547 [pdf, other]

    math.OC cs.LG eess.SY

    Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold

    Authors: Jun Chen, Haishan Ye, Mengmeng Wang, Tianxin Huang, Guang Dai, Ivor W. Tsang, Yong Liu

    Abstract: The conjugate gradient method is a crucial first-order optimization method that generally converges faster than the steepest descent method, and its computational cost is much lower than that of second-order methods. However, while various types of conjugate gradient methods have been studied in Euclidean spaces and on Riemannian manifolds, there is little study for those in distributed scenarios.…

    Submitted 12 March, 2024; v1 submitted 21 August, 2023; originally announced August 2023.

    Journal ref: International Conference on Learning Representations, 2024

  3. arXiv:2011.06446 [pdf, other]

    stat.CO cs.LG math.NA

    Subgroup-based Rank-1 Lattice Quasi-Monte Carlo

    Authors: Yueming Lyu, Yuan Yuan, Ivor W. Tsang

    Abstract: Quasi-Monte Carlo (QMC) is an essential tool for integral approximation, Bayesian inference, and sampling for simulation in science, etc. In the QMC area, the rank-1 lattice is important due to its simple operation, and nice properties for point set construction. However, the construction of the generating vector of the rank-1 lattice is usually time-consuming because of an exhaustive computer sea…

    Submitted 28 October, 2020; originally announced November 2020.

    Comments: NeurIPS 2020

  4. arXiv:1802.09932 [pdf, other]

    cs.LG cs.CV math.OC stat.ML

    VR-SGD: A Simple Stochastic Variance Reduction Method for Machine Learning

    Authors: Fanhua Shang, Kaiwen Zhou, Hongying Liu, James Cheng, Ivor W. Tsang, Lijun Zhang, Dacheng Tao, Licheng Jiao

    Abstract: In this paper, we propose a simple variant of the original SVRG, called variance reduced stochastic gradient descent (VR-SGD). Unlike the choices of snapshot and starting points in SVRG and its proximal variant, Prox-SVRG, the two vectors of VR-SGD are set to the average and last iterate of the previous epoch, respectively. The settings allow us to use much larger learning rates, and also make our…

    Submitted 28 October, 2018; v1 submitted 26 February, 2018; originally announced February 2018.

    Comments: 46 pages, 25 figures. IEEE Transactions on Knowledge and Data Engineering, accepted in October, 2018. arXiv admin note: substantial text overlap with arXiv:1704.04966
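
The abstract of result 4 says that VR-SGD sets its snapshot point to the average iterate of the previous epoch and its starting point to the last iterate. The following is a minimal sketch of that epoch rule only, not the authors' implementation: it assumes a toy one-dimensional least-squares objective, and the function name `vr_sgd_sketch`, its parameters, and the default step size are all hypothetical choices made for illustration.

```python
import random


def vr_sgd_sketch(a, b, lr=0.2, epochs=200, inner_steps=None, seed=0):
    """SVRG-style loop on f(x) = mean_i 0.5 * (a[i] * x - b[i])**2.

    Assumed epoch rule (taken from the abstract's description):
      snapshot for the next epoch = average of this epoch's iterates
      start of the next epoch     = last iterate of this epoch
    """
    rng = random.Random(seed)
    n = len(a)
    m = inner_steps or n  # inner-loop length; defaults to one pass over the data

    def grad_i(x, i):
        # Gradient of the i-th component function 0.5 * (a_i * x - b_i)^2.
        return a[i] * (a[i] * x - b[i])

    def full_grad(x):
        return sum(grad_i(x, i) for i in range(n)) / n

    snapshot = 0.0  # reference point of the variance-reduced estimator
    x = 0.0         # current iterate
    for _ in range(epochs):
        mu = full_grad(snapshot)  # full gradient at the snapshot
        running_sum = 0.0
        for _ in range(m):
            i = rng.randrange(n)
            # Variance-reduced stochastic gradient, as in SVRG.
            g = grad_i(x, i) - grad_i(snapshot, i) + mu
            x -= lr * g
            running_sum += x
        snapshot = running_sum / m  # average iterate becomes the new snapshot
        # x is left unchanged, so the next epoch starts from the last iterate
    return x


if __name__ == "__main__":
    # Consistent toy data whose exact minimizer is x = 2.
    print(vr_sgd_sketch([1.0, 1.2, 1.4], [2.0, 2.4, 2.8]))
```

In plain SVRG both the snapshot and the starting point are typically taken from the same iterate; the only change sketched here is decoupling them at the end of each epoch.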