Showing 1–10 of 10 results for author: Bi, J

Searching in archive math.
  1. arXiv:2406.19475  [pdf, other]

    math.OC cs.LG

    Stochastic First-Order Methods with Non-smooth and Non-Euclidean Proximal Terms for Nonconvex High-Dimensional Stochastic Optimization

    Authors: Yue Xie, Jiawen Bi, Hongcheng Liu

    Abstract: When the nonconvex problem is complicated by stochasticity, the sample complexity of stochastic first-order methods may depend linearly on the problem dimension, which is undesirable for large-scale problems. In this work, we propose dimension-insensitive stochastic first-order methods (DISFOMs) to address nonconvex optimization with expected-valued objective function. Our algorithms allow for non…

    Submitted 27 June, 2024; originally announced June 2024.

    MSC Class: 90C06; 90C15; 90C26; 90C30

  2. arXiv:2404.13259  [pdf, other]

    math.NA

    Structure-preserving weighted BDF2 methods for Anisotropic Cahn-Hilliard model: uniform/variable-time-steps

    Authors: Meng Li, Jingjiang Bi, Nan Wang

    Abstract: In this paper, we innovatively develop uniform/variable-time-step weighted and shifted BDF2 (WSBDF2) methods for the anisotropic Cahn-Hilliard (CH) model, combining the scalar auxiliary variable (SAV) approach with two types of stabilized techniques. Using the concept of $G$-stability, the uniform-time-step WSBDF2 method is theoretically proved to be energy-stable. Due to the inapplicability of th…

    Submitted 15 June, 2024; v1 submitted 20 April, 2024; originally announced April 2024.

  3. arXiv:2103.04413  [pdf, other]

    math.OC cs.LG stat.ML

    Escaping Saddle Points with Stochastically Controlled Stochastic Gradient Methods

    Authors: Guannan Liang, Qianqian Tong, Chunjiang Zhu, Jinbo Bi

    Abstract: Stochastically controlled stochastic gradient (SCSG) methods have been proved to converge efficiently to first-order stationary points which, however, can be saddle points in nonconvex optimization. It has been observed that a stochastic gradient descent (SGD) step introduces anisotropic noise around saddle points for deep learning and non-convex half space learning problems, which indicates that S…

    Submitted 23 April, 2021; v1 submitted 7 March, 2021; originally announced March 2021.

  4. arXiv:2102.09893  [pdf, other]

    cs.LG cs.AI math.OC

    A Variance Controlled Stochastic Method with Biased Estimation for Faster Non-convex Optimization

    Authors: Jia Bi, Steve R. Gunn

    Abstract: In this paper, we propose a new technique, variance controlled stochastic gradient (VCSG), to improve the performance of the stochastic variance reduced gradient (SVRG) algorithm. To avoid over-reducing the variance of the gradient by SVRG, a hyper-parameter $λ$ is introduced in VCSG that is able to control the reduced variance of SVRG. Theory shows that the optimization method can converge by…

    Submitted 19 February, 2021; originally announced February 2021.

  5. arXiv:2011.00667  [pdf, other]

    math.OC cs.DC

    Asynchronous Parallel Stochastic Quasi-Newton Methods

    Authors: Qianqian Tong, Guannan Liang, Xingyu Cai, Chunjiang Zhu, Jinbo Bi

    Abstract: Although first-order stochastic algorithms, such as stochastic gradient descent, have been the main force to scale up machine learning models, such as deep neural nets, the second-order quasi-Newton methods start to draw attention due to their effectiveness in dealing with ill-conditioned optimization problems. The L-BFGS method is one of the most widely used quasi-Newton methods. We propose an as…

    Submitted 1 November, 2020; originally announced November 2020.

    Comments: Accepted by Parallel Computing Journal

  6. arXiv:1908.00700  [pdf, other]

    cs.LG math.OC stat.ML

    Calibrating the Adaptive Learning Rate to Improve Convergence of ADAM

    Authors: Qianqian Tong, Guannan Liang, Jinbo Bi

    Abstract: Adaptive gradient methods (AGMs) have become popular for optimizing nonconvex problems in deep learning. We revisit AGMs and identify that the adaptive learning rate (A-LR) used by AGMs varies significantly across the dimensions of the problem over epochs (i.e., anisotropic scale), which may lead to issues in convergence and generalization. All existing modified AGMs actually represent eff…

    Submitted 11 September, 2019; v1 submitted 2 August, 2019; originally announced August 2019.

  7. arXiv:1905.05185  [pdf]

    cs.LG math.OC stat.ML

    A Stochastic Gradient Method with Biased Estimation for Faster Nonconvex Optimization

    Authors: Jia Bi, Steve R. Gunn

    Abstract: A number of optimization approaches have been proposed for optimizing nonconvex objectives (e.g. deep learning models), such as batch gradient descent, stochastic gradient descent and stochastic variance reduced gradient descent. Theory shows these optimization methods can converge by using an unbiased gradient estimator. However, in practice biased gradient estimation can allow more efficient con…

    Submitted 13 May, 2019; originally announced May 2019.

    Comments: 6 pages

  8. arXiv:1708.02626  [pdf, other]

    q-bio.QM math.CO q-bio.PE

    A combinatorial method for connecting BHV spaces representing different numbers of taxa

    Authors: Yingying Ren, Sihan Zha, Jingwen Bi, José A. Sanchez, Cara Monical, Michelle Delcourt, Rosemary K. Guzman, Ruth Davidson

    Abstract: The phylogenetic tree space introduced by Billera, Holmes, and Vogtmann (BHV tree space) is a CAT(0) continuous space that represents trees with edge weights with an intrinsic geodesic distance measure. The geodesic distance measure unique to BHV tree space is well known to be computable in polynomial time, which makes it a potentially powerful tool for optimization problems in phylogenetics and p…

    Submitted 3 December, 2017; v1 submitted 8 August, 2017; originally announced August 2017.

    Comments: Updated section on applications and link to github software release

    MSC Class: 46N60; 37F20; 90C57; 97K20; 05C05; 92B10

  9. arXiv:1610.07184  [pdf, other]

    cs.DC math.OC

    Hybrid-DCA: A Double Asynchronous Approach for Stochastic Dual Coordinate Ascent

    Authors: Soumitra Pal, Tingyang Xu, Tianbao Yang, Sanguthevar Rajasekaran, Jinbo Bi

    Abstract: In prior works, stochastic dual coordinate ascent (SDCA) has been parallelized in a multi-core environment where the cores communicate through shared memory, or in a multi-processor distributed memory environment where the processors communicate through message passing. In this paper, we propose a hybrid SDCA framework for multi-core clusters, the most common high performance computing environment…

    Submitted 2 November, 2016; v1 submitted 23 October, 2016; originally announced October 2016.

  10. arXiv:1204.1113  [pdf, ps, other]

    math.NT cs.CC

    Sub-Linear Root Detection, and New Hardness Results, for Sparse Polynomials Over Finite Fields

    Authors: Jingguo Bi, Qi Cheng, J. Maurice Rojas

    Abstract: We present a deterministic 2^{O(t)} q^{(t-2)/(t-1)+o(1)} algorithm to decide whether a univariate polynomial f, with exactly t monomial terms and degree <q, has a root in F_q. A corollary of our method --- the first with complexity sub-linear in q when t is fixed --- is that the nonzero roots in F_q can be partitioned into at most 2 \sqrt{t-1} (q-1)^{(t-2)/(t-1)} cosets of two subgroups S_1,S_2 of F^*_…

    Submitted 11 April, 2012; v1 submitted 4 April, 2012; originally announced April 2012.

    Comments: 15 pages total (cover page, 10 pages, references, and 3 short appendices). This version corrects various minor typos, and improves the statement of the first main theorem