Skip to main content

Showing 1–6 of 6 results for author: Ying, D

Searching in archive math. Search in all archives.
.
  1. arXiv:2405.14741  [pdf, other

    math.OC cs.LG stat.ML

    Bagging Improves Generalization Exponentially

    Authors: Huajie Qian, Donghao Ying, Henry Lam, Wotao Yin

    Abstract: Bagging is a popular ensemble technique to improve the accuracy of machine learning models. It hinges on the well-established rationale that, by repeatedly retraining on resampled data, the aggregated model exhibits lower variance and hence higher stability, especially for discontinuous base learners. In this paper, we provide a new perspective on bagging: By suitably aggregating the base learners… ▽ More

    Submitted 29 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: Correct author list typo

  2. arXiv:2305.17568  [pdf, other

    cs.LG math.OC

    Scalable Primal-Dual Actor-Critic Method for Safe Multi-Agent RL with General Utilities

    Authors: Donghao Ying, Yunkai Zhang, Yuhao Ding, Alec Koppel, Javad Lavaei

    Abstract: We investigate safe multi-agent reinforcement learning, where agents seek to collectively maximize an aggregate sum of local objectives while satisfying their own safety constraints. The objective and constraints are described by {\it general utilities}, i.e., nonlinear functions of the long-term state-action occupancy measure, which encompass broader decision-making goals such as risk, exploratio… ▽ More

    Submitted 27 May, 2023; originally announced May 2023.

    Comments: 50 pages

  3. arXiv:2305.17567  [pdf, other

    cs.GT math.OC

    No-Regret Learning in Dynamic Competition with Reference Effects Under Logit Demand

    Authors: Mengzi Amy Guo, Donghao Ying, Javad Lavaei, Zuo-Jun Max Shen

    Abstract: This work is dedicated to the algorithm design in a competitive framework, with the primary goal of learning a stable equilibrium. We consider the dynamic price competition between two firms operating within an opaque marketplace, where each firm lacks information about its competitor. The demand follows the multinomial logit (MNL) choice model, which depends on the consumers' observed price and t… ▽ More

    Submitted 27 May, 2023; originally announced May 2023.

  4. arXiv:2302.11190  [pdf, other

    math.OC

    A Hitting Time Analysis for Stochastic Time-Varying Functions with Applications to Adversarial Attacks on Computation of Markov Decision Processes

    Authors: Ali Yekkehkhany, Han Feng, Donghao Ying, Javad Lavaei

    Abstract: Stochastic time-varying optimization is an integral part of learning in which the shape of the function changes over time in a non-deterministic manner. This paper considers multiple models of stochastic time variation and analyzes the corresponding notion of hitting time for each model, i.e., the period after which optimizing the stochastic time-varying function reveals informative statistics on… ▽ More

    Submitted 22 February, 2023; originally announced February 2023.

  5. arXiv:2205.10715  [pdf, other

    cs.LG math.OC

    Policy-based Primal-Dual Methods for Concave CMDP with Variance Reduction

    Authors: Donghao Ying, Mengzi Amy Guo, Hyunin Lee, Yuhao Ding, Javad Lavaei, Zuo-Jun Max Shen

    Abstract: We study Concave Constrained Markov Decision Processes (Concave CMDPs) where both the objective and constraints are defined as concave functions of the state-action occupancy measure. We propose the Variance-Reduced Primal-Dual Policy Gradient Algorithm (VR-PDPG), which updates the primal variable via policy gradient ascent and the dual variable via projected sub-gradient descent. Despite the chal… ▽ More

    Submitted 26 May, 2024; v1 submitted 21 May, 2022; originally announced May 2022.

  6. arXiv:2110.08923  [pdf, ps, other

    cs.LG math.OC

    A Dual Approach to Constrained Markov Decision Processes with Entropy Regularization

    Authors: Donghao Ying, Yuhao Ding, Javad Lavaei

    Abstract: We study entropy-regularized constrained Markov decision processes (CMDPs) under the soft-max parameterization, in which an agent aims to maximize the entropy-regularized value function while satisfying constraints on the expected total utility. By leveraging the entropy regularization, our theoretical analysis shows that its Lagrangian dual function is smooth and the Lagrangian duality gap can be… ▽ More

    Submitted 7 April, 2023; v1 submitted 17 October, 2021; originally announced October 2021.

    Comments: 24 pages, AISTATS22