Showing 1–8 of 8 results for author: Shang, X

  1. arXiv:2309.08709  [pdf, other]

    stat.ML cs.LG

    Price of Safety in Linear Best Arm Identification

    Authors: Xuedong Shang, Igor Colin, Merwan Barlier, Hamza Cherkaoui

    Abstract: We introduce the safe best-arm identification framework with linear feedback, where the agent is subject to some stage-wise safety constraint that linearly depends on an unknown parameter vector. The agent must take actions in a conservative way so as to ensure that the safety constraint is not violated with high probability at each round. Ways of leveraging the linear structure for ensuring safet…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: 20 pages, 1 figure

  2. Monte-Carlo Sampling Approach to Model Selection: A Primer

    Authors: Petre Stoica, Xiaolei Shang, Yuanbo Cheng

    Abstract: Any data modeling exercise has two main components: parameter estimation and model selection. The latter will be the topic of this lecture note. More concretely we will introduce several Monte-Carlo sampling-based rules for model selection using the maximum a posteriori (MAP) approach. Model selection problems are omnipresent in signal processing applications: examples include selecting the order…

    Submitted 27 September, 2022; originally announced September 2022.

    Journal ref: IEEE Signal Processing Magazine, Vol. 39, no. 5, pp. 85--2, 2022

  3. arXiv:2103.01312  [pdf, other]

    stat.ML cs.LG

    UCB Momentum Q-learning: Correcting the bias without forgetting

    Authors: Pierre Menard, Omar Darwiche Domingues, Xuedong Shang, Michal Valko

    Abstract: We propose UCBMQ, Upper Confidence Bound Momentum Q-learning, a new algorithm for reinforcement learning in tabular and possibly stage-dependent, episodic Markov decision process. UCBMQ is based on Q-learning where we add a momentum term and rely on the principle of optimism in face of uncertainty to deal with exploration. Our new technical ingredient of UCBMQ is the use of momentum to correct the…

    Submitted 18 March, 2022; v1 submitted 1 March, 2021; originally announced March 2021.

  4. arXiv:2010.08061  [pdf, ps, other]

    cs.LG stat.ML

    Stochastic Bandits with Vector Losses: Minimizing $\ell^\infty$-Norm of Relative Losses

    Authors: Xuedong Shang, Han Shao, Jian Qian

    Abstract: Multi-armed bandits are widely applied in scenarios like recommender systems, for which the goal is to maximize the click rate. However, more factors should be considered, e.g., user stickiness, user growth rate, user experience assessment, etc. In this paper, we model this situation as a problem of $K$-armed bandit with multiple losses. We define relative loss vector of an arm where the $i$-th en…

    Submitted 15 October, 2020; originally announced October 2020.

    Comments: 14 pages

  5. arXiv:2007.00953  [pdf, other]

    stat.ML cs.LG

    Gamification of Pure Exploration for Linear Bandits

    Authors: Rémy Degenne, Pierre Ménard, Xuedong Shang, Michal Valko

    Abstract: We investigate an active pure-exploration setting, that includes best-arm identification, in the context of linear stochastic bandits. While asymptotically optimal algorithms exist for standard multi-arm bandits, the existence of such algorithms for the best-arm identification in linear bandits has been elusive despite several attempts to address it. First, we provide a thorough comparison and new…

    Submitted 2 July, 2020; originally announced July 2020.

    Comments: 11+25 pages. To be published in the proceedings of ICML 2020

  6. arXiv:1910.10945  [pdf, other]

    cs.LG stat.ML

    Fixed-Confidence Guarantees for Bayesian Best-Arm Identification

    Authors: Xuedong Shang, Rianne de Heide, Emilie Kaufmann, Pierre Ménard, Michal Valko

    Abstract: We investigate and provide new insights on the sampling rule called Top-Two Thompson Sampling (TTTS). In particular, we justify its use for fixed-confidence best-arm identification. We further propose a variant of TTTS called Top-Two Transportation Cost (T3C), which disposes of the computational burden of TTTS. As our main contribution, we provide the first sample complexity analysis of TTTS and T…

    Submitted 28 October, 2019; v1 submitted 24 October, 2019; originally announced October 2019.

  7. arXiv:1809.02394  [pdf, other]

    cs.LG cs.AI stat.ML

    Deep Feature Learning of Multi-Network Topology for Node Classification

    Authors: Hansheng Xue, Jiajie Peng, Xuequn Shang

    Abstract: Networks are ubiquitous structures that describe complex relationships between different entities in the real world. As a critical component of prediction tasks over nodes in networks, learning the feature representation of nodes has become one of the most active areas recently. Network Embedding, aiming to learn non-linear and low-dimensional feature representation based on network topology, has b…

    Submitted 7 September, 2018; originally announced September 2018.

  8. arXiv:1510.08692  [pdf, ps, other]

    stat.ML cs.LG

    Covariance-Controlled Adaptive Langevin Thermostat for Large-Scale Bayesian Sampling

    Authors: Xiaocheng Shang, Zhanxing Zhu, Benedict Leimkuhler, Amos J. Storkey

    Abstract: Monte Carlo sampling for Bayesian posterior inference is a common approach used in machine learning. The Markov Chain Monte Carlo procedures that are used are often discrete-time analogues of associated stochastic differential equations (SDEs). These SDEs are guaranteed to leave invariant the required posterior distribution. An area of current research addresses the computational benefits of stoch…

    Submitted 12 February, 2020; v1 submitted 29 October, 2015; originally announced October 2015.

    Journal ref: Advances in Neural Information Processing Systems, 28, 37-45, (2015)