Showing 1–36 of 36 results for author: Yin, M

Searching in archive stat.

  1. arXiv:2404.04057  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Score identity Distillation: Exponentially Fast Distillation of Pretrained Diffusion Models for One-Step Generation

    Authors: Mingyuan Zhou, Huangjie Zheng, Zhendong Wang, Mingzhang Yin, Hai Huang

    Abstract: We introduce Score identity Distillation (SiD), an innovative data-free method that distills the generative capabilities of pretrained diffusion models into a single-step generator. SiD not only facilitates an exponentially fast reduction in Fréchet inception distance (FID) during distillation but also approaches or even exceeds the FID performance of the original teacher diffusion models. By refo…

    Submitted 24 May, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

    Comments: ICML 2024, PyTorch implementation: https://github.com/mingyuanzhou/SiD

  2. arXiv:2310.18919  [pdf, other]

    cs.LG cs.AI stat.ML

    Posterior Sampling with Delayed Feedback for Reinforcement Learning with Linear Function Approximation

    Authors: Nikki Lijing Kuang, Ming Yin, Mengdi Wang, Yu-Xiang Wang, Yi-An Ma

    Abstract: Recent studies in reinforcement learning (RL) have made significant progress by leveraging function approximation to alleviate the sample complexity hurdle for better performance. Despite the success, existing provably efficient algorithms typically rely on the accessibility of immediate feedback upon taking actions. The failure to account for the impact of delay in observations can significantly…

    Submitted 3 November, 2023; v1 submitted 29 October, 2023; originally announced October 2023.

  3. arXiv:2310.12026  [pdf, other]

    stat.ML cs.LG stat.AP

    Nonparametric Discrete Choice Experiments with Machine Learning Guided Adaptive Design

    Authors: Mingzhang Yin, Ruijiang Gao, Weiran Lin, Steven M. Shugan

    Abstract: Designing products to meet consumers' preferences is essential for a business's success. We propose the Gradient-based Survey (GBS), a discrete choice experiment for multiattribute product design. The experiment elicits consumer preferences through a sequence of paired comparisons for partial profiles. GBS adaptively constructs paired comparison questions based on the respondents' previous choices…

    Submitted 18 October, 2023; originally announced October 2023.

  4. arXiv:2310.08824  [pdf, other]

    cs.HC stat.ML

    Confounding-Robust Policy Improvement with Human-AI Teams

    Authors: Ruijiang Gao, Mingzhang Yin

    Abstract: Human-AI collaboration has the potential to transform various domains by leveraging the complementary strengths of human experts and Artificial Intelligence (AI) systems. However, unobserved confounding can undermine the effectiveness of this collaboration, leading to biased and unreliable outcomes. In this paper, we propose a novel solution to address unobserved confounding in human-AI collaborat…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: 24 pages

  5. arXiv:2308.08858  [pdf, ps, other]

    cs.LG cs.AI cs.GT stat.ML

    Improving Sample Efficiency of Model-Free Algorithms for Zero-Sum Markov Games

    Authors: Songtao Feng, Ming Yin, Yu-Xiang Wang, Jing Yang, Yingbin Liang

    Abstract: The problem of two-player zero-sum Markov games has recently attracted increasing interests in theoretical studies of multi-agent reinforcement learning (RL). In particular, for finite-horizon episodic Markov decision processes (MDPs), it has been shown that model-based algorithms can find an $ε$-optimal Nash Equilibrium (NE) with the sample complexity of $O(H^3SAB/ε^2)$, which is optimal in the d…

    Submitted 5 June, 2024; v1 submitted 17 August, 2023; originally announced August 2023.

  6. arXiv:2306.00861  [pdf, ps, other]

    cs.LG stat.ML

    Non-stationary Reinforcement Learning under General Function Approximation

    Authors: Songtao Feng, Ming Yin, Ruiquan Huang, Yu-Xiang Wang, Jing Yang, Yingbin Liang

    Abstract: General function approximation is a powerful tool to handle large state and action spaces in a broad range of reinforcement learning (RL) scenarios. However, theoretical understanding of non-stationary MDPs with general function approximation is still limited. In this paper, we make the first such an attempt. We first propose a new complexity metric called dynamic Bellman Eluder (DBE) dimension fo…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: ICML 2023

  7. arXiv:2303.09842  [pdf, ps, other]

    eess.SY stat.ML

    Error Bounds for Kernel-Based Linear System Identification with Unknown Hyperparameters

    Authors: Mingzhou Yin, Roy S. Smith

    Abstract: The kernel-based method has been successfully applied in linear system identification using stable kernel designs. From a Gaussian process perspective, it automatically provides probabilistic error bounds for the identified models from the posterior covariance, which are useful in robust and stochastic control. However, the error bounds require knowledge of the true hyperparameters in the kernel d…

    Submitted 17 March, 2023; originally announced March 2023.

  8. arXiv:2302.12456  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Logarithmic Switching Cost in Reinforcement Learning beyond Linear MDPs

    Authors: Dan Qiao, Ming Yin, Yu-Xiang Wang

    Abstract: In many real-life reinforcement learning (RL) problems, deploying new policies is costly. In those scenarios, algorithms must solve exploration (which requires adaptivity) while switching the deployed policy sparsely (which limits adaptivity). In this paper, we go beyond the existing state-of-the-art on this problem that focused on linear Markov Decision Processes (MDPs) by considering linear Bell…

    Submitted 24 February, 2023; originally announced February 2023.

    Comments: 25 pages

  9. arXiv:2212.12767  [pdf, other]

    stat.ML cs.LG

    Streaming Traffic Flow Prediction Based on Continuous Reinforcement Learning

    Authors: Yanan Xiao, Minyu Liu, Zichen Zhang, Lu Jiang, Minghao Yin, Jianan Wang

    Abstract: Traffic flow prediction is an important part of smart transportation. The goal is to predict future traffic conditions based on historical data recorded by sensors and the traffic network. As the city continues to build, parts of the transportation network will be added or modified. How to accurately predict expanding and evolving long-term streaming networks is of great significance. To this end,…

    Submitted 24 December, 2022; originally announced December 2022.

  10. arXiv:2210.00750  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient

    Authors: Ming Yin, Mengdi Wang, Yu-Xiang Wang

    Abstract: Offline reinforcement learning, which aims at optimizing sequential decision-making strategies with historical data, has been extensively applied in real-life applications. State-Of-The-Art algorithms usually leverage powerful function approximators (e.g. neural networks) to alleviate the sample complexity hurdle for better empirical performances. Despite the successes, a more systematic understan…

    Submitted 23 November, 2022; v1 submitted 3 October, 2022; originally announced October 2022.

  11. arXiv:2208.06124  [pdf, other]

    cs.LG stat.ML

    Gradient Estimation for Binary Latent Variables via Gradient Variance Clipping

    Authors: Russell Z. Kunes, Mingzhang Yin, Max Land, Doron Haviv, Dana Pe'er, Simon Tavaré

    Abstract: Gradient estimation is often necessary for fitting generative models with discrete latent variables, in contexts such as reinforcement learning and variational autoencoder (VAE) training. The DisARM estimator (Yin et al. 2020; Dong, Mnih, and Tucker 2020) achieves state of the art gradient variance for Bernoulli latent variable models in many contexts. However, DisARM and other estimators have pot…

    Submitted 12 August, 2022; originally announced August 2022.

  12. arXiv:2206.06584  [pdf, other]

    stat.ML cs.LG stat.ME

    Probabilistic Conformal Prediction Using Conditional Random Samples

    Authors: Zhendong Wang, Ruijiang Gao, Mingzhang Yin, Mingyuan Zhou, David M. Blei

    Abstract: This paper proposes probabilistic conformal prediction (PCP), a predictive inference algorithm that estimates a target variable by a discontinuous predictive set. Given inputs, PCP construct the predictive set based on random samples from an estimated generative model. It is efficient and compatible with either explicit or implicit conditional generative models. Theoretically, we show that PCP gua…

    Submitted 20 June, 2022; v1 submitted 13 June, 2022; originally announced June 2022.

  13. arXiv:2206.04921  [pdf, other]

    cs.LG cs.AI stat.ML

    Offline Stochastic Shortest Path: Learning, Evaluation and Towards Optimality

    Authors: Ming Yin, Wenjing Chen, Mengdi Wang, Yu-Xiang Wang

    Abstract: Goal-oriented Reinforcement Learning, where the agent needs to reach the goal state while simultaneously minimizing the cost, has received significant attention in real-world applications. Its theoretical formulation, stochastic shortest path (SSP), has been intensively researched in the online setting. Nevertheless, it remains understudied when such an online interaction is prohibited and only hi…

    Submitted 10 June, 2022; originally announced June 2022.

    Comments: UAI 2022

  14. Infinite-Dimensional Sparse Learning in Linear System Identification

    Authors: Mingzhou Yin, Mehmet Tolga Akan, Andrea Iannelli, Roy S. Smith

    Abstract: Regularized methods have been widely applied to system identification problems without known model structures. This paper proposes an infinite-dimensional sparse learning algorithm based on atomic norm regularization. Atomic norm regularization decomposes the transfer function into first-order atomic models and solves a group lasso problem that selects a sparse set of poles and identifies the corr…

    Submitted 31 August, 2022; v1 submitted 28 March, 2022; originally announced March 2022.

    Comments: Accepted for presentation at IEEE Conference on Decision and Control 2022

    Journal ref: 2022 IEEE 61st Conference on Decision and Control (CDC), Cancun, Mexico, 2022, pp. 850-855

  15. arXiv:2203.05804  [pdf, other]

    cs.LG cs.AI stat.ML

    Near-optimal Offline Reinforcement Learning with Linear Representation: Leveraging Variance Information with Pessimism

    Authors: Ming Yin, Yaqi Duan, Mengdi Wang, Yu-Xiang Wang

    Abstract: Offline reinforcement learning, which seeks to utilize offline/historical data to optimize sequential decision-making strategies, has gained surging prominence in recent studies. Due to the advantage that appropriate function approximators can help mitigate the sample complexity burden in modern reinforcement learning problems, existing endeavors usually enforce powerful function representation mo…

    Submitted 11 March, 2022; originally announced March 2022.

    Comments: ICLR 2022

  16. arXiv:2202.10665  [pdf, ps, other]

    cs.LG stat.ME

    Partial Identification with Noisy Covariates: A Robust Optimization Approach

    Authors: Wenshuo Guo, Mingzhang Yin, Yixin Wang, Michael I. Jordan

    Abstract: Causal inference from observational datasets often relies on measuring and adjusting for covariates. In practice, measurements of the covariates can often be noisy and/or biased, or only measurements of their proxies may be available. Directly adjusting for these imperfect measurements of the covariates can lead to biased causal estimates. Moreover, without additional assumptions, the causal effec…

    Submitted 21 February, 2022; originally announced February 2022.

    Comments: Proceedings of Conference on Causal Learning and Reasoning (CLeaR) 2022

  17. arXiv:2202.06385  [pdf, other]

    cs.LG cs.AI stat.ML

    Sample-Efficient Reinforcement Learning with loglog(T) Switching Cost

    Authors: Dan Qiao, Ming Yin, Ming Min, Yu-Xiang Wang

    Abstract: We study the problem of reinforcement learning (RL) with low (policy) switching cost - a problem well-motivated by real-life RL applications in which deployments of new policies are costly and the number of policy updates must be low. In this paper, we propose a new algorithm based on stage-wise exploration and adaptive policy elimination that achieves a regret of $\widetilde{O}(\sqrt{H^4S^2AT})$…

    Submitted 4 June, 2022; v1 submitted 13 February, 2022; originally announced February 2022.

    Comments: 44 pages, 1 figure

  18. arXiv:2112.03493  [pdf, other]

    stat.ME

    Conformal Sensitivity Analysis for Individual Treatment Effects

    Authors: Mingzhang Yin, Claudia Shi, Yixin Wang, David M. Blei

    Abstract: Estimating an individual treatment effect (ITE) is essential to personalized decision making. However, existing methods for estimating the ITE often rely on unconfoundedness, an assumption that is fundamentally untestable with observed data. To assess the robustness of individual-level causal conclusion with unconfoundedness, this paper proposes a method for sensitivity analysis of the ITE, a way…

    Submitted 12 July, 2022; v1 submitted 6 December, 2021; originally announced December 2021.

    Comments: Journal of the American Statistical Association

  19. arXiv:2110.08695  [pdf, other]

    cs.LG cs.AI stat.ML

    Towards Instance-Optimal Offline Reinforcement Learning with Pessimism

    Authors: Ming Yin, Yu-Xiang Wang

    Abstract: We study the offline reinforcement learning (offline RL) problem, where the goal is to learn a reward-maximizing policy in an unknown Markov Decision Process (MDP) using the data coming from a policy $μ$. In particular, we consider the sample complexity problems of offline RL for finite-horizon MDPs. Prior works study this problem based on different data-coverage assumptions, and their learning gu…

    Submitted 16 October, 2021; originally announced October 2021.

    Comments: NeurIPS, 2021

  20. arXiv:2109.11990  [pdf, other]

    stat.ME cs.LG stat.ML

    Optimization-based Causal Estimation from Heterogenous Environments

    Authors: Mingzhang Yin, Yixin Wang, David M. Blei

    Abstract: This paper presents a new optimization approach to causal estimation. Given data that contains covariates and an outcome, which covariates are causes of the outcome, and what is the strength of the causality? In classical machine learning (ML), the goal of optimization is to maximize predictive accuracy. However, some covariates might exhibit a non-causal association with the outcome. Such spuriou…

    Submitted 10 June, 2024; v1 submitted 24 September, 2021; originally announced September 2021.

    Comments: Journal of Machine Learning Research (JMLR). Code at https://github.com/mingzhang-yin/CoCo

  21. arXiv:2105.06029  [pdf, other]

    cs.LG cs.AI stat.ML

    Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings

    Authors: Ming Yin, Yu-Xiang Wang

    Abstract: This work studies the statistical limits of uniform convergence for offline policy evaluation (OPE) problems with model-based methods (for episodic MDP) and provides a unified framework towards optimal learning for several well-motivated offline tasks. Uniform OPE $\sup_Π|Q^π-\hat{Q}^π|<ε$ is a stronger measure than the point-wise OPE and ensures offline learning when $Π$ contains all policies (th…

    Submitted 24 June, 2021; v1 submitted 12 May, 2021; originally announced May 2021.

  22. arXiv:2102.01748  [pdf, other]

    cs.LG cs.AI stat.ML

    Near-Optimal Offline Reinforcement Learning via Double Variance Reduction

    Authors: Ming Yin, Yu Bai, Yu-Xiang Wang

    Abstract: We consider the problem of offline reinforcement learning (RL) -- a well-motivated setting of RL that aims at policy optimization using only historical data. Despite its wide applicability, theoretical understandings of offline RL, such as its optimal sample complexity, remain largely open even in basic settings such as \emph{tabular} Markov Decision Processes (MDPs). In this paper, we propose O…

    Submitted 2 February, 2021; originally announced February 2021.

  23. arXiv:2009.09230  [pdf, other]

    cs.LG stat.ML

    Simplifying Reinforced Feature Selection via Restructured Choice Strategy of Single Agent

    Authors: Xiaosa Zhao, Kunpeng Liu, Wei Fan, Lu Jiang, Xiaowei Zhao, Minghao Yin, Yanjie Fu

    Abstract: Feature selection aims to select a subset of features to optimize the performances of downstream predictive tasks. Recently, multi-agent reinforced feature selection (MARFS) has been introduced to automate feature selection, by creating agents for each feature to select or deselect corresponding features. Although MARFS enjoys the automation of the selection process, MARFS suffers from not just th…

    Submitted 19 September, 2020; originally announced September 2020.

  24. arXiv:2007.03760  [pdf, other]

    cs.LG cs.AI stat.ML

    Near-Optimal Provable Uniform Convergence in Offline Policy Evaluation for Reinforcement Learning

    Authors: Ming Yin, Yu Bai, Yu-Xiang Wang

    Abstract: The problem of Offline Policy Evaluation (OPE) in Reinforcement Learning (RL) is a critical step towards applying RL in real-life applications. Existing work on OPE mostly focus on evaluating a fixed target policy $π$, which does not provide useful bounds for offline policy learning as $π$ will then be data-dependent. We address this problem by simultaneously evaluating all policies in a policy cl…

    Submitted 1 December, 2020; v1 submitted 7 July, 2020; originally announced July 2020.

    Comments: Short version presented at Offline RL workshop at Neurips, 2020

  25. arXiv:2006.06448  [pdf, other]

    stat.ME

    Probabilistic Best Subset Selection via Gradient-Based Optimization

    Authors: Mingzhang Yin, Nhat Ho, Bowei Yan, Xiaoning Qian, Mingyuan Zhou

    Abstract: In high-dimensional statistics, variable selection recovers the latent sparse patterns from all possible covariate combinations. This paper proposes a novel optimization method to solve the exact L0-regularized regression problem, which is also known as the best subset selection. We reformulate the optimization problem from a discrete space to a continuous one via probabilistic reparameterization.…

    Submitted 31 May, 2022; v1 submitted 11 June, 2020; originally announced June 2020.

  26. arXiv:2005.10477  [pdf, other]

    cs.LG stat.ML

    Pairwise Supervised Hashing with Bernoulli Variational Auto-Encoder and Self-Control Gradient Estimator

    Authors: Siamak Zamani Dadaneh, Shahin Boluki, Mingzhang Yin, Mingyuan Zhou, Xiaoning Qian

    Abstract: Semantic hashing has become a crucial component of fast similarity search in many large-scale information retrieval systems, in particular, for text data. Variational auto-encoders (VAEs) with binary latent variables as hashing codes provide state-of-the-art performance in terms of precision for document retrieval. We propose a pairwise loss function with discrete latent VAE to reward within-class…

    Submitted 21 May, 2020; originally announced May 2020.

    Comments: To appear in UAI 2020

    Journal ref: Uncertainty in Artificial Intelligence Conference (UAI) 2020

  27. arXiv:2005.04366  [pdf, other]

    cs.LG stat.ML

    Compressing Recurrent Neural Networks Using Hierarchical Tucker Tensor Decomposition

    Authors: Miao Yin, Siyu Liao, Xiao-Yang Liu, Xiaodong Wang, Bo Yuan

    Abstract: Recurrent Neural Networks (RNNs) have been widely used in sequence analysis and modeling. However, when processing high-dimensional data, RNNs typically require very large model sizes, thereby bringing a series of deployment challenges. Although the state-of-the-art tensor decomposition approaches can provide good model compression performance, these existing methods are still suffering some inher…

    Submitted 9 May, 2020; originally announced May 2020.

  28. arXiv:2002.03534  [pdf, other]

    stat.ML cs.LG

    Discrete Action On-Policy Learning with Action-Value Critic

    Authors: Yuguang Yue, Yunhao Tang, Mingzhang Yin, Mingyuan Zhou

    Abstract: Reinforcement learning (RL) in discrete action space is ubiquitous in real-world applications, but its complexity grows exponentially with the action-space dimension, making it challenging to apply existing on-policy gradient based deep RL algorithms efficiently. To effectively operate in multidimensional discrete action spaces, we construct a critic to estimate action-value functions, apply it on…

    Submitted 21 February, 2020; v1 submitted 9 February, 2020; originally announced February 2020.

  29. arXiv:2001.10742  [pdf, other]

    cs.LG cs.AI stat.ML

    Asymptotically Efficient Off-Policy Evaluation for Tabular Reinforcement Learning

    Authors: Ming Yin, Yu-Xiang Wang

    Abstract: We consider the problem of off-policy evaluation for reinforcement learning, where the goal is to estimate the expected reward of a target policy $π$ using offline data collected by running a logging policy $μ$. Standard importance-sampling based approaches for this problem suffer from a variance that scales exponentially with time horizon $H$, which motivates a splurge of recent interest in alter…

    Submitted 29 January, 2020; originally announced January 2020.

    Comments: Includes appendix. Accepted for AISTATS 2020

    Journal ref: International Conference on Artificial Intelligence and Statistics, 108 (2020) 3948-3958

  30. arXiv:1912.03820  [pdf, other]

    cs.LG cs.AI stat.ML

    Meta-Learning without Memorization

    Authors: Mingzhang Yin, George Tucker, Mingyuan Zhou, Sergey Levine, Chelsea Finn

    Abstract: The ability to learn new concepts with small amounts of data is a critical aspect of intelligence that has proven challenging for deep learning methods. Meta-learning has emerged as a promising technique for leveraging data from previous tasks to enable efficient learning of new tasks. However, most meta-learning algorithms implicitly require that the meta-training tasks be mutually-exclusive, suc…

    Submitted 27 April, 2020; v1 submitted 8 December, 2019; originally announced December 2019.

    Comments: ICLR 2020

  31. arXiv:1905.12659  [pdf, other]

    stat.ML cs.LG

    Semi-Implicit Generative Model

    Authors: Mingzhang Yin, Mingyuan Zhou

    Abstract: To combine explicit and implicit generative models, we introduce semi-implicit generator (SIG) as a flexible hierarchical model that can be trained in the maximum likelihood framework. Both theoretically and experimentally, we demonstrate that SIG can generate high quality samples especially when dealing with multi-modality. By introducing SIG as an unbiased regularizer to the generative adversari…

    Submitted 28 July, 2019; v1 submitted 29 May, 2019; originally announced May 2019.

    Comments: Third workshop on Bayesian Deep Learning (NeurIPS 2018), Montreal, Canada

  32. arXiv:1905.01413  [pdf, other]

    stat.ML cs.LG stat.CO stat.ME

    ARSM: Augment-REINFORCE-Swap-Merge Estimator for Gradient Backpropagation Through Categorical Variables

    Authors: Mingzhang Yin, Yuguang Yue, Mingyuan Zhou

    Abstract: To address the challenge of backpropagating the gradient through categorical variables, we propose the augment-REINFORCE-swap-merge (ARSM) gradient estimator that is unbiased and has low variance. ARSM first uses variable augmentation, REINFORCE, and Rao-Blackwellization to re-express the gradient as an expectation under the Dirichlet distribution, then uses variable swapping to construct differen…

    Submitted 21 December, 2019; v1 submitted 3 May, 2019; originally announced May 2019.

    Comments: Published in ICML 2019. We have updated Section 4.2 and the Appendix to reflect the improvements brought by fixing some bugs hidden in our original code. Please find the Errata in the authors' websites and check the updated code in Github

  33. arXiv:1904.07568  [pdf, other]

    cs.LG hep-th stat.ML

    On the Mathematical Understanding of ResNet with Feynman Path Integral

    Authors: Minghao Yin, Xiu Li, Yongbing Zhang, Shiqi Wang

    Abstract: In this paper, we aim to understand Residual Network (ResNet) in a scientifically sound way by providing a bridge between ResNet and Feynman path integral. In particular, we prove that the effect of residual block is equivalent to partial differential equation, and the ResNet transforming process can be equivalently converted to Feynman path integral. These conclusions greatly help us mathematical…

    Submitted 16 April, 2019; originally announced April 2019.

  34. arXiv:1903.05284  [pdf, other]

    cs.LG cs.AI stat.ML

    Augment-Reinforce-Merge Policy Gradient for Binary Stochastic Policy

    Authors: Yunhao Tang, Mingzhang Yin, Mingyuan Zhou

    Abstract: Due to the high variance of policy gradients, on-policy optimization algorithms are plagued with low sample efficiency. In this work, we propose Augment-Reinforce-Merge (ARM) policy gradient estimator as an unbiased low-variance alternative to previous baseline estimators on tasks with binary action space, inspired by the recent ARM gradient estimator for discrete random variable models. We show t…

    Submitted 12 March, 2019; originally announced March 2019.

  35. arXiv:1807.11143  [pdf, other]

    stat.ML cs.LG stat.CO stat.ME

    ARM: Augment-REINFORCE-Merge Gradient for Stochastic Binary Networks

    Authors: Mingzhang Yin, Mingyuan Zhou

    Abstract: To backpropagate the gradients through stochastic binary layers, we propose the augment-REINFORCE-merge (ARM) estimator that is unbiased, exhibits low variance, and has low computational complexity. Exploiting variable augmentation, REINFORCE, and reparameterization, the ARM estimator achieves adaptive variance reduction for Monte Carlo integration by merging two expectations via common random num…

    Submitted 9 September, 2019; v1 submitted 29 July, 2018; originally announced July 2018.

    Comments: ICLR 2019

  36. arXiv:1805.11183  [pdf, other]

    stat.ML cs.LG stat.CO stat.ME

    Semi-Implicit Variational Inference

    Authors: Mingzhang Yin, Mingyuan Zhou

    Abstract: Semi-implicit variational inference (SIVI) is introduced to expand the commonly used analytic variational distribution family, by mixing the variational parameter with a flexible distribution. This mixing distribution can assume any density function, explicit or not, as long as independent random samples can be generated via reparameterization. Not only does SIVI expand the variational family to i…

    Submitted 28 May, 2018; originally announced May 2018.

    Comments: ICML 2018