Showing 1–50 of 67 results for author: Dai, B

  1. arXiv:2405.19320  [pdf, other]

    cs.LG cs.AI stat.ML

    Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF

    Authors: Shicong Cen, **cheng Mei, Katayoon Goshvadi, Hanjun Dai, Tong Yang, Sherry Yang, Dale Schuurmans, Yuejie Chi, Bo Dai

    Abstract: Reinforcement learning from human feedback (RLHF) has demonstrated great promise in aligning large language models (LLMs) with human preference. Depending on the availability of preference data, both online and offline RLHF are active areas of investigation. A key bottleneck is understanding how to incorporate uncertainty estimation in the reward function learned from the preference data for RLHF,…

    Submitted 4 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

  2. arXiv:2311.12244  [pdf, other]

    cs.LG cs.AI stat.ML

    Provable Representation with Efficient Planning for Partial Observable Reinforcement Learning

    Authors: Hongming Zhang, Tongzheng Ren, Chenjun Xiao, Dale Schuurmans, Bo Dai

    Abstract: In most real-world reinforcement learning applications, state information is only partially observable, which breaks the Markov decision process assumption and leads to inferior performance for algorithms that conflate observations with state. Partially Observable Markov Decision Processes (POMDPs), on the other hand, provide a general framework that allows for partial observability to be accounte…

    Submitted 10 June, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

    Comments: The first two authors contribute equally

  3. arXiv:2212.08765  [pdf, other]

    cs.LG stat.ML

    Latent Variable Representation for Reinforcement Learning

    Authors: Tongzheng Ren, Chenjun Xiao, Tianjun Zhang, Na Li, Zhaoran Wang, Sujay Sanghavi, Dale Schuurmans, Bo Dai

    Abstract: Deep latent variable models have achieved significant empirical successes in model-based reinforcement learning (RL) due to their expressiveness in modeling complex transition dynamics. On the other hand, it remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of RL. In this paper, we provide a…

    Submitted 7 March, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

    Comments: ICLR 2023. The first two authors contribute equally. Project Website: https://rlrep.github.io/lvrep/

  4. arXiv:2211.10061  [pdf, other]

    stat.ML cs.AI cs.LG stat.AP stat.ME

    Data-Adaptive Discriminative Feature Localization with Statistically Guaranteed Interpretation

    Authors: Ben Dai, Xiaotong Shen, Lin Yee Chen, Chunlin Li, Wei Pan

    Abstract: In explainable artificial intelligence, discriminative feature localization is critical to reveal a blackbox model's decision-making process from raw data to prediction. In this article, we use two real datasets, the MNIST handwritten digits and MIT-BIH Electrocardiogram (ECG) signals, to motivate key characteristics of discriminative features, namely adaptiveness, predictive importance and effect…

    Submitted 18 November, 2022; originally announced November 2022.

    Comments: 27 pages, 11 figures

    Journal ref: The Annals of Applied Statistics, 2022

  5. arXiv:2211.07767  [pdf, other]

    stat.ML cs.LG math.OC

    Learning to Optimize with Stochastic Dominance Constraints

    Authors: Hanjun Dai, Yuan Xue, Niao He, Bethany Wang, Na Li, Dale Schuurmans, Bo Dai

    Abstract: In real-world decision-making, uncertainty is important yet difficult to handle. Stochastic dominance provides a theoretically sound approach for comparing uncertain quantities, but optimization with stochastic dominance constraints is often computationally expensive, which limits practical applicability. In this paper, we develop a simple yet efficient approach for the problem, the Light Stochast…

    Submitted 24 February, 2023; v1 submitted 14 November, 2022; originally announced November 2022.

    Comments: Accepted to the 26th International Conference on Artificial Intelligence and Statistics (AISTATS 2023)

  6. arXiv:2209.08889  [pdf, other]

    stat.ME stat.AP stat.CO

    Inference of nonlinear causal effects with GWAS summary data

    Authors: Ben Dai, Chunlin Li, Haoran Xue, Wei Pan, Xiaotong Shen

    Abstract: Large-scale genome-wide association studies (GWAS) have offered an exciting opportunity to discover putative causal genes or risk factors associated with diseases by using SNPs as instrumental variables (IVs). However, conventional approaches assume linear causal relations partly for simplicity and partly for the availability of GWAS summary data. In this work, we propose a novel model {for transc…

    Submitted 26 October, 2023; v1 submitted 19 September, 2022; originally announced September 2022.

    Comments: 33 pages, 11 figures

  7. arXiv:2208.09515  [pdf, other]

    cs.LG stat.ML

    Spectral Decomposition Representation for Reinforcement Learning

    Authors: Tongzheng Ren, Tianjun Zhang, Lisa Lee, Joseph E. Gonzalez, Dale Schuurmans, Bo Dai

    Abstract: Representation learning often plays a critical role in reinforcement learning by managing the curse of dimensionality. A representative class of algorithms exploits a spectral decomposition of the stochastic transition dynamics to construct representations that enjoy strong theoretical properties in an idealized setting. However, current spectral methods suffer from limited applicability because t…

    Submitted 7 March, 2023; v1 submitted 19 August, 2022; originally announced August 2022.

    Comments: ICLR 2023. The first two authors contribute equally

  8. arXiv:2207.07150  [pdf, other]

    cs.LG stat.ML

    Making Linear MDPs Practical via Contrastive Representation Learning

    Authors: Tianjun Zhang, Tongzheng Ren, Mengjiao Yang, Joseph E. Gonzalez, Dale Schuurmans, Bo Dai

    Abstract: It is common to address the curse of dimensionality in Markov decision processes (MDPs) by exploiting low-rank representations. This motivates much of the recent theoretical study on linear MDPs. However, most approaches require a given representation under unrealistic assumptions about the normalization of the decomposition or introduce unresolved computational challenges in practice. Instead, we…

    Submitted 7 December, 2022; v1 submitted 14 July, 2022; originally announced July 2022.

    Comments: ICML 2022. The first two authors contribute equally

  9. arXiv:2206.13086  [pdf, other]

    stat.ML cs.CV cs.LG math.ST

    RankSEG: A Consistent Ranking-based Framework for Segmentation

    Authors: Ben Dai, Chunlin Li

    Abstract: Segmentation has emerged as a fundamental field of computer vision and natural language processing, which assigns a label to every pixel/feature to extract regions of interest from an image/text. To evaluate the performance of segmentation, the Dice and IoU metrics are used to measure the degree of overlap between the ground truth and the predicted segmentation. In this paper, we establish a theor…

    Submitted 13 November, 2023; v1 submitted 27 June, 2022; originally announced June 2022.

    Comments: 50 pages

    MSC Class: 62C05; 62C12 ACM Class: G.3; I.4.6

    Journal ref: Journal of Machine Learning Research, 24(224), 1-50 (2023)

  10. arXiv:2205.14240  [pdf, other]

    stat.ML cond-mat.stat-mech cs.LG physics.data-an stat.CO

    Deterministic Langevin Monte Carlo with Normalizing Flows for Bayesian Inference

    Authors: Richard D. P. Grumitt, Biwei Dai, Uros Seljak

    Abstract: We propose a general purpose Bayesian inference algorithm for expensive likelihoods, replacing the stochastic term in the Langevin equation with a deterministic density gradient term. The particle density is evaluated from the current particle positions using a Normalizing Flow (NF), which is differentiable and has good generalization properties in high dimensions. We take advantage of NF precondi…

    Submitted 13 October, 2022; v1 submitted 27 May, 2022; originally announced May 2022.

    Comments: 17 pages, 9 figures, Accepted at NeurIPS 2022

    MSC Class: 62-08

  11. arXiv:2112.12320  [pdf, other]

    cs.LG stat.ML

    Model Selection in Batch Policy Optimization

    Authors: Jonathan N. Lee, George Tucker, Ofir Nachum, Bo Dai

    Abstract: We study the problem of model selection in batch policy optimization: given a fixed, partial-feedback dataset and $M$ model classes, learn a policy with performance that is competitive with the policy derived from the best model class. We formalize the problem in the contextual bandit setting with linear model classes by identifying three sources of error that any model selection algorithm should…

    Submitted 22 December, 2021; originally announced December 2021.

  12. arXiv:2112.00874  [pdf, other]

    cs.LG stat.ML

    Neural Stochastic Dual Dynamic Programming

    Authors: Hanjun Dai, Yuan Xue, Zia Syed, Dale Schuurmans, Bo Dai

    Abstract: Stochastic dual dynamic programming (SDDP) is a state-of-the-art method for solving multi-stage stochastic optimization, widely used for modeling real-world process optimization tasks. Unfortunately, SDDP has a worst-case complexity that scales exponentially in the number of decision variables, which severely limits applicability to only low dimensional problems. To overcome this limitation, we ex…

    Submitted 1 December, 2021; originally announced December 2021.

    Comments: 24 pages

  13. arXiv:2111.11485  [pdf, other]

    stat.ML cs.AI cs.LG

    A Free Lunch from the Noise: Provable and Practical Exploration for Representation Learning

    Authors: Tongzheng Ren, Tianjun Zhang, Csaba Szepesvári, Bo Dai

    Abstract: Representation learning lies at the heart of the empirical success of deep learning for dealing with the curse of dimensionality. However, the power of representation learning has not been fully exploited yet in reinforcement learning (RL), due to i), the trade-off between expressiveness and tractability; and ii), the coupling between exploration and representation learning. In this paper, we firs…

    Submitted 7 March, 2023; v1 submitted 22 November, 2021; originally announced November 2021.

    Comments: UAI 2022. The first two authors contribute equally

  14. arXiv:2111.09423  [pdf, ps, other]

    stat.ME

    Return-to-baseline multiple imputation for missing values in clinical trials

    Authors: Yongming Qu, Biyue Dai

    Abstract: Return-to-baseline is an important method to impute missing values or unobserved potential outcomes when certain hypothetical strategies are used to handle intercurrent events in clinical trials. Current return-to-baseline approaches seen in literature and in practice inflate the variability of the "complete" dataset after imputation and lead to biased mean estimators {when the probability of miss…

    Submitted 17 November, 2021; originally announced November 2021.

    Comments: 25 pages

  15. arXiv:2110.06116  [pdf, other]

    cs.IR cs.LG math.ST stat.ML

    Two-level monotonic multistage recommender systems

    Authors: Ben Dai, Xiaotong Shen, Wei Pan

    Abstract: A recommender system learns to predict the user-specific preference or intention over many items simultaneously for all users, making personalized recommendations based on a relatively small number of observations. One central issue is how to leverage three-way interactions, referred to as user-item-stage dependencies on a monotonic chain of events, to enhance the prediction accuracy. A monotonic…

    Submitted 6 October, 2021; originally announced October 2021.

    Journal ref: 2021

  16. arXiv:2103.14077  [pdf, ps, other]

    stat.ML cs.LG

    Nearly Horizon-Free Offline Reinforcement Learning

    Authors: Tongzheng Ren, Jialian Li, Bo Dai, Simon S. Du, Sujay Sanghavi

    Abstract: We revisit offline reinforcement learning on episodic time-homogeneous Markov Decision Processes (MDP). For tabular MDP with $S$ states and $A$ actions, or linear MDP with anchor points and feature dimension $d$, given the collected $K$ episodes data with minimum visiting probability of (anchor) state-action pairs $d_m$, we obtain nearly horizon $H$-free sample complexity bounds for offline reinfo…

    Submitted 9 February, 2022; v1 submitted 25 March, 2021; originally announced March 2021.

    Comments: NeurIPS 2021

  17. arXiv:2103.04985  [pdf, other]

    stat.ML cs.LG stat.ME

    Significance tests of feature relevance for a black-box learner

    Authors: Ben Dai, Xiaotong Shen, Wei Pan

    Abstract: An exciting recent development is the uptake of deep neural networks in many scientific fields, where the main objective is outcome prediction with the black-box nature. Significance testing is promising to address the black-box issue and explore novel scientific insights and interpretation of the decision-making process based on a deep learning model. However, testing for a neural network poses a…

    Submitted 21 June, 2022; v1 submitted 1 March, 2021; originally announced March 2021.

    Comments: Accepted for publication in IEEE Transactions on Neural Networks and Learning Systems

    Journal ref: IEEE Transactions on Neural Networks and Learning Systems, 2022

  18. arXiv:2010.13064  [pdf, other]

    stat.ML cs.LG

    Further Analysis of Outlier Detection with Deep Generative Models

    Authors: Ziyu Wang, Bin Dai, David Wipf, Jun Zhu

    Abstract: The recent, counter-intuitive discovery that deep generative models (DGMs) can frequently assign a higher likelihood to outliers has implications for both outlier detection applications as well as our overall understanding of generative modeling. In this work, we present a possible explanation for this phenomenon, starting from the observation that a model's typical set and high-density region may…

    Submitted 25 October, 2020; originally announced October 2020.

    Comments: NeurIPS 2020

  19. arXiv:2010.11652  [pdf, other]

    cs.LG stat.ML

    CoinDICE: Off-Policy Confidence Interval Estimation

    Authors: Bo Dai, Ofir Nachum, Yinlam Chow, Lihong Li, Csaba Szepesvári, Dale Schuurmans

    Abstract: We study high-confidence behavior-agnostic off-policy evaluation in reinforcement learning, where the goal is to estimate a confidence interval on a target policy's value, given only access to a static experience dataset collected by unknown behavior policies. Starting from a function space embedding of the linear program formulation of the $Q$-function, we obtain an optimization problem with gene…

    Submitted 22 October, 2020; originally announced October 2020.

    Comments: To appear at NeurIPS 2020 as spotlight

  20. arXiv:2008.05808  [pdf, other]

    cs.LG stat.ML

    Small Towers Make Big Differences

    Authors: Yuyan Wang, Zhe Zhao, Bo Dai, Christopher Fifty, Dong Lin, Lichan Hong, Ed H. Chi

    Abstract: Multi-task learning aims at solving multiple machine learning tasks at the same time. A good solution to a multi-task learning problem should be generalizable in addition to being Pareto optimal. In this paper, we provide some insights on understanding the trade-off between Pareto efficiency and generalization as a result of parameterization in multi-task deep learning models. As a multi-objective…

    Submitted 13 August, 2020; originally announced August 2020.

  21. arXiv:2007.03438  [pdf, other]

    cs.LG math.OC stat.ML

    Off-Policy Evaluation via the Regularized Lagrangian

    Authors: Mengjiao Yang, Ofir Nachum, Bo Dai, Lihong Li, Dale Schuurmans

    Abstract: The recently proposed distribution correction estimation (DICE) family of estimators has advanced the state of the art in off-policy evaluation from behavior-agnostic data. While these estimators all perform some form of stationary distribution correction, they arise from different derivations and objective functions. In this paper, we unify these estimators as regularized Lagrangians of the same…

    Submitted 24 July, 2020; v1 submitted 7 July, 2020; originally announced July 2020.

  22. arXiv:2007.01290  [pdf, other]

    stat.ML cs.LG

    Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach

    Authors: Luofeng Liao, You-Lin Chen, Zhuoran Yang, Bo Dai, Zhaoran Wang, Mladen Kolar

    Abstract: Structural equation models (SEMs) are widely used in sciences, ranging from economics to psychology, to uncover causal relationships underlying a complex system under consideration and estimate structural parameters of interest. We study estimation in a class of generalized SEMs where the object of interest is defined as the solution to a linear operator equation. We formulate the linear operator…

    Submitted 20 October, 2020; v1 submitted 2 July, 2020; originally announced July 2020.

    Comments: - v1: Submitted to NeurIPS 2020. Under review - v2: Revised after NeurIPS reviews. Major updates: (i) clean presentation of consistency results; (ii) more references for conditional moment problems - v3: Add references

  23. arXiv:2007.00674  [pdf, other]

    cs.LG stat.ML

    Sliced Iterative Normalizing Flows

    Authors: Biwei Dai, Uros Seljak

    Abstract: We develop an iterative (greedy) deep learning (DL) algorithm which is able to transform an arbitrary probability distribution function (PDF) into the target PDF. The model is based on iterative Optimal Transport of a series of 1D slices, matching on each slice the marginal PDF to the target. The axes of the orthogonal slices are chosen to maximize the PDF difference using Wasserstein distance at…

    Submitted 14 June, 2021; v1 submitted 1 July, 2020; originally announced July 2020.

    Comments: 19 pages, 12 figures, 7 tables. Code available at https://github.com/biweidai/SINF

  24. arXiv:2006.15502  [pdf, other]

    cs.LG stat.ML

    Scalable Deep Generative Modeling for Sparse Graphs

    Authors: Hanjun Dai, Azade Nazi, Yujia Li, Bo Dai, Dale Schuurmans

    Abstract: Learning graph generative models is a challenging task for deep learning and has wide applicability to a range of domains like chemistry, biology and social science. However current deep neural methods suffer from limited scalability: for a graph with $n$ nodes and $m$ edges, existing deep neural methods require $Ω(n^2)$ complexity by building up the adjacency matrix. On the other hand, many real…

    Submitted 28 June, 2020; originally announced June 2020.

    Comments: ICML 2020

  25. arXiv:2006.06600  [pdf, other]

    cs.LG cs.AI stat.ML

    Zeroth-Order Supervised Policy Improvement

    Authors: Hao Sun, Zi** Xu, Yuhang Song, Meng Fang, Jiechao Xiong, Bo Dai, Bolei Zhou

    Abstract: Policy gradient (PG) algorithms have been widely used in reinforcement learning (RL). However, PG algorithms rely on exploiting the value function being learned with the first-order update locally, which results in limited sample efficiency. In this work, we propose an alternative method called Zeroth-Order Supervised Policy Improvement (ZOSPI). ZOSPI exploits the estimated value function $Q$ glob…

    Submitted 5 July, 2021; v1 submitted 11 June, 2020; originally announced June 2020.

  26. arXiv:2005.10696  [pdf, other]

    cs.LG cs.AI stat.ML

    Novel Policy Seeking with Constrained Optimization

    Authors: Hao Sun, Zhenghao Peng, Bo Dai, Jian Guo, Dahua Lin, Bolei Zhou

    Abstract: In problem-solving, we humans can come up with multiple novel solutions to the same problem. However, reinforcement learning algorithms can only produce a set of monotonous policies that maximize the cumulative reward but lack diversity and novelty. In this work, we address the problem of generating novel policies in reinforcement learning tasks. Instead of following the multi-objective framework…

    Submitted 29 October, 2022; v1 submitted 21 May, 2020; originally announced May 2020.

  27. arXiv:2004.12909  [pdf, other]

    cs.LG stat.ML

    Evolutionary Stochastic Policy Distillation

    Authors: Hao Sun, Xinyu Pan, Bo Dai, Dahua Lin, Bolei Zhou

    Abstract: Solving the Goal-Conditioned Reward Sparse (GCRS) task is a challenging reinforcement learning problem due to the sparsity of reward signals. In this work, we propose a new formulation of GCRS tasks from the perspective of the drifted random walk on the state space, and design a novel method called Evolutionary Stochastic Policy Distillation (ESPD) to solve them based on the insight of reducing th…

    Submitted 30 April, 2020; v1 submitted 27 April, 2020; originally announced April 2020.

  28. arXiv:2004.00530  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Learning Sparse Rewarded Tasks from Sub-Optimal Demonstrations

    Authors: Zhuangdi Zhu, Kaixiang Lin, Bo Dai, Jiayu Zhou

    Abstract: Model-free deep reinforcement learning (RL) has demonstrated its superiority on many complex sequential decision-making problems. However, heavy dependence on dense rewards and high sample-complexity impedes the wide adoption of these methods in real-world scenarios. On the other hand, imitation learning (IL) learns effectively in sparse-rewarded tasks by leveraging the existing expert demonstrati…

    Submitted 1 April, 2020; originally announced April 2020.

  29. arXiv:2003.07521  [pdf, other]

    cs.LG stat.ML

    Energy-Based Processes for Exchangeable Data

    Authors: Mengjiao Yang, Bo Dai, Hanjun Dai, Dale Schuurmans

    Abstract: Recently there has been growing interest in modeling sets with exchangeability such as point clouds. A shortcoming of current approaches is that they restrict the cardinality of the sets considered or can only express limited forms of distribution over unobserved data. To overcome these limitations, we introduce Energy-Based Processes (EBPs), which extend energy based models to exchangeable data w…

    Submitted 8 July, 2020; v1 submitted 17 March, 2020; originally announced March 2020.

    Journal ref: PMLR 119:2302-2312, 2020

  30. arXiv:2003.00722  [pdf, other]

    cs.LG cs.AI stat.ML

    Batch Stationary Distribution Estimation

    Authors: Junfeng Wen, Bo Dai, Lihong Li, Dale Schuurmans

    Abstract: We consider the problem of approximating the stationary distribution of an ergodic Markov chain given a set of sampled transitions. Classical simulation-based approaches assume access to the underlying process so that trajectories of sufficient length can be gathered to approximate stationary sampling. Instead, we consider an alternative setting where a fixed set of transitions has been collected…

    Submitted 2 March, 2020; originally announced March 2020.

  31. arXiv:2002.09072  [pdf, other]

    stat.ML cs.LG

    GenDICE: Generalized Offline Estimation of Stationary Values

    Authors: Ruiyi Zhang, Bo Dai, Lihong Li, Dale Schuurmans

    Abstract: An important problem that arises in reinforcement learning and Monte Carlo methods is estimating quantities defined by the stationary distribution of a Markov chain. In many real-world applications, access to the underlying transition operator is limited to a fixed set of data that has already been collected, without additional interaction with the environment being available. We show that consist…

    Submitted 20 February, 2020; originally announced February 2020.

    Comments: ICLR 2020

  32. arXiv:2002.06504  [pdf, other]

    cs.LG stat.ML

    Differentiable Top-k Operator with Optimal Transport

    Authors: Yujia Xie, Hanjun Dai, Minshuo Chen, Bo Dai, Tuo Zhao, Hongyuan Zha, Wei Wei, Tomas Pfister

    Abstract: The top-k operation, i.e., finding the k largest or smallest elements from a collection of scores, is an important model component, which is widely used in information retrieval, machine learning, and data mining. However, if the top-k operation is implemented in an algorithmic way, e.g., using bubble algorithm, the resulting model cannot be trained in an end-to-end way using prevalent gradient de…

    Submitted 18 February, 2020; v1 submitted 15 February, 2020; originally announced February 2020.

  33. arXiv:2002.05512  [pdf, other]

    cs.LG cs.CV eess.IV stat.ML

    Real or Not Real, that is the Question

    Authors: Yuanbo Xiangli, Yubin Deng, Bo Dai, Chen Change Loy, Dahua Lin

    Abstract: While generative adversarial networks (GAN) have been widely adopted in various topics, in this paper we generalize the standard GAN to a new perspective by treating realness as a random variable that can be estimated from multiple angles. In this generalized framework, referred to as RealnessGAN, the discriminator outputs a distribution as the measure of realness. While RealnessGAN shares similar…

    Submitted 12 February, 2020; originally announced February 2020.

    Comments: ICLR2020 spotlight. 1) train GAN by maximizing kl-divergence. 2) train non-progressive GAN (DCGAN) architecture at 1024*1024 resolution

  34. arXiv:2001.01866  [pdf, other]

    cs.LG stat.ML

    Reinforcement Learning via Fenchel-Rockafellar Duality

    Authors: Ofir Nachum, Bo Dai

    Abstract: We review basic concepts of convex duality, focusing on the very general and supremely useful Fenchel-Rockafellar duality. We summarize how this duality may be applied to a variety of reinforcement learning (RL) settings, including policy evaluation or optimization, online or offline learning, and discounted or undiscounted rewards. The derivations yield a number of intriguing results, including t…

    Submitted 9 January, 2020; v1 submitted 6 January, 2020; originally announced January 2020.

  35. arXiv:2001.01408  [pdf, other]

    cs.LG stat.ML

    Retrosynthesis Prediction with Conditional Graph Logic Network

    Authors: Hanjun Dai, Chengtao Li, Connor W. Coley, Bo Dai, Le Song

    Abstract: Retrosynthesis is one of the fundamental problems in organic chemistry. The task is to identify reactants that can be used to synthesize a specified product molecule. Recently, computer-aided retrosynthesis is finding renewed interest from both chemistry and computer science communities. Most existing approaches rely on template-based models that define subgraph matching rules, but whether or not…

    Submitted 6 January, 2020; originally announced January 2020.

    Comments: NeurIPS 2019

  36. arXiv:1912.10702  [pdf, other]

    cs.LG cs.CV stat.ML

    The Usual Suspects? Reassessing Blame for VAE Posterior Collapse

    Authors: Bin Dai, Ziyu Wang, David Wipf

    Abstract: In narrow asymptotic settings Gaussian VAE models of continuous data have been shown to possess global optima aligned with ground-truth distributions. Even so, it is well known that poor solutions whereby the latent posterior collapses to an uninformative prior are sometimes obtained in practice. However, contrary to conventional wisdom that largely assigns blame for this phenomena on the undue in…

    Submitted 23 December, 2019; originally announced December 2019.

  37. arXiv:1912.01238  [pdf, other]

    cs.LG stat.ML

    Overcoming Catastrophic Forgetting by Generative Regularization

    Authors: Patrick H. Chen, Wei Wei, Cho-jui Hsieh, Bo Dai

    Abstract: In this paper, we propose a new method to overcome catastrophic forgetting by adding generative regularization to Bayesian inference framework. Bayesian method provides a general framework for continual learning. We could further construct a generative regularization term for all given classification models by leveraging energy-based models and Langevin-dynamic sampling to enrich the features lear…

    Submitted 19 June, 2021; v1 submitted 3 December, 2019; originally announced December 2019.

  38. arXiv:1910.14265  [pdf, other]

    cs.LG stat.ML

    Energy-Inspired Models: Learning with Sampler-Induced Distributions

    Authors: Dieterich Lawson, George Tucker, Bo Dai, Rajesh Ranganath

    Abstract: Energy-based models (EBMs) are powerful probabilistic models, but suffer from intractable sampling and density evaluation due to the partition function. As a result, inference in EBMs relies on approximate sampling algorithms, leading to a mismatch between the model and inference. Motivated by this, we consider the sampler-induced distribution as the model of interest and maximize the likelihood o…

    Submitted 9 January, 2020; v1 submitted 31 October, 2019; originally announced October 2019.

    Comments: Presented at the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada

  39. arXiv:1906.04733  [pdf, other]

    cs.LG cs.AI stat.ML

    DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections

    Authors: Ofir Nachum, Yinlam Chow, Bo Dai, Lihong Li

    Abstract: In many real-world reinforcement learning applications, access to the environment is limited to a fixed dataset, instead of direct (online) interaction with the environment. When using this data for either evaluation or training of a new policy, accurate estimates of discounted stationary distribution ratios -- correction terms which quantify the likelihood that the new policy will experience a ce…

    Submitted 4 November, 2019; v1 submitted 10 June, 2019; originally announced June 2019.

    Comments: Appearing in NeurIPS 2019 Vancouver, Canada

  40. arXiv:1906.00291  [pdf, other]

    cs.LG stat.ML

    Cooperative neural networks (CoNN): Exploiting prior independence structure for improved classification

    Authors: Harsh Shrivastava, Eugene Bart, Bob Price, Hanjun Dai, Bo Dai, Srinivas Aluru

    Abstract: We propose a new approach, called cooperative neural networks (CoNN), which uses a set of cooperatively trained neural networks to capture latent representations that exploit prior given independence structure. The model is more flexible than traditional graphical models based on exponential family distributions, but incorporates more domain specific prior structure than traditional deep networks…

    Submitted 1 June, 2019; originally announced June 2019.

  41. arXiv:1905.10432  [pdf, other]

    stat.ME

    Cross validation approaches for penalized Cox regression

    Authors: Biyue Dai, Patrick Breheny

    Abstract: Cross validation is commonly used for selecting tuning parameters in penalized regression, but its use in penalized Cox regression models has received relatively little attention in the literature. Due to its partial likelihood construction, carrying out cross validation for Cox models is not straightforward, and there are several potential approaches for implementation. Here, we propose two new c…

    Submitted 24 May, 2019; originally announced May 2019.

    Comments: 13 pages, 6 figures

  42. arXiv:1904.12083  [pdf, other]

    cs.LG stat.CO stat.ML

    Exponential Family Estimation via Adversarial Dynamics Embedding

    Authors: Bo Dai, Zhen Liu, Hanjun Dai, Niao He, Arthur Gretton, Le Song, Dale Schuurmans

    Abstract: We present an efficient algorithm for maximum likelihood estimation (MLE) of exponential family models, with a general parametrization of the energy function that includes neural networks. We exploit the primal-dual view of the MLE with a kinetics augmented model to obtain an estimate associated with an adversarial dual sampler. To represent this sampler, we introduce a novel neural architecture,…

    Submitted 30 March, 2020; v1 submitted 26 April, 2019; originally announced April 2019.

    Comments: Appearing in NeurIPS 2019 Vancouver, Canada; a preliminary version published in NeurIPS2018 Bayesian Deep Learning Workshop

  43. arXiv:1903.05789  [pdf, other]

    cs.LG cs.CV stat.ML

    Diagnosing and Enhancing VAE Models

    Authors: Bin Dai, David Wipf

    Abstract: Although variational autoencoders (VAEs) represent a widely influential deep generative model, many aspects of the underlying energy function remain poorly understood. In particular, it is commonly believed that Gaussian encoder/decoder assumptions reduce the effectiveness of VAEs in generating realistic samples. In this regard, we rigorously analyze the VAE objective, differentiating situations w…

    Submitted 30 October, 2019; v1 submitted 13 March, 2019; originally announced March 2019.

  44. arXiv:1903.00070  [pdf, other]

    cs.LG cs.RO stat.ML

    Learning to Plan in High Dimensions via Neural Exploration-Exploitation Trees

    Authors: Binghong Chen, Bo Dai, Qinjie Lin, Guo Ye, Han Liu, Le Song

    Abstract: We propose a meta path planning algorithm named Neural Exploration-Exploitation Trees (NEXT) for learning from prior experience for solving new path planning problems in high dimensional continuous state and action spaces. Compared to more classical sampling-based methods like RRT, our approach achieves much better sample efficiency in high dimensions and can benefit from prior experience o…

    Submitted 23 February, 2020; v1 submitted 28 February, 2019; originally announced March 2019.

    Comments: 26 pages, 74 figures, ICLR 2020 spotlight

  45. arXiv:1812.09584  [pdf, other]

    cs.LG stat.ML

    Meta Architecture Search

    Authors: Albert Shaw, Wei Wei, Weiyang Liu, Le Song, Bo Dai

    Abstract: Neural Architecture Search (NAS) has been quite successful in constructing state-of-the-art models on a variety of tasks. Unfortunately, the computational cost can make it difficult to scale. In this paper, we make the first attempt to study Meta Architecture Search which aims at learning a task-agnostic representation that can be used to speed up the process of architecture search on a large numb…

    Submitted 15 November, 2019; v1 submitted 22 December, 2018; originally announced December 2018.

    Comments: 11 pages, 4 figures, 4 tables, 4 pages of appendix; NeurIPS 2019

  46. arXiv:1811.02228  [pdf, other]

    cs.LG stat.ML

    Kernel Exponential Family Estimation via Doubly Dual Embedding

    Authors: Bo Dai, Hanjun Dai, Arthur Gretton, Le Song, Dale Schuurmans, Niao He

    Abstract: We investigate penalized maximum log-likelihood estimation for exponential family distributions whose natural parameter resides in a reproducing kernel Hilbert space. Key to our approach is a novel technique, doubly dual embedding, that avoids computation of the partition function. This technique also allows the development of a flexible sampling strategy that amortizes the cost of Monte-Carlo sam…

    Submitted 24 April, 2019; v1 submitted 6 November, 2018; originally announced November 2018.

    Comments: 22 pages, 20 figures; AISTATS 2019

  47. arXiv:1811.01213  [pdf, other]

    cs.LG cs.CR stat.ML

    Learning to Defend by Learning to Attack

    Authors: Haoming Jiang, Zhehui Chen, Yuyang Shi, Bo Dai, Tuo Zhao

    Abstract: Adversarial training provides a principled approach for training robust neural networks. From an optimization perspective, adversarial training is essentially solving a bilevel optimization problem. The leader problem is trying to learn a robust classifier, while the follower problem is trying to generate adversarial samples. Unfortunately, such a bilevel problem is difficult to solve due to its h…

    Submitted 2 May, 2021; v1 submitted 3 November, 2018; originally announced November 2018.

  48. arXiv:1808.03749  [pdf, other]

    cs.LG cs.CV stat.ML

    Neural Network Encapsulation

    Authors: Hongyang Li, Xiaoyang Guo, Bo Dai, Wanli Ouyang, Xiaogang Wang

    Abstract: A capsule is a collection of neurons which represents different variants of a pattern in the network. The routing scheme ensures only certain capsules which resemble lower counterparts in the higher layer should be activated. However, the computational complexity becomes a bottleneck for scaling up to larger networks, as lower capsules need to correspond to each and every higher capsule. To resolv…

    Submitted 11 August, 2018; originally announced August 2018.

    Comments: ECCV 2018

  49. arXiv:1807.09958  [pdf, other]

    cs.CV cs.LG stat.ML

    Rethinking the Form of Latent States in Image Captioning

    Authors: Bo Dai, Deming Ye, Dahua Lin

    Abstract: RNNs and their variants have been widely adopted for image captioning. In RNNs, the production of a caption is driven by a sequence of latent states. Existing captioning models usually represent latent states as vectors, taking this practice for granted. We rethink this choice and study an alternative formulation, namely using two-dimensional maps to encode latent states. This is motivated by the…

    Submitted 26 July, 2018; originally announced July 2018.

    Comments: ECCV 2018, first two authors contribute equally

  50. arXiv:1807.08237  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning Deep Hidden Nonlinear Dynamics from Aggregate Data

    Authors: Yisen Wang, Bo Dai, Lingkai Kong, Sarah Monazam Erfani, James Bailey, Hongyuan Zha

    Abstract: Learning nonlinear dynamics from diffusion data is a challenging problem since the individuals observed may be different at different time points, generally following an aggregate behaviour. Existing work cannot handle the tasks well since they model such dynamics either directly on observations or enforce the availability of complete longitudinal individual-level trajectories. However, in most of…

    Submitted 29 July, 2018; v1 submitted 22 July, 2018; originally announced July 2018.

    Comments: In Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI), 2018