-
CLEAR: Can Language Models Really Understand Causal Graphs?
Authors:
Sirui Chen,
Mengying Xu,
Kun Wang,
Xingyu Zeng,
Rui Zhao,
Shengjie Zhao,
Chaochao Lu
Abstract:
Causal reasoning is a cornerstone of how humans interpret the world. To model and reason about causality, causal graphs offer a concise yet effective solution. Given the impressive advancements in language models, a crucial question arises: can they really understand causal graphs? To this end, we pioneer an investigation into language models' understanding of causal graphs. Specifically, we develop a framework to define causal graph understanding, by assessing language models' behaviors through four practical criteria derived from diverse disciplines (e.g., philosophy and psychology). We then develop CLEAR, a novel benchmark that defines three complexity levels and encompasses 20 causal graph-based tasks across these levels. Finally, based on our framework and benchmark, we conduct extensive experiments on six leading language models and summarize five empirical findings. Our results indicate that while language models demonstrate a preliminary understanding of causal graphs, significant potential for improvement remains. Our project website is at https://github.com/OpenCausaLab/CLEAR.
Submitted 24 June, 2024;
originally announced June 2024.
-
Multiscale Tests for Point Processes and Longitudinal Networks
Authors:
Youmeng Jiang,
Min Xu
Abstract:
We propose a new testing framework applicable to both the two-sample problem on point processes and the community detection problem on rectangular arrays of point processes, which we refer to as longitudinal networks; the latter problem is useful in situations where we observe interactions among a group of individuals over time. Our framework is based on a multiscale discretization scheme that considers not just the global null but also a collection of nulls local to small regions in the domain; in the two-sample problem, the local rejections tell us where the intensity functions differ, and in the longitudinal network problem, the local rejections tell us when the community structure is most salient. We provide theoretical analysis for the two-sample problem and show that our method has minimax optimal power under a Hölder continuity condition. We provide extensive simulation and real data analysis demonstrating the practicality of our proposed method.
Submitted 5 June, 2024;
originally announced June 2024.
-
Optimal convex $M$-estimation via score matching
Authors:
Oliver Y. Feng,
Yu-Chun Kao,
Min Xu,
Richard J. Samworth
Abstract:
In the context of linear regression, we construct a data-driven convex loss function with respect to which empirical risk minimisation yields optimal asymptotic variance in the downstream estimation of the regression coefficients. Our semiparametric approach targets the best decreasing approximation of the derivative of the log-density of the noise distribution. At the population level, this fitting process is a nonparametric extension of score matching, corresponding to a log-concave projection of the noise distribution with respect to the Fisher divergence. The procedure is computationally efficient, and we prove that our procedure attains the minimal asymptotic covariance among all convex $M$-estimators. As an example of a non-log-concave setting, for Cauchy errors, the optimal convex loss function is Huber-like, and our procedure yields an asymptotic efficiency greater than 0.87 relative to the oracle maximum likelihood estimator of the regression coefficients that uses knowledge of this error distribution; in this sense, we obtain robustness without sacrificing much efficiency. Numerical experiments confirm the practical merits of our proposal.
Submitted 25 March, 2024;
originally announced March 2024.
-
CaMU: Disentangling Causal Effects in Deep Model Unlearning
Authors:
Shaofei Shen,
Chenhao Zhang,
Alina Bialkowski,
Weitong Chen,
Miao Xu
Abstract:
Machine unlearning requires removing the information of forgetting data while keeping the necessary information of remaining data. Despite recent advancements in this area, existing methodologies mainly focus on the effect of removing forgetting data without considering the negative impact this can have on the information of the remaining data, resulting in significant performance degradation after data removal. Although some methods try to repair the performance of remaining data after removal, the forgotten information can also return after repair. Such an issue is due to the intricate intertwining of the forgetting and remaining data. Without adequately differentiating the influence of these two kinds of data on the model, existing algorithms take the risk of either inadequate removal of the forgetting data or unnecessary loss of valuable information from the remaining data. To address this shortcoming, the present study undertakes a causal analysis of the unlearning and introduces a novel framework termed Causal Machine Unlearning (CaMU). This framework adds intervention on the information of remaining data to disentangle the causal effects between forgetting data and remaining data. Then CaMU eliminates the causal impact associated with forgetting data while concurrently preserving the causal relevance of the remaining data. Comprehensive empirical results on various datasets and models suggest that CaMU enhances performance on the remaining data and effectively minimizes the influences of forgetting data. Notably, this work is the first to interpret deep model unlearning tasks from a new perspective of causality and provide a solution based on causal analysis, which opens up new possibilities for future research in deep model unlearning.
Submitted 30 January, 2024;
originally announced January 2024.
-
Efficient Estimation of the Central Mean Subspace via Smoothed Gradient Outer Products
Authors:
Gan Yuan,
Mingyue Xu,
Samory Kpotufe,
Daniel Hsu
Abstract:
We consider the problem of sufficient dimension reduction (SDR) for multi-index models. The estimators of the central mean subspace in prior works either have slow (non-parametric) convergence rates, or rely on stringent distributional conditions (e.g., the covariate distribution $P_{\mathbf{X}}$ being elliptically symmetric). In this paper, we show that a fast parametric convergence rate of the form $C_d \cdot n^{-1/2}$ is achievable via estimating the \emph{expected smoothed gradient outer product}, for a general class of distributions $P_{\mathbf{X}}$ admitting Gaussian or heavier distributions. When the link function is a polynomial with a degree of at most $r$ and $P_{\mathbf{X}}$ is the standard Gaussian, we show that the prefactor depends on the ambient dimension $d$ as $C_d \propto d^r$.
Submitted 24 December, 2023;
originally announced December 2023.
-
Identifiable and interpretable nonparametric factor analysis
Authors:
Maoran Xu,
Amy H. Herring,
David B. Dunson
Abstract:
Factor models have been widely used to summarize the variability of high-dimensional data through a set of factors with much lower dimensionality. Gaussian linear factor models have been particularly popular due to their interpretability and ease of computation. However, in practice, data often violate the multivariate Gaussian assumption. To characterize higher-order dependence and nonlinearity, models that include factors as predictors in flexible multivariate regression are popular, with GP-LVMs using Gaussian process (GP) priors for the regression function and VAEs using deep neural networks. Unfortunately, such approaches lack identifiability and interpretability and tend to produce brittle and non-reproducible results. To address these problems by simplifying the nonparametric factor model while maintaining flexibility, we propose the NIFTY framework, which parsimoniously transforms uniform latent variables using one-dimensional nonlinear mappings and then applies a linear generative model. The induced multivariate distribution falls into a flexible class while maintaining simple computation and interpretation. We prove that this model is identifiable and empirically study NIFTY using simulated data, observing good performance in density estimation and data visualization. We then apply NIFTY to bird song data in an environmental monitoring application.
Submitted 14 November, 2023;
originally announced November 2023.
-
Scaling Riemannian Diffusion Models
Authors:
Aaron Lou,
Minkai Xu,
Stefano Ermon
Abstract:
Riemannian diffusion models draw inspiration from standard Euclidean space diffusion models to learn distributions on general manifolds. Unfortunately, the additional geometric complexity renders the diffusion transition term inexpressible in closed form, so prior methods resort to imprecise approximations of the score matching training objective that degrade performance and preclude applications in high dimensions. In this work, we reexamine these approximations and propose several practical improvements. Our key observation is that most relevant manifolds are symmetric spaces, which are much more amenable to computation. By leveraging and combining various ansätze, we can quickly compute relevant quantities to high precision. On low dimensional datasets, our correction produces a noticeable improvement, allowing diffusion to compete with other methods. Additionally, we show that our method enables us to scale to high dimensional tasks on nontrivial manifolds. In particular, we model QCD densities on $SU(n)$ lattices and contrastively learned embeddings on high dimensional hyperspheres.
Submitted 30 October, 2023;
originally announced October 2023.
-
A Comparison between Markov Switching Zero-inflated and Hurdle Models for Spatio-temporal Infectious Disease Counts
Authors:
Mingchi Xu,
Dirk Douwes-Schultz,
Alexandra M. Schmidt
Abstract:
In epidemiological studies, zero-inflated and hurdle models are commonly used to handle excess zeros in reported infectious disease cases. However, they cannot model the persistence (from presence to presence) and reemergence (from absence to presence) of a disease separately. Covariates can sometimes have different effects on the reemergence and persistence of a disease. Recently, a zero-inflated Markov switching negative binomial model was proposed to accommodate this issue. We present a Markov switching negative binomial hurdle model as a competitor of that approach, as hurdle models are often also used as alternatives to zero-inflated models for accommodating excess zeros. We begin the comparison by inspecting the underlying assumptions made by both models. Hurdle models assume perfect detection of the disease cases while zero-inflated models implicitly assume the case counts can be under-reported; we therefore investigate when a negative binomial distribution can approximate the true distribution of reported counts. A comparison of the fit of the two types of Markov switching models is undertaken on chikungunya cases across the neighborhoods of Rio de Janeiro. We find that, among the fitted models, the Markov switching negative binomial zero-inflated model produces the best predictions and both Markov switching models produce remarkably better predictions than more traditional negative binomial hurdle and zero-inflated models.
Submitted 8 September, 2023;
originally announced September 2023.
-
Optimal Rate of Kernel Regression in Large Dimensions
Authors:
Weihao Lu,
Haobo Zhang,
Yicheng Li,
Manyun Xu,
Qian Lin
Abstract:
We perform a study on kernel regression for large-dimensional data (where the sample size $n$ depends polynomially on the dimension $d$ of the samples, i.e., $n\asymp d^γ$ for some $γ>0$). We first build a general tool to characterize the upper bound and the minimax lower bound of kernel regression for large-dimensional data through the Mendelson complexity $\varepsilon_{n}^{2}$ and the metric entropy $\bar{\varepsilon}_{n}^{2}$ respectively. When the target function falls into the RKHS associated with a (general) inner product model defined on $\mathbb{S}^{d}$, we utilize the new tool to show that the minimax rate of the excess risk of kernel regression is $n^{-1/2}$ when $n\asymp d^γ$ for $γ=2, 4, 6, 8, \cdots$. We then further determine the optimal rate of the excess risk of kernel regression for all $γ>0$ and find that the curve of optimal rate varying along $γ$ exhibits several new phenomena including the multiple descent behavior and the periodic plateau behavior. As an application, for the neural tangent kernel (NTK), we also provide a similar explicit description of the curve of optimal rate. As a direct corollary, we know these claims hold for wide neural networks as well.
Submitted 28 June, 2024; v1 submitted 8 September, 2023;
originally announced September 2023.
-
Regret Lower Bounds in Multi-agent Multi-armed Bandit
Authors:
Mengfan Xu,
Diego Klabjan
Abstract:
Multi-armed Bandit motivates methods with provable upper bounds on regret, and the counterpart lower bounds have also been extensively studied in this context. Recently, Multi-agent Multi-armed Bandit has gained significant traction in various domains, where individual clients face bandit problems in a distributed manner and the objective is the overall system performance, typically measured by regret. While efficient algorithms with regret upper bounds have emerged, limited attention has been given to the corresponding regret lower bounds, except for a recent lower bound for adversarial settings, which, however, has a gap with known upper bounds. To this end, we herein provide the first comprehensive study on regret lower bounds across different settings and establish their tightness. Specifically, when the graphs exhibit good connectivity properties and the rewards are stochastically distributed, we demonstrate a lower bound of order $O(\log T)$ for instance-dependent bounds and $\sqrt{T}$ for mean-gap independent bounds which are tight. Assuming adversarial rewards, we establish a lower bound $O(T^{\frac{2}{3}})$ for connected graphs, thereby bridging the gap between the lower and upper bound in the prior work. We also show a linear regret lower bound when the graph is disconnected. While previous works have explored these settings with upper bounds, we provide a thorough study on tight lower bounds.
Submitted 15 August, 2023;
originally announced August 2023.
-
Distributed Semi-Supervised Sparse Statistical Inference
Authors:
Jiyuan Tu,
Weidong Liu,
Xiaojun Mao,
Mingyue Xu
Abstract:
The debiased estimator is a crucial tool in statistical inference for high-dimensional model parameters. However, constructing such an estimator involves estimating the high-dimensional inverse Hessian matrix, incurring significant computational costs. This challenge becomes particularly acute in distributed setups, where traditional methods necessitate computing a debiased estimator on every machine. This becomes unwieldy, especially with a large number of machines. In this paper, we delve into semi-supervised sparse statistical inference in a distributed setup. An efficient multi-round distributed debiased estimator, which integrates both labeled and unlabeled data, is developed. We will show that the additional unlabeled data helps to improve the statistical rate of each round of iteration. Our approach offers tailored debiasing methods for $M$-estimation and generalized linear models according to the specific form of the loss function. Our method also applies to a non-smooth loss like absolute deviation loss. Furthermore, our algorithm is computationally efficient since it requires only one estimation of a high-dimensional inverse covariance matrix. We demonstrate the effectiveness of our method by presenting simulation studies and real data applications that highlight the benefits of incorporating unlabeled data.
Submitted 15 December, 2023; v1 submitted 17 June, 2023;
originally announced June 2023.
-
Decentralized Randomly Distributed Multi-agent Multi-armed Bandit with Heterogeneous Rewards
Authors:
Mengfan Xu,
Diego Klabjan
Abstract:
We study a decentralized multi-agent multi-armed bandit problem in which multiple clients are connected by time dependent random graphs provided by an environment. The reward distributions of each arm vary across clients and rewards are generated independently over time by an environment based on distributions that include both sub-exponential and sub-gaussian distributions. Each client pulls an arm and communicates with neighbors based on the graph provided by the environment. The goal is to minimize the overall regret of the entire system through collaborations. To this end, we introduce a novel algorithmic framework, which first provides robust simulation methods for generating random graphs using rapidly mixing Markov chains or the random graph model, and then combines an averaging-based consensus approach with a newly proposed weighting technique and the upper confidence bound to deliver a UCB-type solution. Our algorithms account for the randomness in the graphs, removing the conventional doubly stochasticity assumption, and only require the knowledge of the number of clients at initialization. We derive optimal instance-dependent regret upper bounds of order $\log{T}$ in both sub-gaussian and sub-exponential environments, and a nearly optimal mean-gap independent regret upper bound of order $\sqrt{T}\log T$ up to a $\log T$ factor. Importantly, our regret bounds hold with high probability and capture graph randomness, whereas prior works consider expected regret under assumptions and require more stringent reward distributions.
Submitted 17 October, 2023; v1 submitted 8 June, 2023;
originally announced June 2023.
-
A Bayesian Collocation Integral Method for Parameter Estimation in Ordinary Differential Equations
Authors:
Mingwei Xu,
Samuel W. K. Wong,
Peijun Sang
Abstract:
Inferring the parameters of ordinary differential equations (ODEs) from noisy observations is an important problem in many scientific fields. Currently, most parameter estimation methods that bypass numerical integration tend to rely on basis functions or Gaussian processes to approximate the ODE solution and its derivatives. Due to the sensitivity of the ODE solution to its derivatives, these methods can be hindered by estimation error, especially when only sparse time-course observations are available. We present a Bayesian collocation framework that operates on the integrated form of the ODEs and also avoids the expensive use of numerical solvers. Our methodology has the capability to handle general nonlinear ODE systems. We demonstrate the accuracy of the proposed method through simulation studies, where the estimated parameters and recovered system trajectories are compared with other recent methods. A real data example is also provided.
Submitted 23 October, 2023; v1 submitted 4 April, 2023;
originally announced April 2023.
-
Choosing the $p$ in $L_p$ loss: rate adaptivity on the symmetric location problem
Authors:
Yu-Chun Kao,
Min Xu,
Cun-Hui Zhang
Abstract:
Given univariate random variables $Y_1, \ldots, Y_n$ with the $\text{Uniform}(θ_0 - 1, θ_0 + 1)$ distribution, the sample midrange $\frac{Y_{(n)}+Y_{(1)}}{2}$ is the MLE for $θ_0$ and estimates $θ_0$ with error of order $1/n$, which is much smaller than the $1/\sqrt{n}$ error rate of the usual sample mean estimator. However, the sample midrange performs poorly when the data has, say, the Gaussian $N(θ_0, 1)$ distribution, with an error rate of $1/\sqrt{\log n}$. In this paper, we propose an estimator of the location $θ_0$ with a rate of convergence that can, in many settings, adapt to the underlying distribution, which we assume to be symmetric around $θ_0$ but otherwise unknown. When the underlying distribution is compactly supported, we show that our estimator attains a rate of convergence of $n^{-\frac{1}α}$ up to polylog factors, where the rate parameter $α$ can take on any value in $(0, 2]$ and depends on the moments of the underlying distribution. Our estimator is formed by the $\ell^γ$-center of the data, for a $γ\geq2$ chosen in a data-driven way -- by minimizing a criterion motivated by the asymptotic variance. Our approach can be directly applied to the regression setting where $θ_0$ is a function of observed features and motivates the use of the $\ell^γ$ loss function for $γ> 2$ in certain settings.
Submitted 16 August, 2023; v1 submitted 3 March, 2023;
originally announced March 2023.
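The rate comparison in the abstract above can be illustrated with a small simulation. This is only a sketch and is not taken from the paper: the helper names (`midrange`, `rmse`, `compare`), the sample size, the number of trials, and the seed are all assumptions chosen for illustration. It compares the sample midrange with the sample mean, first on Uniform(θ_0 - 1, θ_0 + 1) data and then on N(θ_0, 1) data.

```python
import random
import statistics


def midrange(sample):
    """Sample midrange: average of the smallest and largest observation."""
    return (min(sample) + max(sample)) / 2


def rmse(estimates, truth):
    """Root mean squared error of a list of estimates around the true value."""
    return (sum((e - truth) ** 2 for e in estimates) / len(estimates)) ** 0.5


def compare(draw, theta0=0.0, n=1000, trials=200, seed=0):
    """Return (midrange RMSE, mean RMSE) over repeated samples of size n.

    `draw(rng, theta0)` is a hypothetical callback producing one observation.
    """
    rng = random.Random(seed)
    mids, means = [], []
    for _ in range(trials):
        sample = [draw(rng, theta0) for _ in range(n)]
        mids.append(midrange(sample))
        means.append(statistics.fmean(sample))
    return rmse(mids, theta0), rmse(means, theta0)


def draw_uniform(rng, theta0):
    # One draw from Uniform(theta0 - 1, theta0 + 1)
    return rng.uniform(theta0 - 1, theta0 + 1)


def draw_gaussian(rng, theta0):
    # One draw from N(theta0, 1)
    return rng.gauss(theta0, 1.0)


uniform_mid, uniform_mean = compare(draw_uniform)
gauss_mid, gauss_mean = compare(draw_gaussian)

print(f"uniform:  midrange RMSE={uniform_mid:.5f}, mean RMSE={uniform_mean:.5f}")
print(f"gaussian: midrange RMSE={gauss_mid:.5f}, mean RMSE={gauss_mean:.5f}")
```

Under these assumed settings the midrange should have the smaller error on uniform data and the larger error on Gaussian data, which is the contrast that motivates an estimator adapting to the underlying distribution.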
-
Generalization Ability of Wide Neural Networks on $\mathbb{R}$
Authors:
Jianfa Lai,
Manyun Xu,
Rui Chen,
Qian Lin
Abstract:
We perform a study on the generalization ability of the wide two-layer ReLU neural network on $\mathbb{R}$. We first establish some spectral properties of the neural tangent kernel (NTK): $a)$ $K_{d}$, the NTK defined on $\mathbb{R}^{d}$, is positive definite; $b)$ $λ_{i}(K_{1})$, the $i$-th largest eigenvalue of $K_{1}$, is proportional to $i^{-2}$. We then show that: $i)$ when the width $m\rightarrow\infty$, the neural network kernel (NNK) uniformly converges to the NTK; $ii)$ the minimax rate of regression over the RKHS associated to $K_{1}$ is $n^{-2/3}$; $iii)$ if one adopts the early stopping strategy in training a wide neural network, the resulting neural network achieves the minimax rate; $iv)$ if one trains the neural network till it overfits the data, the resulting neural network cannot generalize well. Finally, we provide an explanation to reconcile our theory and the widely observed ``benign overfitting phenomenon''.
Submitted 12 February, 2023;
originally announced February 2023.
-
Balancing Approach for Causal Inference at Scale
Authors:
Sicheng Lin,
Meng Xu,
Xi Zhang,
Shih-Kang Chao,
Ying-Kai Huang,
Xiaolin Shi
Abstract:
With modern software and online platforms collecting massive amounts of data, there is an increasing demand for applying causal inference methods at large scale when randomized experimentation is not viable. Weighting methods that directly incorporate covariate balancing have recently gained popularity for estimating causal effects in observational studies. These methods reduce the manual effort required of researchers to iterate between propensity score modeling and balance checking until a satisfactory covariate balance is reached. However, conventional solvers for determining weights lack the scalability to apply such methods on large scale datasets in companies like Snap Inc. To address the limitations and improve computational efficiency, in this paper we present scalable algorithms, DistEB and DistMS, for two balancing approaches: entropy balancing and MicroSynth. The solvers have linear time complexity and can be conveniently implemented in distributed computing frameworks such as Spark, Hive, etc. We study the properties of balancing approaches at different scales up to 1 million treated units and 487 covariates. We find that with larger sample size, both bias and variance in the causal effect estimation are significantly reduced. The results emphasize the importance of applying balancing approaches on large scale datasets. We combine the balancing approach with a synthetic control framework and deploy an end-to-end system for causal impact estimation at Snap Inc.
Submitted 3 August, 2023; v1 submitted 10 February, 2023;
originally announced February 2023.
-
Pareto Regret Analyses in Multi-objective Multi-armed Bandit
Authors:
Mengfan Xu,
Diego Klabjan
Abstract:
We study Pareto optimality in multi-objective multi-armed bandit by providing a formulation of adversarial multi-objective multi-armed bandit and defining its Pareto regrets that can be applied to both stochastic and adversarial settings. The regrets do not rely on any scalarization functions and reflect Pareto optimality compared to scalarized regrets. We also present new algorithms for the cases with and without prior information on the multi-objective multi-armed bandit setting. The algorithms are shown to be optimal in adversarial settings and, simultaneously, nearly optimal up to a logarithmic factor in stochastic settings by our established upper bounds and lower bounds on Pareto regrets. Moreover, the lower bound analyses show that the new regrets are consistent with the existing Pareto regret for stochastic settings and extend an adversarial attack mechanism from bandit to the multi-objective one.
Submitted 30 May, 2023; v1 submitted 1 December, 2022;
originally announced December 2022.
-
GLS under Monotone Heteroskedasticity
Authors:
Yoichi Arai,
Taisuke Otsu,
Mengshan Xu
Abstract:
The generalized least square (GLS) is one of the most basic tools in regression analyses. A major issue in implementing the GLS is estimation of the conditional variance function of the error term, which typically requires a restrictive functional form assumption for parametric estimation or smoothing parameters for nonparametric estimation. In this paper, we propose an alternative approach to estimate the conditional variance function under nonparametric monotonicity constraints by utilizing the isotonic regression method. Our GLS estimator is shown to be asymptotically equivalent to the infeasible GLS estimator with knowledge of the conditional error variance, and involves only some tuning to trim boundary observations, not only for point estimation but also for interval estimation or hypothesis testing. Our analysis extends the scope of the isotonic regression method by showing that the isotonic estimates, possibly with generated variables, can be employed as first stage estimates to be plugged in for semiparametric objects. Simulation studies illustrate excellent finite sample performances of the proposed method. As an empirical example, we revisit Acemoglu and Restrepo's (2017) study on the relationship between an aging population and economic growth to illustrate how our GLS estimator effectively reduces estimation errors.
Submitted 22 January, 2024; v1 submitted 25 October, 2022;
originally announced October 2022.
-
Positive-Unlabeled Learning using Random Forests via Recursive Greedy Risk Minimization
Authors:
Jonathan Wilton,
Abigail M. Y. Koay,
Ryan K. L. Ko,
Miao Xu,
Nan Ye
Abstract:
The need to learn from positive and unlabeled data, or PU learning, arises in many applications and has attracted increasing interest. While random forests are known to perform well on many tasks with positive and negative data, recent PU algorithms are generally based on deep neural networks, and the potential of tree-based PU learning is under-explored. In this paper, we propose new random forest algorithms for PU learning. Key to our approach is a new interpretation of decision tree algorithms for positive and negative data as recursive greedy risk minimization algorithms. We extend this perspective to the PU setting to develop new decision tree learning algorithms that directly minimize PU-data based estimators for the expected risk. This allows us to develop an efficient PU random forest algorithm, PU extra trees. Our approach features three desirable properties: it is robust to the choice of the loss function in the sense that various loss functions lead to the same decision trees; it requires little hyperparameter tuning as compared to neural network based PU learning; it supports a feature importance that directly measures a feature's contribution to risk minimization. Our algorithms demonstrate strong performance on several datasets. Our code is available at https://github.com/puetpaper/PUExtraTrees.
Submitted 16 October, 2022;
originally announced October 2022.
-
Statistical Modeling of Data Breach Risks: Time to Identification and Notification
Authors:
Maochao Xu,
Quynh Nhu Nguyen
Abstract:
It is very challenging to predict the cost of a cyber incident owing to the complex nature of cyber risk. However, it is unavoidable for insurance companies that offer cyber insurance policies. The time to identifying an incident and the time to notifying the affected individuals are two important components in determining the cost of a cyber incident. In this work, we initiate the study of those two metrics via statistical modeling approaches. In particular, we propose a novel approach to imputing the missing data, and further develop a dependence model to capture the complex pattern exhibited by those two metrics. The empirical study shows that the proposed approach has a satisfactory predictive performance and is superior to other commonly used models.
Submitted 24 September, 2022; v1 submitted 15 September, 2022;
originally announced September 2022.
-
Isotonic propensity score matching
Authors:
Mengshan Xu,
Taisuke Otsu
Abstract:
We propose a one-to-many matching estimator of the average treatment effect based on propensity scores estimated by isotonic regression. The method relies on the monotonicity assumption on the propensity score function, which can be justified in many applications in economics. We show that the nature of the isotonic estimator can help us to fix many problems of existing matching methods, including efficiency, choice of the number of matches, choice of tuning parameters, robustness to propensity score misspecification, and bootstrap validity. As a by-product, a uniformly consistent isotonic estimator is developed for our proposed matching method.
Submitted 18 July, 2022;
originally announced July 2022.
-
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Authors:
Aarohi Srivastava,
Abhinav Rastogi,
Abhishek Rao,
Abu Awal Md Shoeb,
Abubakar Abid,
Adam Fisch,
Adam R. Brown,
Adam Santoro,
Aditya Gupta,
Adrià Garriga-Alonso,
Agnieszka Kluska,
Aitor Lewkowycz,
Akshat Agarwal,
Alethea Power,
Alex Ray,
Alex Warstadt,
Alexander W. Kocurek,
Ali Safaya,
Ali Tazarv,
Alice Xiang,
Alicia Parrish,
Allen Nie,
Aman Hussain,
Amanda Askell,
Amanda Dsouza
, et al. (426 additional authors not shown)
Abstract:
Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 450 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.
Submitted 12 June, 2023; v1 submitted 9 June, 2022;
originally announced June 2022.
-
Modeling Ride-Sourcing Matching and Pickup Processes based on Additive Gaussian Process Models
Authors:
Zheng Zhu,
Meng Xu,
Yining Di,
Xiqun Chen,
Jingru Yu
Abstract:
Matching and pickup processes are core features of ride-sourcing services. Previous studies have adopted abundant analytical models to depict the two processes and obtain operational insights; while the goodness of fit between models and data was dismissed. To simultaneously consider the fitness between models and data and analytically tractable formations, we propose a data-driven approach based on the additive Gaussian Process Model (AGPM) for ride-sourcing market modeling. The framework is tested based on real-world data collected in Hangzhou, China. We fit analytical models, machine learning models, and AGPMs, in which the number of matches or pickups are used as outputs and spatial, temporal, demand, and supply covariates are utilized as inputs. The results demonstrate the advantages of AGPMs in recovering the two processes in terms of estimation accuracy. Furthermore, we illustrate the modeling power of AGPM by utilizing the trained model to design and estimate idle vehicle relocation strategies.
Submitted 29 April, 2022;
originally announced April 2022.
-
Random Matrix Time Series
Authors:
Peiyuan Teng,
Min Xu
Abstract:
In this paper, a time series model with coefficients that take values from random matrix ensembles is proposed. Formal definitions, theoretical solutions, and statistical properties are derived. Estimation and forecast methodologies for random matrix time series are discussed with examples. Random matrix differential equations and potential applications of the time series model are suggested at the end.
Submitted 23 March, 2022;
originally announced March 2022.
-
GCF: Generalized Causal Forest for Heterogeneous Treatment Effect Estimation in Online Marketplace
Authors:
Shu Wan,
Chen Zheng,
Zhonggen Sun,
Mengfan Xu,
Xiaoqing Yang,
Hongtu Zhu,
Jiecheng Guo
Abstract:
Uplift modeling is a rapidly growing approach that utilizes causal inference and machine learning methods to directly estimate the heterogeneous treatment effects, which has been widely applied to various online marketplaces to assist large-scale decision-making in recent years. The existing popular models, like causal forest (CF), are limited to either discrete treatments or posing parametric assumptions on the outcome-treatment relationship that may suffer model misspecification. However, continuous treatments (e.g., price, duration) often arise in marketplaces. To alleviate these restrictions, we use a kernel-based doubly robust estimator to recover the non-parametric dose-response functions that can flexibly model continuous treatment effects. Moreover, we propose a generic distance-based splitting criterion to capture the heterogeneity for the continuous treatments. We call the proposed algorithm generalized causal forest (GCF) as it generalizes the use case of CF to a much broader setting. We show the effectiveness of GCF by deriving the asymptotic property of the estimator and comparing it to popular uplift modeling methods on both synthetic and real-world datasets. We implement GCF on Spark and successfully deploy it into a large-scale online pricing system at a leading ride-sharing company. Online A/B testing results further validate the superiority of GCF.
Submitted 23 September, 2022; v1 submitted 21 March, 2022;
originally announced March 2022.
-
Distribution and Determinants of Correlation between PM2.5 and O3 in China Mainland: Dynamitic simil-Hu Lines
Authors:
Chenru Chen,
Miaoqing Xu,
Shuyi Liu,
Dehai Zhu,
Jianyu Yang,
Bingbo Gao,
Ziyue Chen
Abstract:
In recent years, China has made great efforts to control air pollution. During the governance process, it is found that fine particulate matter (PM2.5) and ozone (O3) change in the same trend among some areas and the opposite in others, which brings some difficulties to take measures in a planned way. Therefore, this study adopted multi-year and large-scale air quality data to explore the distribution of correlation between PM2.5 and O3, and proposed a concept called dynamic similar hu lines to replace the single fixed division in the previous research. Furthermore, this study discussed the causes of distribution patterns quantitatively with geographical detector and random forest. The causes included natural factors and anthropogenic factors. And these factors could be divided into three parts according to the characteristics of spatial distribution: broadly changing with longitude, changing with latitude, and having local characteristics. Overall, regions with relatively more densely population, higher GDP, lower altitude, higher humidity, higher atmospheric pressure, higher surface temperature, less sunshine hours and more accumulated precipitation often corresponds to positive correlation coefficient between PM2.5 and O3, no matter in which season. The parts with opposite conditions that mentioned above are essentially negative correlation coefficient. And what's more, humidity, global surface temperature, air temperature and accumulated precipitation are four decisive factors to form the distribution of correlation between PM2.5 and O3. In general, collaborative governance of atmospheric pollutants should consider particular time and space background and also be based on the local actual socio-economic situations, geography and geomorphology, climate and meteorology and other comprehensive factors.
Submitted 30 September, 2022; v1 submitted 13 November, 2021;
originally announced November 2021.
-
Bayesian Inference using the Proximal Mapping: Uncertainty Quantification under Varying Dimensionality
Authors:
Maoran Xu,
Hua Zhou,
Yujie Hu,
Leo L. Duan
Abstract:
In statistical applications, it is common to encounter parameters supported on a varying or unknown dimensional space. Examples include the fused lasso regression, the matrix recovery under an unknown low rank, etc. Despite the ease of obtaining a point estimate via the optimization, it is much more challenging to quantify their uncertainty -- in the Bayesian framework, a major difficulty is that if assigning the prior associated with a $p$-dimensional measure, then there is zero posterior probability on any lower-dimensional subset with dimension $d<p$; to avoid this caveat, one needs to choose another dimension-selection prior on $d$, which often involves a highly combinatorial problem. To significantly reduce the modeling burden, we propose a new generative process for the prior: starting from a continuous random variable such as multivariate Gaussian, we transform it into a varying-dimensional space using the proximal mapping.
This leads to a large class of new Bayesian models that can directly exploit the popular frequentist regularizations and their algorithms, such as the nuclear norm penalty and the alternating direction method of multipliers, while providing a principled and probabilistic uncertainty estimation.
We show that this framework is well justified in the geometric measure theory, and enjoys a convenient posterior computation via the standard Hamiltonian Monte Carlo. We demonstrate its use in the analysis of the dynamic flow network data.
Submitted 2 October, 2022; v1 submitted 10 August, 2021;
originally announced August 2021.
-
Root and community inference on the latent growth process of a network
Authors:
Harry Crane,
Min Xu
Abstract:
Many existing statistical models for networks overlook the fact that many real world networks are formed through a growth process. To address this, we introduce the PAPER (Preferential Attachment Plus Erdős--Rényi) model for random networks, where we let a random network G be the union of a preferential attachment (PA) tree T and additional Erdős--Rényi (ER) random edges. The PA tree component captures the underlying growth/recruitment process of a network where vertices and edges are added sequentially, while the ER component can be regarded as random noise. Given only a single snapshot of the final network G, we study the problem of constructing confidence sets for the early history, in particular the root node, of the unobserved growth process; the root node can be patient zero in a disease infection network or the source of fake news in a social media network. We propose an inference algorithm based on Gibbs sampling that scales to networks with millions of nodes and provide theoretical analysis showing that the expected size of the confidence set is small so long as the noise level of the ER edges is not too large. We also propose variations of the model in which multiple growth processes occur simultaneously, reflecting the growth of multiple communities, and we use these models to provide a new approach to community detection.
Submitted 7 February, 2023; v1 submitted 30 June, 2021;
originally announced July 2021.
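The generative side of the PAPER model described in this abstract can be sketched as follows: grow a preferential attachment tree, then overlay independent Erdős–Rényi noise edges. This is a minimal sketch under stated assumptions. The function name, the plain degree-proportional attachment rule, and the parameters `n`, `p`, and `seed` are illustrative choices, not the authors' exact parameterization, and the Gibbs-sampling inference step is not shown.

```python
import random
from itertools import combinations
from typing import Set, Tuple

Edge = Tuple[int, int]


def sample_paper_graph(n: int, p: float, seed: int = 0) -> Tuple[Set[Edge], Set[Edge]]:
    """Return (tree_edges, noise_edges) for an n-node graph whose root is node 0."""
    rng = random.Random(seed)
    tree: Set[Edge] = set()
    # Each node appears in `endpoints` once per incident tree edge, so a uniform
    # draw from this list selects a parent with probability proportional to degree.
    endpoints = []
    for new in range(1, n):
        parent = 0 if new == 1 else rng.choice(endpoints)
        tree.add((parent, new))  # parent < new always holds
        endpoints += [parent, new]
    # Erdős–Rényi noise: every non-tree pair is added independently with probability p.
    noise = {
        (u, v)
        for u, v in combinations(range(n), 2)
        if (u, v) not in tree and rng.random() < p
    }
    return tree, noise


tree_edges, noise_edges = sample_paper_graph(n=50, p=0.05, seed=1)
observed_graph = tree_edges | noise_edges
```

In the inference problem only `observed_graph` would be available, with node labels shuffled, and the goal is a confidence set for the root. The quadratic loop over node pairs is for clarity only and would not scale to the network sizes mentioned in the abstract.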
-
Modeling Multivariate Cyber Risks: Deep Learning Dating Extreme Value Theory
Authors:
Mingyue Zhang Wu,
Jinzhu Luo,
Xing Fang,
Maochao Xu,
Peng Zhao
Abstract:
Modeling cyber risks has been an important but challenging task in the domain of cyber security. It is mainly because of the high dimensionality and heavy tails of risk patterns. Those obstacles have hindered the development of statistical modeling of the multivariate cyber risks. In this work, we propose a novel approach for modeling the multivariate cyber risks which relies on the deep learning and extreme value theory. The proposed model not only enjoys the high accurate point predictions via deep learning but also can provide the satisfactory high quantile prediction via extreme value theory. The simulation study shows that the proposed model can model the multivariate cyber risks very well and provide satisfactory prediction performances. The empirical evidence based on real honeypot attack data also shows that the proposed model has very satisfactory prediction performances.
Submitted 15 March, 2021;
originally announced March 2021.
-
Functional optimal transport: map estimation and domain adaptation for functional data
Authors:
Jiacheng Zhu,
Aritra Guha,
Dat Do,
Mengdi Xu,
XuanLong Nguyen,
Ding Zhao
Abstract:
We introduce a formulation of optimal transport problem for distributions on function spaces, where the stochastic map between functional domains can be partially represented in terms of an (infinite-dimensional) Hilbert-Schmidt operator mapping a Hilbert space of functions to another. For numerous machine learning tasks, data can be naturally viewed as samples drawn from spaces of functions, such as curves and surfaces, in high dimensions. Optimal transport for functional data analysis provides a useful framework of treatment for such domains. Since probability measures in infinite dimensional spaces generally lack absolute continuity (that is, with respect to non-degenerate Gaussian measures), the Monge map in the standard optimal transport theory for finite dimensional spaces may not exist. Our approach to the optimal transport problem in infinite dimensions is by a suitable regularization technique -- we restrict the class of transport maps to be a Hilbert-Schmidt space of operators. To this end, we develop an efficient algorithm for finding the stochastic transport map between functional domains and provide theoretical guarantees on the existence, uniqueness, and consistency of our estimate for the Hilbert-Schmidt operator. We validate our method on synthetic datasets and examine the functional properties of the transport map. Experiments on real-world datasets of robot arm trajectories further demonstrate the effectiveness of our method on applications in domain adaptation.
Submitted 28 August, 2023; v1 submitted 7 February, 2021;
originally announced February 2021.
-
Towards Generalized Implementation of Wasserstein Distance in GANs
Authors:
Minkai Xu,
Zhiming Zhou,
Guansong Lu,
Jian Tang,
Weinan Zhang,
Yong Yu
Abstract:
Wasserstein GANs (WGANs), built upon the Kantorovich-Rubinstein (KR) duality of Wasserstein distance, is one of the most theoretically sound GAN models. However, in practice it does not always outperform other variants of GANs. This is mostly due to the imperfect implementation of the Lipschitz condition required by the KR duality. Extensive work has been done in the community with different implementations of the Lipschitz constraint, which, however, is still hard to satisfy the restriction perfectly in practice. In this paper, we argue that the strong Lipschitz constraint might be unnecessary for optimization. Instead, we take a step back and try to relax the Lipschitz constraint. Theoretically, we first demonstrate a more general dual form of the Wasserstein distance called the Sobolev duality, which relaxes the Lipschitz constraint but still maintains the favorable gradient property of the Wasserstein distance. Moreover, we show that the KR duality is actually a special case of the Sobolev duality. Based on the relaxed duality, we further propose a generalized WGAN training scheme named Sobolev Wasserstein GAN (SWGAN), and empirically demonstrate the improvement of SWGAN over existing methods with extensive experiments.
Submitted 12 January, 2021; v1 submitted 6 December, 2020;
originally announced December 2020.
-
How to Measure Your App: A Couple of Pitfalls and Remedies in Measuring App Performance in Online Controlled Experiments
Authors:
Yuxiang Xie,
Meng Xu,
Evan Chow,
Xiaolin Shi
Abstract:
Effectively measuring, understanding, and improving mobile app performance is of paramount importance for mobile app developers. Across the mobile Internet landscape, companies run online controlled experiments (A/B tests) with thousands of performance metrics in order to understand how app performance causally impacts user retention and to guard against service or app regressions that degrade user experiences. To capture certain characteristics particular to performance metrics, such as enormous observation volume and high skewness in distribution, an industry-standard practice is to construct a performance metric as a quantile over all performance events in control or treatment buckets in A/B tests. In our experience with thousands of A/B tests provided by Snap, we have discovered some pitfalls in this industry-standard way of calculating performance metrics that can lead to unexplained movements in performance metrics and unexpected misalignment with user engagement metrics. In this paper, we discuss two major pitfalls in this industry-standard practice of measuring performance for mobile apps. One arises from strong heterogeneity in both mobile devices and user engagement, and the other arises from self-selection bias caused by post-treatment user engagement changes. To remedy these two pitfalls, we introduce several scalable methods including user-level performance metric calculation and imputation and matching for missing metric values. We have extensively evaluated these methods on both simulation data and real A/B tests, and have deployed them into Snap's in-house experimentation platform.
Submitted 29 November, 2020;
originally announced November 2020.
-
Pointwise Binary Classification with Pairwise Confidence Comparisons
Authors:
Lei Feng,
Senlin Shu,
Nan Lu,
Bo Han,
Miao Xu,
Gang Niu,
Bo An,
Masashi Sugiyama
Abstract:
To alleviate the data requirement for training effective binary classifiers in binary classification, many weakly supervised learning settings have been proposed. Among them, some consider using pairwise but not pointwise labels, when pointwise labels are not accessible due to privacy, confidentiality, or security reasons. However, as a pairwise label denotes whether or not two data points share a pointwise label, it cannot be easily collected if either point is equally likely to be positive or negative. Thus, in this paper, we propose a novel setting called pairwise comparison (Pcomp) classification, where we have only pairs of unlabeled data that we know one is more likely to be positive than the other. Firstly, we give a Pcomp data generation process, derive an unbiased risk estimator (URE) with theoretical guarantee, and further improve URE using correction functions. Secondly, we link Pcomp classification to noisy-label learning to develop a progressive URE and improve it by imposing consistency regularization. Finally, we demonstrate by experiments the effectiveness of our methods, which suggests Pcomp is a valuable and practically useful type of pairwise supervision besides the pairwise label.
Submitted 13 January, 2022; v1 submitted 5 October, 2020;
originally announced October 2020.
-
Bigeminal Priors Variational auto-encoder
Authors:
Xuming Ran,
Mingkun Xu,
Qi Xu,
Huihui Zhou,
Quanying Liu
Abstract:
Variational auto-encoders (VAEs) are an influential and generally-used class of likelihood-based generative models in unsupervised learning. The likelihood-based generative models have been reported to be highly robust to the out-of-distribution (OOD) inputs and can be a detector by assuming that the model assigns higher likelihoods to the samples from the in-distribution (ID) dataset than an OOD dataset. However, recent works reported a phenomenon that VAE recognizes some OOD samples as ID by assigning a higher likelihood to the OOD inputs compared to the one from ID. In this work, we introduce a new model, namely Bigeminal Priors Variational auto-encoder (BPVAE), to address this phenomenon. The BPVAE aims to enhance the robustness of the VAEs by combining the power of VAE with two independent priors that belong, respectively, to the training dataset and to a simple dataset whose complexity is lower than that of the training dataset. BPVAE learns the two datasets' features, assigning a higher likelihood for the training dataset than the simple dataset. In this way, we can use BPVAE's density estimate for detecting the OOD samples. Quantitative experimental results suggest that our model has better generalization capability and stronger robustness than the standard VAEs, proving the effectiveness of the proposed approach of hybrid learning by collaborative priors. Overall, this work paves a new avenue to potentially overcome the OOD problem via multiple latent priors modeling.
Submitted 5 October, 2020;
originally announced October 2020.
-
Regret Bounds and Reinforcement Learning Exploration of EXP-based Algorithms
Authors:
Mengfan Xu,
Diego Klabjan
Abstract:
We study the challenging exploration incentive problem in both bandit and reinforcement learning, where the rewards are scale-free and potentially unbounded, driven by real-world scenarios and differing from existing work. Past works in reinforcement learning either assume costly interactions with an environment or propose algorithms finding potentially low quality local maxima. Motivated by EXP-type methods that integrate multiple agents (experts) for exploration in bandits with the assumption that rewards are bounded, we propose new algorithms, namely EXP4.P and EXP4-RL for exploration in the unbounded reward case, and demonstrate their effectiveness in these new settings. Unbounded rewards introduce challenges as the regret cannot be limited by the number of trials, and selecting suboptimal arms may lead to infinite regret. Specifically, we establish EXP4.P's regret upper bounds in both bounded and unbounded linear and stochastic contextual bandits. Surprisingly, we also find that by including one sufficiently competent expert, EXP4.P can achieve global optimality in the linear case. This unbounded reward result is also applicable to a revised version of EXP3.P in the Multi-armed Bandit scenario. In EXP4-RL, we extend EXP4.P from bandit scenarios to reinforcement learning to incentivize exploration by multiple agents, including one high-performing agent, for both efficiency and excellence. This algorithm has been tested on difficult-to-explore games and shows significant improvements in exploration compared to state-of-the-art.
Submitted 3 May, 2024; v1 submitted 20 September, 2020;
originally announced September 2020.
-
Experimental Analysis of Legendre Decomposition in Machine Learning
Authors:
Jianye Pang,
Kai Yi,
Wanguang Yin,
Min Xu
Abstract:
In this technical report, we analyze Legendre decomposition for non-negative tensor in theory and application. In theory, the properties of dual parameters and dually flat manifold in Legendre decomposition are reviewed, and the process of tensor projection and parameter updating is analyzed. In application, a series of verification experiments and clustering experiments with parameters on submanifold were carried out, hoping to find an effective lower dimensional representation of the input tensor. The experimental results show that the parameters on submanifold have no ability to be directly used as low-rank representations. Combined with analysis, we connect Legendre decomposition with neural networks and low-rank representation applications, and put forward some promising prospects.
Submitted 21 September, 2020; v1 submitted 12 August, 2020;
originally announced August 2020.
-
A Survey on Concept Factorization: From Shallow to Deep Representation Learning
Authors:
Zhao Zhang,
Yan Zhang,
Mingliang Xu,
Li Zhang,
Yi Yang,
Shuicheng Yan
Abstract:
The quality of learned features by representation learning determines the performance of learning algorithms and the related application tasks (such as high-dimensional data clustering). As a relatively new paradigm for representation learning, Concept Factorization (CF) has attracted a great deal of interest in the areas of machine learning and data mining for over a decade. Many effective CF-based methods have been proposed from different perspectives and properties, but it remains difficult to grasp the essential connections and figure out the underlying explanatory factors from existing studies. In this paper, we therefore survey the recent advances on CF methodologies and the potential benchmarks by categorizing and summarizing the current methods. Specifically, we first review the root CF method, and then explore the advancement of CF-based representation learning ranging from shallow to deep/multilayer cases. We also introduce the potential application areas of CF-based methods. Finally, we point out some future directions for studying the CF-based representation learning. Overall, this survey provides an insightful overview of both theoretical basis and current developments in the field of CF, which can also help the interested researchers to understand the current trends of CF and find the most appropriate CF techniques to deal with particular applications.
Submitted 31 January, 2021; v1 submitted 31 July, 2020;
originally announced July 2020.
-
Provably Consistent Partial-Label Learning
Authors:
Lei Feng,
Jiaqi Lv,
Bo Han,
Miao Xu,
Gang Niu,
Xin Geng,
Bo An,
Masashi Sugiyama
Abstract:
Partial-label learning (PLL) is a multi-class classification problem, where each training example is associated with a set of candidate labels. Even though many practical PLL methods have been proposed in the last two decades, a theoretical understanding of the consistency of those methods is lacking: none of the PLL methods hitherto possesses a generation process of candidate label sets, so it is still unclear why such a method works on a specific dataset and when it may fail given a different dataset. In this paper, we propose the first generation model of candidate label sets, and develop two novel PLL methods that are guaranteed to be provably consistent, i.e., one is risk-consistent and the other is classifier-consistent. Our methods are advantageous, since they are compatible with any deep network or stochastic optimizer. Furthermore, thanks to the generation model, we would be able to answer the two questions above by testing if the generation model matches given candidate label sets. Experiments on benchmark and real-world datasets validate the effectiveness of the proposed generation model and two PLL methods.
Submitted 23 October, 2020; v1 submitted 17 July, 2020;
originally announced July 2020.
-
Detecting Out-of-distribution Samples via Variational Auto-encoder with Reliable Uncertainty Estimation
Authors:
Xuming Ran,
Mingkun Xu,
Lingrui Mei,
Qi Xu,
Quanying Liu
Abstract:
Variational autoencoders (VAEs) are influential generative models with rich representation capabilities from the deep neural network architecture and Bayesian method. However, VAE models have the weakness that they assign a higher likelihood to out-of-distribution (OOD) inputs than to in-distribution (ID) inputs. To address this problem, a reliable uncertainty estimation is considered to be critical for in-depth understanding of OOD inputs. In this study, we propose an improved noise contrastive prior (INCP) that can be integrated into the encoder of VAEs, called INCPVAE. INCP is scalable, trainable and compatible with VAEs, and it also adopts the merits from the INCP for uncertainty estimation. Experiments on various datasets demonstrate that compared to the standard VAEs, our model is superior in uncertainty estimation for the OOD data and is robust in anomaly detection tasks. The INCPVAE model obtains reliable uncertainty estimation for OOD inputs and solves the OOD problem in VAE models.
Submitted 1 November, 2021; v1 submitted 16 July, 2020;
originally announced July 2020.
-
Pricing cyber insurance for a large-scale network
Authors:
Lei Hua,
Maochao Xu
Abstract:
Facing the lack of cyber insurance loss data, we propose an innovative approach for pricing cyber insurance for a large-scale network based on synthetic data. The synthetic data is generated by the proposed risk spreading and recovering algorithm that allows infection and recovery events to occur sequentially, and allows dependence of random waiting time to infection for different nodes. The scale-free network framework is adopted to account for the topology uncertainty of the random large-scale network. Extensive simulation studies are conducted to understand the risk spreading and recovering mechanism, and to uncover the most important underwriting risk factors. A case study is also presented to demonstrate that the proposed approach and algorithm can be adapted accordingly to provide reference for cyber insurance pricing.
Submitted 29 June, 2020;
originally announced July 2020.
-
Neural Datalog Through Time: Informed Temporal Modeling via Logical Specification
Authors:
Hongyuan Mei,
Guanghui Qin,
Minjie Xu,
Jason Eisner
Abstract:
Learning how to predict future events from patterns of past events is difficult when the set of possible event types is large. Training an unrestricted neural model might overfit to spurious patterns. To exploit domain-specific knowledge of how past events might affect an event's present probability, we propose using a temporal deductive database to track structured facts over time. Rules serve to prove facts from other facts and from past events. Each fact has a time-varying state, namely a vector computed by a neural net whose topology is determined by the fact's provenance, including its experience of past events. The possible event types at any time are given by special facts, whose probabilities are neurally modeled alongside their states. In both synthetic and real-world domains, we show that neural probabilistic models derived from concise Datalog programs improve prediction by encoding appropriate domain knowledge in their architecture.
Submitted 16 August, 2020; v1 submitted 30 June, 2020;
originally announced June 2020.
-
Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising
Authors:
Xiaotian Hao,
Zhaoqing Peng,
Yi Ma,
Guan Wang,
Junqi Jin,
Jianye Hao,
Shan Chen,
Rongquan Bai,
Mingzhou Xie,
Miao Xu,
Zhenzhe Zheng,
Chuan Yu,
Han Li,
Jian Xu,
Kun Gai
Abstract:
In E-commerce, advertising is essential for merchants to reach their target users. The typical objective is to maximize the advertiser's cumulative revenue over a period of time under a budget constraint. In real applications, an advertisement (ad) usually needs to be exposed to the same user multiple times until the user finally contributes revenue (e.g., places an order). However, existing advertising systems mainly focus on the immediate revenue with single ad exposures, ignoring the contribution of each exposure to the final conversion, and thus usually fall into suboptimal solutions. In this paper, we formulate the sequential advertising strategy optimization as a dynamic knapsack problem. We propose a theoretically guaranteed bilevel optimization framework, which significantly reduces the solution space of the original optimization space while ensuring the solution quality. To improve the exploration efficiency of reinforcement learning, we also devise an effective action space reduction approach. Extensive offline and online experiments show the superior performance of our approaches over state-of-the-art baselines in terms of cumulative revenue.
Submitted 29 June, 2020;
originally announced June 2020.
-
Task-Agnostic Online Reinforcement Learning with an Infinite Mixture of Gaussian Processes
Authors:
Mengdi Xu,
Wenhao Ding,
Jiacheng Zhu,
Zuxin Liu,
Baiming Chen,
Ding Zhao
Abstract:
Continuously learning to solve unseen tasks with limited experience has been extensively pursued in meta-learning and continual learning, but with restricted assumptions such as accessible task distributions, independently and identically distributed tasks, and clear task delineations. However, real-world physical tasks frequently violate these assumptions, resulting in performance degradation. This paper proposes a continual online model-based reinforcement learning approach that does not require pre-training to solve task-agnostic problems with unknown task boundaries. We maintain a mixture of experts to handle nonstationarity, and represent each different type of dynamics with a Gaussian Process to efficiently leverage collected data and expressively model uncertainty. We propose a transition prior to account for the temporal dependencies in streaming data and update the mixture online via sequential variational inference. Our approach reliably handles the task distribution shift by generating new models for never-before-seen dynamics and reusing old models for previously seen dynamics. In experiments, our approach outperforms alternative methods in non-stationary tasks, including classic control with changing dynamics and decision making in different driving scenarios.
Submitted 30 November, 2020; v1 submitted 19 June, 2020;
originally announced June 2020.
-
Characterizing Impacts of Heterogeneity in Federated Learning upon Large-Scale Smartphone Data
Authors:
Chengxu Yang,
Qipeng Wang,
Mengwei Xu,
Zhenpeng Chen,
Kaigui Bian,
Yunxin Liu,
Xuanzhe Liu
Abstract:
Federated learning (FL) is an emerging, privacy-preserving machine learning paradigm, drawing tremendous attention in both academia and industry. A unique characteristic of FL is heterogeneity, which resides in the various hardware specifications and dynamic states across the participating devices. Theoretically, heterogeneity can exert a huge influence on the FL training process, e.g., making a device unavailable for training or unable to upload its model updates. Unfortunately, these impacts have never been systematically studied and quantified in existing FL literature.
In this paper, we carry out the first empirical study to characterize the impacts of heterogeneity in FL. We collect large-scale data from 136k smartphones that can faithfully reflect heterogeneity in real-world settings. We also build a heterogeneity-aware FL platform that complies with the standard FL protocol but with heterogeneity in consideration. Based on the data and the platform, we conduct extensive experiments to compare the performance of state-of-the-art FL algorithms under heterogeneity-aware and heterogeneity-unaware settings. Results show that heterogeneity causes non-trivial performance degradation in FL, including up to 9.2% accuracy drop, 2.32x lengthened training time, and undermined fairness. Furthermore, we analyze potential impact factors and find that device failure and participant bias are two potential factors for performance degradation. Our study provides insightful implications for FL practitioners. On the one hand, our findings suggest that FL algorithm designers consider necessary heterogeneity during the evaluation. On the other hand, our findings urge system providers to design specific mechanisms to mitigate the impacts of heterogeneity.
Submitted 12 March, 2021; v1 submitted 12 June, 2020;
originally announced June 2020.
-
Bayesian Inference with the l1-ball Prior: Solving Combinatorial Problems with Exact Zeros
Authors:
Maoran Xu,
Leo L. Duan
Abstract:
The l1-regularization is very popular in high dimensional statistics -- it changes a combinatorial problem of choosing which subset of the parameters is zero into a simple continuous optimization. Using a continuous prior concentrated near zero, the Bayesian counterparts are successful in quantifying the uncertainty in the variable selection problems; nevertheless, the lack of exact zeros makes it difficult for broader problems such as the change-point detection and rank selection. Inspired by the duality of the l1-regularization as a constraint onto an l1-ball, we propose a new prior by projecting a continuous distribution onto the l1-ball. This creates a positive probability on the ball boundary, which contains both continuous elements and exact zeros. Unlike the spike-and-slab prior, this l1-ball projection is continuous and differentiable almost surely, making the posterior estimation amenable to the Hamiltonian Monte Carlo algorithm. We examine the properties, such as the volume change due to the projection, the connection to the combinatorial prior, and the minimax concentration rate in the linear problem. We demonstrate the usefulness of exact zeros that simplify the combinatorial problems, such as the change-point detection in time series, the dimension selection of mixture model and the low-rank-plus-sparse change detection in the medical images.
Submitted 20 February, 2023; v1 submitted 1 June, 2020;
originally announced June 2020.
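The abstract above says that projecting onto the l1-ball yields exact zeros. A minimal sketch of why, assuming the standard sorting-based Euclidean projection (the abstract does not spell out the paper's exact construction, so the function name `project_onto_l1_ball` and its `radius` parameter are illustrative assumptions, not the authors' code):

```python
import math


def project_onto_l1_ball(v, radius=1.0):
    """Euclidean projection of v onto {x : sum(|x_i|) <= radius}.

    Assumption: this is the textbook sorting-based projection, used here
    only to illustrate how exact zeros arise; it is not taken from the paper.
    """
    if radius <= 0:
        raise ValueError("radius must be positive")
    abs_v = [abs(x) for x in v]
    if sum(abs_v) <= radius:
        return list(v)  # already inside the ball: nothing to do

    # Find the soft-threshold level theta from the sorted magnitudes.
    sorted_abs = sorted(abs_v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(sorted_abs, start=1):
        cumulative += u_k
        candidate = (cumulative - radius) / k
        if u_k > candidate:
            theta = candidate  # the last k that satisfies the test wins

    # Soft-thresholding: coordinates with |x| <= theta become exactly 0.0.
    projected = []
    for a, x in zip(abs_v, v):
        shrunk = a - theta
        projected.append(math.copysign(shrunk, x) if shrunk > 0 else 0.0)
    return projected


print(project_onto_l1_ball([3.0, -0.5, 0.2], radius=1.0))  # -> [1.0, 0.0, 0.0]
```

A continuous input vector lands on the ball boundary with two coordinates set to exactly zero, which is the mechanism the abstract credits for placing positive probability on sparse configurations.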
-
Inference on the History of a Randomly Growing Tree
Authors:
Harry Crane,
Min Xu
Abstract:
The spread of infectious disease in a human community or the proliferation of fake news on social media can be modeled as a randomly growing tree-shaped graph. The history of the random growth process is often unobserved but contains important information such as the source of the infection. We consider the problem of statistical inference on aspects of the latent history using only a single snapshot of the final tree. Our approach is to apply random labels to the observed unlabeled tree and analyze the resulting distribution of the growth process, conditional on the final outcome. We show that this conditional distribution is tractable under a shape-exchangeability condition, which we introduce here, and that this condition is satisfied for many popular models for randomly growing trees such as uniform attachment, linear preferential attachment and uniform attachment on a $D$-regular tree. For inference of the root under shape-exchangeability, we propose O(n log n) time algorithms for constructing confidence sets with valid frequentist coverage as well as bounds on the expected size of the confidence sets. We also provide efficient sampling algorithms that extend our methods to a wide class of inference problems.
Submitted 13 January, 2021; v1 submitted 18 May, 2020;
originally announced May 2020.
-
Triaging moderate COVID-19 and other viral pneumonias from routine blood tests
Authors:
Forrest Sheng Bao,
Youbiao He,
Jie Liu,
Yuanfang Chen,
Qian Li,
Christina R. Zhang,
Lei Han,
Baoli Zhu,
Yaorong Ge,
Shi Chen,
Ming Xu,
Liu Ouyang
Abstract:
COVID-19 is sweeping the world with deadly consequences. Its contagious nature and clinical similarity to other pneumonias make separating subjects who have contracted COVID-19 from those with non-COVID-19 viral pneumonia a priority and a challenge. However, COVID-19 testing has been greatly limited by the availability and cost of existing methods, even in developed countries like the US. Intrigued by the wide availability of routine blood tests, we propose to leverage them for COVID-19 testing using the power of machine learning. Two proven-robust machine learning model families, random forests (RFs) and support vector machines (SVMs), are employed to tackle the challenge. Trained on blood data from 208 moderate COVID-19 subjects and 86 subjects with non-COVID-19 moderate viral pneumonia, the best result is obtained in an SVM-based classifier with an accuracy of 84%, a sensitivity of 88%, a specificity of 80%, and a precision of 92%. The results are found explainable from both machine learning and medical perspectives. A privacy-protected web portal is set up to help medical personnel in their practice and the trained models are released for developers to further build other applications. We hope our results can help the world fight this pandemic and welcome clinical verification of our approach on larger populations.
Submitted 13 May, 2020;
originally announced May 2020.
-
A Graph Gaussian Embedding Method for Predicting Alzheimer's Disease Progression with MEG Brain Networks
Authors:
Mengjia Xu,
David Lopez Sanz,
Pilar Garces,
Fernando Maestu,
Quanzheng Li,
Dimitrios Pantazis
Abstract:
Characterizing the subtle changes of functional brain networks associated with the pathological cascade of Alzheimer's disease (AD) is important for early diagnosis and prediction of disease progression prior to clinical symptoms. We developed a new deep learning method, termed multiple graph Gaussian embedding model (MG2G), which can learn highly informative network features by mapping high-dimensional resting-state brain networks into a low-dimensional latent space. These latent distribution-based embeddings enable a quantitative characterization of subtle and heterogeneous brain connectivity patterns at different regions and can be used as input to traditional classifiers for various downstream graph analytic tasks, such as AD early stage prediction, and statistical evaluation of between-group significant alterations across brain regions. We used MG2G to detect the intrinsic latent dimensionality of MEG brain networks, predict the progression of patients with mild cognitive impairment (MCI) to AD, and identify brain regions with network alterations related to MCI.
Submitted 10 November, 2020; v1 submitted 7 May, 2020;
originally announced May 2020.
-
Delay-Aware Multi-Agent Reinforcement Learning for Cooperative and Competitive Environments
Authors:
Baiming Chen,
Mengdi Xu,
Zuxin Liu,
Liang Li,
Ding Zhao
Abstract:
Action and observation delays exist prevalently in the real-world cyber-physical systems which may pose challenges in reinforcement learning design. It is particularly an arduous task when handling multi-agent systems where the delay of one agent could spread to other agents. To resolve this problem, this paper proposes a novel framework to deal with delays as well as the non-stationary training issue of multi-agent tasks with model-free deep reinforcement learning. We formally define the Delay-Aware Markov Game that incorporates the delays of all agents in the environment. To solve Delay-Aware Markov Games, we apply centralized training and decentralized execution that allows agents to use extra information to ease the non-stationarity issue of the multi-agent systems during training, without the need of a centralized controller during execution. Experiments are conducted in multi-agent particle environments including cooperative communication, cooperative navigation, and competitive experiments. We also test the proposed algorithm in traffic scenarios that require coordination of all autonomous vehicles to show the practical value of delay-awareness. Results show that the proposed delay-aware multi-agent reinforcement learning algorithm greatly alleviates the performance degradation introduced by delay. Codes and demo videos are available at: https://github.com/baimingc/delay-aware-MARL.
Submitted 28 August, 2020; v1 submitted 11 May, 2020;
originally announced May 2020.
-
Delay-Aware Model-Based Reinforcement Learning for Continuous Control
Authors:
Baiming Chen,
Mengdi Xu,
Liang Li,
Ding Zhao
Abstract:
Action delays degrade the performance of reinforcement learning in many real-world systems. This paper proposes a formal definition of delay-aware Markov Decision Process and proves it can be transformed into standard MDP with augmented states using the Markov reward process. We develop a delay-aware model-based reinforcement learning framework that can incorporate the multi-step delay into the learned system models without learning effort. Experiments with the Gym and MuJoCo platforms show that the proposed delay-aware model-based algorithm is more efficient in training and transferable between systems with various durations of delay compared with off-policy model-free reinforcement learning methods. Codes available at: https://github.com/baimingc/dambrl.
Submitted 11 May, 2020;
originally announced May 2020.
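The abstract above states that a delay-aware MDP can be turned into a standard MDP by augmenting the state. A minimal sketch of that idea, assuming a fixed action delay and a toy environment; the class names `DelayedActionEnv` and `CounterEnv`, the `default_action` parameter, and the `(observation, reward, done)` step signature are all hypothetical choices made for illustration and are not taken from the paper or its repository:

```python
from collections import deque


class CounterEnv:
    """Toy environment (hypothetical): the state is an integer, actions add to it."""

    def reset(self):
        self.x = 0
        return self.x

    def step(self, action):
        self.x += action
        return self.x, -abs(self.x), False  # observation, reward, done


class DelayedActionEnv:
    """Wraps an environment so each action takes effect `delay` steps later.

    The augmented state is (observation, pending actions); with the pending
    actions included, the next augmented state depends only on the current
    augmented state and the newly chosen action.
    """

    def __init__(self, env, delay, default_action):
        if delay < 0:
            raise ValueError("delay must be non-negative")
        self.env = env
        self.delay = delay
        self.default_action = default_action
        self.pending = deque()

    def reset(self):
        obs = self.env.reset()
        # Before any real action arrives, the buffer holds default actions.
        self.pending = deque([self.default_action] * self.delay)
        return obs, tuple(self.pending)

    def step(self, action):
        self.pending.append(action)        # newest action joins the queue
        executed = self.pending.popleft()  # oldest action is applied now
        obs, reward, done = self.env.step(executed)
        return (obs, tuple(self.pending)), reward, done


env = DelayedActionEnv(CounterEnv(), delay=2, default_action=0)
print(env.reset())   # -> (0, (0, 0))
print(env.step(5))   # -> ((0, (0, 5)), 0, False)
print(env.step(1))   # -> ((0, (5, 1)), 0, False)
print(env.step(0))   # -> ((5, (1, 0)), -5, False)
```

The action 5 chosen at the first step only changes the observation two steps later, while the augmented state exposes it in the pending buffer the whole time.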