-
Optimal Dynamic Treatment Regime Estimation in the Presence of Nonadherence
Authors:
Dylan Spicker,
Michael P. Wallace,
Grace Y. Yi
Abstract:
Dynamic treatment regimes (DTRs) are sequences of functions that formalize the process of precision medicine. DTRs take as input patient information and output treatment recommendations. A major focus of the DTR literature has been on the estimation of optimal DTRs, the sequences of decision rules that result in the best outcome in expectation, across the complete population were they to be applied. While there is a rich literature on optimal DTR estimation, to date there has been minimal consideration of the impacts of nonadherence on these estimation procedures. Nonadherence refers to any process through which an individual's prescribed treatment does not match their true treatment. We explore the impacts of nonadherence and demonstrate that generally, when nonadherence is ignored, suboptimal regimes will be estimated. In light of these findings we propose a method for estimating optimal DTRs in the presence of nonadherence. The resulting estimators are consistent and asymptotically normal, with a double robustness property. Using simulations we demonstrate the reliability of these results, and illustrate comparable performance between the proposed estimation procedure adjusting for the impacts of nonadherence and estimators that are computed on data without nonadherence.
Submitted 19 February, 2024;
originally announced February 2024.
-
A General Form of Covariate Adjustment in Randomized Clinical Trials
Authors:
Marlena S. Bannick,
Jun Shao,
Jingyi Liu,
Yu Du,
Yanyao Yi,
Ting Ye
Abstract:
In randomized clinical trials, adjusting for baseline covariates can improve credibility and efficiency for demonstrating and quantifying treatment effects. This article studies the augmented inverse propensity weighted (AIPW) estimator, which is a general form of covariate adjustment that uses linear, generalized linear, and non-parametric or machine learning models for the conditional mean of the response given covariates. Under covariate-adaptive randomization, we establish general theorems that show a complete picture of the asymptotic normality, efficiency gain, and applicability of AIPW estimators. In particular, we provide for the first time a rigorous theoretical justification of using machine learning methods with cross-fitting for dependent data under covariate-adaptive randomization. Based on the general theorems, we offer insights on the conditions for guaranteed efficiency gain and universal applicability under different randomization schemes, which also motivate a joint calibration strategy using some constructed covariates after applying AIPW. Our methods are implemented in the R package RobinCar.
Submitted 25 March, 2024; v1 submitted 16 June, 2023;
originally announced June 2023.
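As a rough illustration of the AIPW form of covariate adjustment named in the abstract above, the following sketch estimates a difference in arm means under simple randomization with a known assignment probability and a linear working model fitted separately in each arm. The function names (`aipw_arm_mean`, `aipw_effect`) and the choice of working model are assumptions made for illustration; this is not the RobinCar implementation and it does not cover covariate-adaptive randomization or variance estimation.

```python
import numpy as np


def aipw_arm_mean(y, a, x, arm, prob):
    """AIPW estimate of the mean response under `arm`.

    y: responses, a: arm labels (0/1), x: baseline covariate(s),
    prob: known probability of being assigned to `arm`.
    """
    in_arm = (a == arm)
    # Linear working model with intercept, fitted only on subjects in this arm.
    design = np.column_stack([np.ones(len(y)), x])
    beta, *_ = np.linalg.lstsq(design[in_arm], y[in_arm], rcond=None)
    mu_hat = design @ beta  # predicted response for every subject
    # Outcome-model prediction plus inverse-probability-weighted residual.
    return float(np.mean(mu_hat + in_arm * (y - mu_hat) / prob))


def aipw_effect(y, a, x, prob_treated=0.5):
    """Difference of AIPW arm means (treated minus control)."""
    treated = aipw_arm_mean(y, a, x, 1, prob_treated)
    control = aipw_arm_mean(y, a, x, 0, 1.0 - prob_treated)
    return treated - control
```

Because the augmentation term is weighted by the known randomization probability, the point estimate stays consistent even if the linear working model is wrong; the working model only affects efficiency.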
-
Barycenter Estimation of Positive Semi-Definite Matrices with Bures-Wasserstein Distance
Authors:
Jingyi Zheng,
Huajun Huang,
Yuyan Yi,
Yuexin Li,
Shu-Chin Lin
Abstract:
Brain-computer interface (BCI) builds a bridge between human brain and external devices by recording brain signals and translating them into commands for devices to perform the user's imagined action. The core of the BCI system is the classifier that labels the input signals as the user's imagined action. The classifiers that directly classify covariance matrices using Riemannian geometry are widely used not only in BCI domain but also in a variety of fields including neuroscience, remote sensing, biomedical imaging, etc. However, the existing Affine-Invariant Riemannian-based methods treat covariance matrices as positive definite while they are indeed positive semi-definite especially for high dimensional data. Besides, the Affine-Invariant Riemannian-based barycenter estimation algorithms become time consuming, not robust, and have convergence issues when the dimension and number of covariance matrices become large. To address these challenges, in this paper, we establish the mathematical foundation for Bures-Wasserstein distance and propose new algorithms to estimate the barycenter of positive semi-definite matrices efficiently and robustly. Both theoretical and computational aspects of Bures-Wasserstein distance and barycenter estimation algorithms are discussed. With extensive simulations, we comprehensively investigate the accuracy, efficiency, and robustness of the barycenter estimation algorithms coupled with Bures-Wasserstein distance. The results show that Bures-Wasserstein based barycenter estimation algorithms are more efficient and robust.
Submitted 24 February, 2023;
originally announced February 2023.
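For reference, the Bures-Wasserstein distance between positive semi-definite matrices mentioned in the abstract above has the standard closed form d(A, B)^2 = tr(A) + tr(B) - 2 tr((A^{1/2} B A^{1/2})^{1/2}). The sketch below evaluates that formula with an eigendecomposition-based square root so that singular (semi-definite) inputs are handled. The helper names are hypothetical, and this is only the distance itself, not the authors' barycenter algorithm.

```python
import numpy as np


def psd_sqrt(m):
    """Symmetric square root of a PSD matrix via eigendecomposition."""
    w, v = np.linalg.eigh((m + m.T) / 2.0)  # symmetrize against round-off
    w = np.clip(w, 0.0, None)               # clip tiny negative eigenvalues
    return (v * np.sqrt(w)) @ v.T


def bures_wasserstein(a, b):
    """Bures-Wasserstein distance between PSD matrices a and b."""
    root_a = psd_sqrt(a)
    cross = psd_sqrt(root_a @ b @ root_a)
    d2 = np.trace(a) + np.trace(b) - 2.0 * np.trace(cross)
    return float(np.sqrt(max(d2, 0.0)))     # guard against -0.0 from round-off
```

Unlike the affine-invariant Riemannian distance, this expression needs no matrix inverse or logarithm, which is why it remains defined when the inputs are rank deficient.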
-
Robust Variance Estimation for Covariate-Adjusted Unconditional Treatment Effect in Randomized Clinical Trials with Binary Outcomes
Authors:
Ting Ye,
Marlena Bannick,
Yanyao Yi,
Jun Shao
Abstract:
To improve precision of estimation and power of testing hypothesis for an unconditional treatment effect in randomized clinical trials with binary outcomes, researchers and regulatory agencies recommend using g-computation as a reliable method of covariate adjustment. However, the practical application of g-computation is hindered by the lack of an explicit robust variance formula that can be used for different unconditional treatment effects of interest. To fill this gap, we provide explicit and robust variance estimators for g-computation estimators and demonstrate through simulations that the variance estimators can be reliably applied in practice.
Submitted 27 March, 2023; v1 submitted 20 February, 2023;
originally announced February 2023.
-
A matching design for augmenting a randomized clinical trial with external control
Authors:
Jianghao Li,
Yu Du,
Huayu Liu,
Yanyao Yi
Abstract:
The use of information from real world to assess the effectiveness of medical products is becoming increasingly popular and more acceptable by regulatory agencies. According to a strategic real-world evidence framework published by U.S. Food and Drug Administration, a hybrid randomized controlled trial that augments internal control arm with real-world data is a pragmatic approach worth more attention. In this paper, we aim to improve on existing matching designs for such a hybrid randomized controlled trial. In particular, we propose to match the entire concurrent randomized clinical trial (RCT) such that (1) the matched external control subjects used to augment the internal control arm are as comparable as possible to the RCT population, (2) every active treatment arm in an RCT with multiple treatments is compared with the same control group, and (3) matching can be conducted and the matched set locked before treatment unblinding to better maintain the data integrity. Besides a weighted estimator, we also introduce a bootstrap method to obtain its variance estimation. The finite sample performance of the proposed method is evaluated by simulations based on data from a real clinical trial.
Submitted 18 March, 2022;
originally announced March 2022.
-
Testing for Treatment Effect Twice Using Internal and External Controls in Clinical Trials
Authors:
Yanyao Yi,
Ying Zhang,
Yu Du,
Ting Ye
Abstract:
Leveraging external controls -- relevant individual patient data under control from external trials or real-world data -- has the potential to reduce the cost of randomized controlled trials (RCTs) while increasing the proportion of trial patients given access to novel treatments. However, due to lack of randomization, RCT patients and external controls may differ with respect to covariates that may or may not have been measured. Hence, after controlling for measured covariates, for instance by matching, testing for treatment effect using external controls may still be subject to unmeasured biases. In this paper, we propose a sensitivity analysis approach to quantify the magnitude of unmeasured bias that would be needed to alter the study conclusion that presumed no unmeasured biases are introduced by employing external controls. Whether leveraging external controls increases power or not depends on the interplay between sample sizes and the magnitude of treatment effect and unmeasured biases, which may be difficult to anticipate. This motivates a combined testing procedure that performs two highly correlated analyses, one with and one without external controls, with a small correction for multiple testing using the joint distribution of the two test statistics. The combined test provides a new method of sensitivity analysis designed for data fusion problems, which anchors at the unbiased analysis based on RCT only and spends a small proportion of the type I error to also test using the external controls. In this way, if leveraging external controls increases power, the power gain compared to the analysis based on RCT only can be substantial; if not, the power loss is small. The proposed method is evaluated in theory and power calculations, and applied to a real trial.
Submitted 12 July, 2022; v1 submitted 8 March, 2022;
originally announced March 2022.
-
Covariate-Adjusted Log-Rank Test: Guaranteed Efficiency Gain and Universal Applicability
Authors:
Ting Ye,
Jun Shao,
Yanyao Yi
Abstract:
Nonparametric covariate adjustment is considered for log-rank type tests of treatment effect with right-censored time-to-event data from clinical trials applying covariate-adaptive randomization. Our proposed covariate-adjusted log-rank test has a simple explicit formula and a guaranteed efficiency gain over the unadjusted test. We also show that our proposed test achieves universal applicability in the sense that the same formula of test can be universally applied to simple randomization and all commonly used covariate-adaptive randomization schemes such as the stratified permuted block and Pocock and Simon's minimization, which is not a property enjoyed by the unadjusted log-rank test. Our method is supported by novel asymptotic theory and empirical results for type I error and power of tests.
Submitted 19 January, 2023; v1 submitted 28 January, 2022;
originally announced January 2022.
-
Text mining and sentiment analysis of COVID-19 tweets
Authors:
Qihuang Zhang,
Grace Y. Yi,
Li-Pang Chen,
Wenqing He
Abstract:
The human severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), causing the COVID-19 disease, has continued to spread all over the world. It menacingly affects not only public health and global economics but also mental health and mood. While the impact of the COVID-19 pandemic has been widely studied, relatively fewer discussions about the sentimental reaction of the population have been available. In this article, we scrape COVID-19 related tweets on the microblogging platform, Twitter, and examine the tweets from Feb 24, 2020 to Oct 14, 2020 in four Canadian cities (Toronto, Montreal, Vancouver, and Calgary) and four U.S. cities (New York, Los Angeles, Chicago, and Seattle). Applying the Vader and NRC approaches, we evaluate the sentiment intensity scores and visualize the information over different periods of the pandemic. Sentiment scores for the tweets concerning three anti-epidemic measures, masks, vaccine, and lockdown, are computed for comparisons. The results of four Canadian cities are compared with four cities in the United States. We study the causal relationships between the infected cases, the tweet activities, and the sentiment scores of COVID-19 related tweets, by integrating the echo state network method with convergent cross-mapping. Our analysis shows that public sentiments regarding COVID-19 vary in different time periods and locations. In general, people have a positive mood about COVID-19 and masks, but negative in the topics of vaccine and lockdown. The causal inference shows that the sentiment influences people's activities on Twitter, which is also correlated to the daily number of infections.
Submitted 26 June, 2021;
originally announced June 2021.
-
Generalizations to Corrections for the Effects of Measurement Error in Approximately Consistent Methodologies
Authors:
Dylan Spicker,
Michael P Wallace,
Grace Y Yi
Abstract:
Measurement error is a pervasive issue which renders the results of an analysis unreliable. The measurement error literature contains numerous correction techniques, which can be broadly divided into those which aim to produce exactly consistent estimators, and those which are only approximately consistent. While consistency is a desirable property, it is typically attained only under specific model assumptions. Two techniques, regression calibration and simulation extrapolation, are used frequently in a wide variety of parametric and semiparametric settings. However, in many settings these methods are only approximately consistent. We generalize these corrections, relaxing assumptions placed on replicate measurements. Under regularity conditions, the estimators are shown to be asymptotically normal, with a sandwich estimator for the asymptotic variance. Through simulation, we demonstrate the improved performance of the modified estimators, over the standard techniques, when these assumptions are violated. We motivate these corrections using the Framingham Heart Study, and apply the generalized techniques to an analysis of these data.
Submitted 5 November, 2021; v1 submitted 14 June, 2021;
originally announced June 2021.
-
A New Perspective on Debiasing Linear Regressions
Authors:
Yufei Yi,
Matey Neykov
Abstract:
In this paper, we propose an abstract procedure for debiasing constrained or regularized potentially high-dimensional linear models. It is elementary to show that the proposed procedure can produce $\frac{1}{\sqrt{n}}$-confidence intervals for individual coordinates (or even bounded contrasts) in models with unknown covariance, provided that the covariance has bounded spectrum. While the proof of the statistical guarantees of our procedure is simple, its implementation requires more care due to the complexity of the optimization programs we need to solve. We spend the bulk of this paper giving examples in which the proposed algorithm can be implemented in practice. One fairly general class of instances which are amenable to applications of our procedure include convex constrained least squares. We are able to translate the procedure to an abstract algorithm over this class of models, and we give concrete examples where efficient polynomial time methods for debiasing exist. Those include the constrained version of the group LASSO, regression under monotone constraints, regression with positive monotone constraints and non-negative least squares. We also demonstrate that our method can debias Minkowski gauge selectors such as the ones proposed by Cai et al. (2016) under a certain condition. This solves an open problem posed by Cai et al. (2016) on how to debias such selectors when the covariance is unknown. In addition, we show that our abstract procedure can be applied to efficiently debias group LASSO, SLOPE and square-root SLOPE, among other popular regularized procedures under certain assumptions. We provide thorough simulation results in support of our theoretical findings.
Submitted 11 January, 2023; v1 submitted 7 April, 2021;
originally announced April 2021.
-
Toward Better Practice of Covariate Adjustment in Analyzing Randomized Clinical Trials
Authors:
Ting Ye,
Jun Shao,
Yanyao Yi,
Qingyuan Zhao
Abstract:
In randomized clinical trials, adjustments for baseline covariates at both design and analysis stages are highly encouraged by regulatory agencies. A recent trend is to use a model-assisted approach for covariate adjustment to gain credibility and efficiency while producing asymptotically valid inference even when the model is incorrect. In this article we present three considerations for better practice when model-assisted inference is applied to adjust for covariates under simple or covariate-adaptive randomized trials: (1) guaranteed efficiency gain: a model-assisted method should often gain but never hurt efficiency; (2) wide applicability: a valid procedure should be applicable, and preferably universally applicable, to all commonly used randomization schemes; (3) robust standard error: variance estimation should be robust to model misspecification and heteroscedasticity. To achieve these, we recommend a model-assisted estimator under an analysis of heterogeneous covariance working model including all covariates utilized in randomization. Our conclusions are based on an asymptotic theory that provides a clear picture of how covariate-adaptive randomization and regression adjustment alter statistical efficiency. Our theory is more general than the existing ones in terms of studying arbitrary functions of response means (including linear contrasts, ratios, and odds ratios), multiple arms, guaranteed efficiency gain, optimality, and universal applicability.
Submitted 13 July, 2021; v1 submitted 24 September, 2020;
originally announced September 2020.
-
Sensitivity Analysis of Error-Contaminated Time Series Data under Autoregressive Models with Application of COVID-19 Data
Authors:
Qihuang Zhang,
Grace Y. Yi
Abstract:
Autoregressive (AR) models are useful tools in time series analysis. Inferences under such models are distorted in the presence of measurement error, which is very common in practice. In this article, we establish analytical results for quantifying the biases of the parameter estimation in AR models if the measurement error effects are neglected. We propose two measurement error models to describe different processes of data contamination. An estimating equation approach is proposed for the estimation of the model parameters with measurement error effects accounted for. We further discuss forecasting using the proposed method. Our work is inspired by COVID-19 data, which are error-contaminated due to multiple reasons including the asymptomatic cases and varying incubation periods. We implement our proposed method by conducting sensitivity analyses and forecasting of the mortality rate of COVID-19 over time for the four most populated provinces in Canada. The results suggest that incorporating or not incorporating measurement error effects yields rather different results for parameter estimation and forecasting.
Submitted 12 August, 2020;
originally announced August 2020.
-
Inference on Average Treatment Effect under Minimization and Other Covariate-Adaptive Randomization Methods
Authors:
Ting Ye,
Yanyao Yi,
Jun Shao
Abstract:
Covariate-adaptive randomization schemes such as the minimization and stratified permuted blocks are often applied in clinical trials to balance treatment assignments across prognostic factors. The existing theoretical developments on inference after covariate-adaptive randomization are mostly limited to situations where a correct model between the response and covariates can be specified or the randomization method has well-understood properties. Based on stratification with covariate levels utilized in randomization and a further adjusting for covariates not used in randomization, in this article we propose several estimators for model free inference on average treatment effect defined as the difference between response means under two treatments. We establish asymptotic normality of the proposed estimators under all popular covariate-adaptive randomization schemes including the minimization whose theoretical property is unclear, and we show that the asymptotic distributions are invariant with respect to covariate-adaptive randomization methods. Consistent variance estimators are constructed for asymptotic inference. Asymptotic relative efficiencies and finite sample properties of estimators are also studied. We recommend using one of our proposed estimators for valid and model free inference after covariate-adaptive randomization.
Submitted 18 July, 2020;
originally announced July 2020.
-
QTRAN++: Improved Value Transformation for Cooperative Multi-Agent Reinforcement Learning
Authors:
Kyunghwan Son,
Sungsoo Ahn,
Roben Delos Reyes,
Jinwoo Shin,
Yung Yi
Abstract:
QTRAN is a multi-agent reinforcement learning (MARL) algorithm capable of learning the largest class of joint-action value functions up to date. However, despite its strong theoretical guarantee, it has shown poor empirical performance in complex environments, such as Starcraft Multi-Agent Challenge (SMAC). In this paper, we identify the performance bottleneck of QTRAN and propose a substantially improved version, coined QTRAN++. Our gains come from (i) stabilizing the training objective of QTRAN, (ii) removing the strict role separation between the action-value estimators of QTRAN, and (iii) introducing a multi-head mixing network for value transformation. Through extensive evaluation, we confirm that our diagnosis is correct, and QTRAN++ successfully bridges the gap between empirical performance and theoretical guarantee. In particular, QTRAN++ newly achieves state-of-the-art performance in the SMAC environment. The code will be released.
Submitted 5 October, 2020; v1 submitted 22 June, 2020;
originally announced June 2020.
-
Non-Sparse PCA in High Dimensions via Cone Projected Power Iteration
Authors:
Yufei Yi,
Matey Neykov
Abstract:
In this paper, we propose a cone projected power iteration algorithm to recover the first principal eigenvector from a noisy positive semidefinite matrix. When the true principal eigenvector is assumed to belong to a convex cone, the proposed algorithm is fast and has a tractable error. Specifically, the method achieves polynomial time complexity for certain convex cones equipped with fast projection such as the monotone cone. It attains a small error when the noisy matrix has a small cone-restricted operator norm. We supplement the above results with a minimax lower bound of the error under the spiked covariance model. Our numerical experiments on simulated and real data, show that our method achieves shorter run time and smaller error in comparison to the ordinary power iteration and some sparse principal component analysis algorithms if the principal eigenvector is in a convex cone.
Submitted 28 February, 2021; v1 submitted 15 May, 2020;
originally announced May 2020.
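The idea of a cone projected power iteration, as described in the abstract above, can be sketched as an ordinary power iteration with a projection onto the cone inserted before each normalization. The version below is a minimal sketch under stated assumptions: it uses the nonnegative orthant as the cone because its projection is a simple clip, whereas the abstract's example of a monotone cone would need an isotonic-regression projection. The function names, stopping rule, and defaults are illustrative choices, not the authors' code.

```python
import numpy as np


def project_nonnegative(x):
    """Euclidean projection onto the nonnegative orthant (an assumed example cone)."""
    return np.clip(x, 0.0, None)


def cone_projected_power_iteration(mat, project, v0, n_iter=200, tol=1e-10):
    """Power iteration where each step is projected onto a convex cone.

    mat: symmetric (noisy) PSD matrix, project: projection onto the cone,
    v0: starting vector whose projection onto the cone is nonzero.
    """
    v = project(v0)
    v = v / np.linalg.norm(v)
    for _ in range(n_iter):
        w = project(mat @ v)          # multiply, then project onto the cone
        norm = np.linalg.norm(w)
        if norm == 0.0:
            break                     # projection collapsed to the origin
        w = w / norm                  # renormalize to the unit sphere
        converged = np.linalg.norm(w - v) < tol
        v = w
        if converged:
            break
    return v
```

A call such as `cone_projected_power_iteration(m, project_nonnegative, np.ones(m.shape[0]))` returns a unit vector in the cone; swapping in a different `project` function changes the cone without touching the loop.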
-
Enlarging Discriminative Power by Adding an Extra Class in Unsupervised Domain Adaptation
Authors:
Hai H. Tran,
Sumyeong Ahn,
Taeyoung Lee,
Yung Yi
Abstract:
In this paper, we study the problem of unsupervised domain adaptation that aims at obtaining a prediction model for the target domain using labeled data from the source domain and unlabeled data from the target domain. There exists an array of recent research based on the idea of extracting features that are not only invariant for both domains but also provide high discriminative power for the target domain. In this paper, we propose an idea of empowering the discriminativeness: Adding a new, artificial class and training the model on the data together with the GAN-generated samples of the new class. The trained model based on the new class samples is capable of extracting the features that are more discriminative by repositioning data of current classes in the target domain and therefore drawing the decision boundaries more effectively. Our idea is highly generic so that it is compatible with many existing methods such as DANN, VADA, and DIRT-T. We conduct various experiments for the standard data commonly used for the evaluation of unsupervised domain adaptations and demonstrate that our algorithm achieves the SOTA performance for many scenarios.
Submitted 19 February, 2020;
originally announced February 2020.
-
Global-Local Metamodel Assisted Two-Stage Optimization via Simulation
Authors:
Wei Xie,
Yuan Yi,
Hua Zheng
Abstract:
To integrate strategic, tactical and operational decisions, the two-stage optimization has been widely used to guide dynamic decision making. In this paper, we study the two-stage stochastic programming for complex systems with unknown response estimated by simulation. We introduce the global-local metamodel assisted two-stage optimization via simulation that can efficiently employ the simulation resource to iteratively solve for the optimal first- and second-stage decisions. Specifically, at each visited first-stage decision, we develop a local metamodel to simultaneously solve a set of scenario-based second-stage optimization problems, which also allows us to estimate the optimality gap. Then, we construct a global metamodel accounting for the errors induced by: (1) using a finite number of scenarios to approximate the expected future cost occurring in the planning horizon, (2) second-stage optimality gap, and (3) finite visited first-stage decisions. Assisted by the global-local metamodel, we propose a new simulation optimization approach that can efficiently and iteratively search for the optimal first- and second-stage decisions. Our framework can guarantee the convergence of optimal solution for the discrete two-stage optimization with unknown objective, and the empirical study indicates that it achieves substantial efficiency and accuracy.
Submitted 13 October, 2019;
originally announced October 2019.
-
Solving Continual Combinatorial Selection via Deep Reinforcement Learning
Authors:
Hyungseok Song,
Hyeryung Jang,
Hai H. Tran,
Se-eun Yoon,
Kyunghwan Son,
Donggyu Yun,
Hyoju Chung,
Yung Yi
Abstract:
We consider the Markov Decision Process (MDP) of selecting a subset of items at each step, termed the Select-MDP (S-MDP). The large state and action spaces of S-MDPs make them intractable to solve with typical reinforcement learning (RL) algorithms, especially when the number of items is huge. In this paper, we present a deep RL algorithm to solve this issue by adopting the following key ideas. First, we convert the original S-MDP into an Iterative Select-MDP (IS-MDP), which is equivalent to the S-MDP in terms of optimal actions. IS-MDP decomposes a joint action of selecting K items simultaneously into K iterative selections resulting in the decrease of actions at the expense of an exponential increase of states. Second, we overcome this state space explosion by exploiting a special symmetry in IS-MDPs with novel weight-shared Q-networks, which provably maintain sufficient expressive power. Various experiments demonstrate that our approach works well even when the item space is large and that it scales to environments with item spaces different from those used in training.
Submitted 9 September, 2019;
originally announced September 2019.
-
QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning
Authors:
Kyunghwan Son,
Daewoo Kim,
Wan Ju Kang,
David Earl Hostallero,
Yung Yi
Abstract:
We explore value-based solutions for multi-agent reinforcement learning (MARL) tasks in the recently popularized centralized training with decentralized execution (CTDE) regime. VDN and QMIX are representative examples that factorize the joint action-value function into individual ones for decentralized execution. However, VDN and QMIX address only a fraction of factorizable MARL tasks due to structural constraints in their factorization, such as additivity and monotonicity. In this paper, we propose a new factorization method for MARL, QTRAN, which is free from such structural constraints and takes a new approach of transforming the original joint action-value function into an easily factorizable one with the same optimal actions. QTRAN guarantees more general factorization than VDN or QMIX, thus covering a much wider class of MARL tasks than previous methods do. Our experiments on multi-domain Gaussian-squeeze and modified predator-prey tasks demonstrate QTRAN's superior performance, with especially large margins in games whose payoffs penalize non-cooperative behavior more aggressively.
Submitted 14 May, 2019;
originally announced May 2019.
-
Big Data Meet Cyber-Physical Systems: A Panoramic Survey
Authors:
Rachad Atat,
Lingjia Liu,
Jinsong Wu,
Guangyu Li,
Chunxuan Ye,
Yang Yi
Abstract:
The world is witnessing an unprecedented growth of cyber-physical systems (CPS), which are foreseen to revolutionize our world by creating new services and applications in a variety of sectors such as environmental monitoring, mobile-health systems, intelligent transportation systems and so on. The information and communication technology (ICT) sector is experiencing a significant growth in data traffic, driven by the widespread usage of smartphones, tablets and video streaming, along with the significant growth of sensor deployments that is anticipated in the near future. It is expected to greatly increase the growth rate of raw sensed data. In this paper, we present the CPS taxonomy by providing a broad overview of data collection, storage, access, processing and analysis. Compared with other survey papers, this is the first panoramic survey on big data for CPS, where our objective is to provide a panoramic summary of different CPS aspects. Furthermore, CPS require cybersecurity to protect them against malicious attacks and unauthorized intrusion, which becomes a challenge with the enormous amount of data that is continuously being generated in the network. Thus, we also provide an overview of the different security solutions proposed for CPS big data storage, access and analytics. We also discuss big data meeting green challenges in the contexts of CPS.
Submitted 29 October, 2018;
originally announced October 2018.
-
Distributive Dynamic Spectrum Access through Deep Reinforcement Learning: A Reservoir Computing Based Approach
Authors:
Hao-Hsuan Chang,
Hao Song,
Yang Yi,
Jianzhong Zhang,
Haibo He,
Lingjia Liu
Abstract:
Dynamic spectrum access (DSA) is regarded as an effective and efficient technology to share radio spectrum among different networks. As a secondary user (SU), a DSA device will face two critical problems: avoiding causing harmful interference to primary users (PUs), and conducting effective interference coordination with other secondary users. These two problems become even more challenging for a distributed DSA network where there are no centralized controllers for SUs. In this paper, we investigate communication strategies of a distributive DSA network in the presence of spectrum sensing errors. To be specific, we apply the powerful machine learning tool, deep reinforcement learning (DRL), for SUs to learn "appropriate" spectrum access strategies in a distributed fashion assuming NO knowledge of the underlying system statistics. Furthermore, a special type of recurrent neural network (RNN), called reservoir computing (RC), is utilized to realize DRL by taking advantage of the underlying temporal correlation of the DSA network. Using the introduced machine learning-based strategy, SUs can make spectrum access decisions in a distributed manner, relying only on their own current and past spectrum sensing outcomes. Through extensive experiments, our results suggest that the RC-based spectrum access strategy can help the SU to significantly reduce the chances of collision with PUs and other SUs. We also show that our scheme outperforms the myopic method, which assumes knowledge of the system statistics, and converges faster than the Q-learning method when the number of channels is large.
Submitted 28 October, 2018;
originally announced October 2018.
-
Learning Data Dependency with Communication Cost
Authors:
Hyeryung Jang,
HyungSeok Song,
Yung Yi
Abstract:
In this paper, we consider the problem of recovering a graph that represents the statistical data dependency among nodes for a set of data samples generated by nodes, which provides the basic structure to perform an inference task, such as MAP (maximum a posteriori). This problem is referred to as structure learning. When nodes are spatially separated in different locations, running an inference algorithm requires a non-negligible amount of message passing, incurring some communication cost. We inevitably face a trade-off between the accuracy of structure learning and the cost we need to pay to perform a given message-passing based inference task, because the learnt edge structures of the data dependency graph and the physical connectivity graph are often highly different. In this paper, we formalize this trade-off in an optimization problem which outputs the data dependency graph that jointly considers learning accuracy and message-passing costs. We focus on a distributed MAP as the target inference task, and consider two different implementations, ASYNC-MAP and SYNC-MAP, that have different message-passing mechanisms and thus different cost structures. For ASYNC-MAP, we propose a polynomial time learning algorithm that is optimal, motivated by the problem of finding a maximum weight spanning tree. For SYNC-MAP, we first prove that the problem is NP-hard and then propose a greedy heuristic. For both implementations, we then quantify how the probability that the resulting data graphs from those learning algorithms differ from the ideal data graph decays as the number of data samples grows, using the large deviation principle, where the decaying rate is characterized by some topological structures of both the original data dependency and physical connectivity graphs as well as the degree of the trade-off. We validate our theoretical findings through extensive simulations, which show a good match with the theory.
Submitted 29 April, 2018;
originally announced April 2018.
-
Iterative Bayesian Learning for Crowdsourced Regression
Authors:
Jungseul Ok,
Sewoong Oh,
Yunhun Jang,
Jinwoo Shin,
Yung Yi
Abstract:
Crowdsourcing platforms have emerged as popular venues for purchasing human intelligence at low cost for large volumes of tasks. As many low-paid workers are prone to give noisy answers, a common practice is to add redundancy by assigning multiple workers to each task and then simply average out these answers. However, to fully harness the wisdom of the crowd, one needs to learn the heterogeneous quality of each worker. We resolve this fundamental challenge in crowdsourced regression tasks, i.e., tasks where the answer takes continuous labels, in which identifying good or bad workers becomes much less trivial than in a classification setting with discrete labels. In particular, we introduce a Bayesian iterative scheme and show that it provably achieves the optimal mean squared error. Our evaluations on synthetic and real-world datasets support our theoretical results and show the superiority of the proposed scheme.
Submitted 8 October, 2018; v1 submitted 28 February, 2017;
originally announced February 2017.
-
Adiabatic Persistent Contrastive Divergence Learning
Authors:
Hyeryung Jang,
Hyungwon Choi,
Yung Yi,
Jinwoo Shin
Abstract:
This paper studies the problem of parameter learning in probabilistic graphical models having latent variables, where the standard approach is the expectation maximization algorithm alternating expectation (E) and maximization (M) steps. However, both E and M steps are computationally intractable for high dimensional data, while substituting a faster surrogate for one step to combat intractability can often cause failure in convergence. We propose a new learning algorithm which is computationally efficient and provably ensures convergence to a correct optimum. Its key idea is to run only a few cycles of Markov Chains (MC) in both E and M steps. Such an idea of running incomplete MC has been well studied only for the M step in the literature, where it is called Contrastive Divergence (CD) learning. While such known CD-based schemes find approximated gradients of the log-likelihood via the mean-field approach in the E step, our proposed algorithm finds exact ones via MC algorithms in both steps, owing to multi-time-scale stochastic approximation theory. Despite its theoretical guarantee of convergence, the proposed scheme might suffer from the slow mixing of MC in the E step. To tackle this, we also propose a hybrid approach applying both mean-field and MC approximation in the E step, where the hybrid approach outperforms the bare mean-field CD scheme in our experiments on real-world datasets.
Submitted 14 February, 2017; v1 submitted 26 May, 2016;
originally announced May 2016.
-
Optimal Inference in Crowdsourced Classification via Belief Propagation
Authors:
Jungseul Ok,
Sewoong Oh,
Jinwoo Shin,
Yung Yi
Abstract:
Crowdsourcing systems are popular for solving large-scale labelling tasks with low-paid workers. We study the problem of recovering the true labels from the possibly erroneous crowdsourced labels under the popular Dawid-Skene model. To address this inference problem, several algorithms have recently been proposed, but the best known guarantee is still significantly larger than the fundamental limit. We close this gap by introducing a tighter lower bound on the fundamental limit and proving that Belief Propagation (BP) exactly matches this lower bound. The guaranteed optimality of BP is the strongest in the sense that it is information-theoretically impossible for any other algorithm to correctly label a larger fraction of the tasks. Experimental results suggest that BP is close to optimal for all regimes considered and improves upon competing state-of-the-art algorithms.
Submitted 11 January, 2017; v1 submitted 11 February, 2016;
originally announced February 2016.