An Analysis of Switchback Designs in Reinforcement Learning

Authors: Qianglin Wen, Chengchun Shi, Ying Yang, Niansheng Tang, Hongtu Zhu

Abstract: This paper offers a detailed investigation of switchback designs in A/B testing, which alternate between baseline and new policies over time. Our aim is to thoroughly evaluate the effects of these designs on the accuracy of their resulting average treatment effect (ATE) estimators. We propose a novel "weak signal analysis" framework, which substantially simplifies the calculations of the mean squared errors (MSEs) of these ATE estimators in Markov decision process environments. Our findings suggest that (i) when the majority of reward errors are positively correlated, the switchback design is more efficient than the alternating-day design, which switches policies on a daily basis. Additionally, increasing the frequency of policy switches tends to reduce the MSE of the ATE estimator. (ii) When the errors are uncorrelated, however, all these designs become asymptotically equivalent. (iii) In cases where the majority of errors are negatively correlated, the alternating-day design becomes the optimal choice. These insights are crucial, offering guidelines for practitioners on designing experiments in A/B testing. Our analysis accommodates a variety of policy value estimators, including model-based estimators, least squares temporal difference learning estimators, and double reinforcement learning estimators, thereby offering a comprehensive understanding of optimal design strategies for policy evaluation in reinforcement learning.

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2003.05568 [pdf, other]

Dynamic Tensor Recommender Systems

Authors: Yanqing Zhang, Xuan Bi, Niansheng Tang, Annie Qu

Abstract: Recommender systems have been extensively used by the entertainment industry, business marketing and the biomedical industry. In addition to their capacity to provide preference-based recommendations as an unsupervised learning methodology, they have also proven useful in sales forecasting, product introduction and other production-related businesses. Since some consumers and companies need a recommendation or prediction for future budget, labor and supply chain coordination, dynamic recommender systems for precise forecasting have become extremely necessary. In this article, we propose a new recommendation method, namely the dynamic tensor recommender system (DTRS), which aims particularly at forecasting future recommendations. The proposed method utilizes a tensor-valued function of time to integrate time and contextual information, and creates a time-varying coefficient model for temporal tensor factorization through a polynomial spline approximation. Major advantages of the proposed method include competitive future recommendation predictions and effective prediction interval estimations. In theory, we establish the convergence rate of the proposed tensor factorization and asymptotic normality of the spline coefficient estimator. The proposed method is applied to simulations and IRI marketing data. Numerical studies demonstrate that the proposed method outperforms existing methods in terms of future time forecasting.

Submitted 11 March, 2020; originally announced March 2020.

arXiv:1809.11084 [pdf, other]

Reuse and Adaptation for Entity Resolution through Transfer Learning

Authors: Saravanan Thirumuruganathan, Shameem A Puthiya Parambath, Mourad Ouzzani, Nan Tang, Shafiq Joty

Abstract: Entity resolution (ER) is one of the fundamental problems in data integration, where machine learning (ML) based classifiers often provide the state-of-the-art results. Considerable human effort goes into feature engineering and training data creation. In this paper, we investigate a new problem: Given a dataset D_T for ER with limited or no training data, is it possible to train a good ML classifier on D_T by reusing and adapting the training data of a dataset D_S from the same or a related domain? Our major contributions include (1) a distributed representation based approach to encode each tuple from diverse datasets into a standard feature space; (2) identification of common scenarios where the reuse of training data can be beneficial; and (3) five algorithms for handling each of the aforementioned scenarios. We have performed comprehensive experiments on 12 datasets from 5 different domains (publications, movies, songs, restaurants, and books). Our experiments show that our algorithms provide significant benefits such as providing superior performance for a fixed training data size.

Submitted 28 September, 2018; originally announced September 2018.

arXiv:1612.08246 [pdf, ps, other]

Exponentially tilted likelihood inference on growing dimensional unconditional moment models

Authors: Nian-Sheng Tang, Xiao-Dong Yan, Pu-Ying Zhao

Abstract: Growing-dimensional data with likelihood unavailable are often encountered in various fields. This paper presents a penalized exponentially tilted likelihood (PETL) for variable selection and parameter estimation for growing dimensional unconditional moment models in the presence of correlation among variables and model misspecification. Under some regularity conditions, we investigate the consistency and oracle properties of the PETL estimators of parameters, and show that the constrained PETL ratio statistic for testing contrast hypothesis asymptotically follows the central chi-squared distribution. Theoretical results reveal that the PETL approach is robust to model misspecification. We also study high-order asymptotic properties of the proposed PETL estimators. Simulation studies are conducted to investigate the finite-sample performance of the proposed methodologies. An example from the Boston Housing Study is illustrated.

Submitted 6 January, 2017; v1 submitted 25 December, 2016; originally announced December 2016.

Showing 1–4 of 4 results for author: Tang, N