-
An Analysis of Switchback Designs in Reinforcement Learning
Authors:
Qianglin Wen,
Chengchun Shi,
Ying Yang,
Niansheng Tang,
Hongtu Zhu
Abstract:
This paper offers a detailed investigation of switchback designs in A/B testing, which alternate between baseline and new policies over time. Our aim is to thoroughly evaluate the effects of these designs on the accuracy of their resulting average treatment effect (ATE) estimators. We propose a novel "weak signal analysis" framework, which substantially simplifies the calculations of the mean squared errors (MSEs) of these ATE estimators in Markov decision process environments. Our findings suggest that (i) when the majority of reward errors are positively correlated, the switchback design is more efficient than the alternating-day design, which switches policies on a daily basis. Additionally, increasing the frequency of policy switches tends to reduce the MSE of the ATE estimator. (ii) When the errors are uncorrelated, however, all these designs become asymptotically equivalent. (iii) In cases where the majority of errors are negatively correlated, the alternating-day design becomes the optimal choice. These insights are crucial, offering guidelines for practitioners on designing experiments in A/B testing. Our analysis accommodates a variety of policy value estimators, including model-based estimators, least squares temporal difference learning estimators, and double reinforcement learning estimators, thereby offering a comprehensive understanding of optimal design strategies for policy evaluation in reinforcement learning.
Submitted 25 March, 2024;
originally announced March 2024.
-
Dynamic Tensor Recommender Systems
Authors:
Yanqing Zhang,
Xuan Bi,
Niansheng Tang,
Annie Qu
Abstract:
Recommender systems have been extensively used by the entertainment industry, business marketing and the biomedical industry. In addition to their capacity for providing preference-based recommendations as an unsupervised learning methodology, they have also proven useful in sales forecasting, product introduction and other production-related businesses. Since some consumers and companies need a recommendation or prediction for future budget, labor and supply chain coordination, dynamic recommender systems for precise forecasting have become extremely necessary. In this article, we propose a new recommendation method, namely the dynamic tensor recommender system (DTRS), which aims particularly at forecasting future recommendations. The proposed method utilizes a tensor-valued function of time to integrate time and contextual information, and creates a time-varying coefficient model for temporal tensor factorization through a polynomial spline approximation. Major advantages of the proposed method include competitive future recommendation predictions and effective prediction interval estimations. In theory, we establish the convergence rate of the proposed tensor factorization and asymptotic normality of the spline coefficient estimator. The proposed method is applied to simulations and IRI marketing data. Numerical studies demonstrate that the proposed method outperforms existing methods in terms of future time forecasting.
Submitted 11 March, 2020;
originally announced March 2020.
-
Reuse and Adaptation for Entity Resolution through Transfer Learning
Authors:
Saravanan Thirumuruganathan,
Shameem A Puthiya Parambath,
Mourad Ouzzani,
Nan Tang,
Shafiq Joty
Abstract:
Entity resolution (ER) is one of the fundamental problems in data integration, where machine learning (ML) based classifiers often provide state-of-the-art results. Considerable human effort goes into feature engineering and training data creation. In this paper, we investigate a new problem: Given a dataset D_T for ER with limited or no training data, is it possible to train a good ML classifier on D_T by reusing and adapting the training data of a dataset D_S from the same or a related domain? Our major contributions include (1) a distributed representation based approach to encode each tuple from diverse datasets into a standard feature space; (2) identification of common scenarios where the reuse of training data can be beneficial; and (3) five algorithms for handling each of the aforementioned scenarios. We have performed comprehensive experiments on 12 datasets from 5 different domains (publications, movies, songs, restaurants, and books). Our experiments show that our algorithms provide significant benefits such as providing superior performance for a fixed training data size.
Submitted 28 September, 2018;
originally announced September 2018.
-
Exponentially tilted likelihood inference on growing dimensional unconditional moment models
Authors:
Nian-Sheng Tang,
Xiao-Dong Yan,
Pu-Ying Zhao
Abstract:
Growing-dimensional data with likelihood unavailable are often encountered in various fields. This paper presents a penalized exponentially tilted likelihood (PETL) for variable selection and parameter estimation for growing dimensional unconditional moment models in the presence of correlation among variables and model misspecification. Under some regularity conditions, we investigate the consistency and oracle properties of the PETL estimators of parameters, and show that the constrained PETL ratio statistic for testing contrast hypotheses asymptotically follows the central chi-squared distribution. Theoretical results reveal that the PETL approach is robust to model misspecification. We also study high-order asymptotic properties of the proposed PETL estimators. Simulation studies are conducted to investigate the finite performance of the proposed methodologies. An example from the Boston Housing Study is illustrated.
Submitted 6 January, 2017; v1 submitted 25 December, 2016;
originally announced December 2016.