
Practical Policy Optimization with Personalized Experimentation

Authors: Mia Garrard, Hanson Wang, Ben Letham, Shaun Singh, Abbas Kazerouni, Sarah Tan, Zehui Wang, Yin Huang, Yichun Hu, Chad Zhou, Norm Zhou, Eytan Bakshy

Abstract: Many organizations measure treatment effects via an experimentation platform to evaluate the causal effect of product variations prior to full-scale deployment. However, standard experimentation platforms do not perform optimally for end-user populations that exhibit heterogeneous treatment effects (HTEs). Here we present a personalized experimentation framework, Personalized Experiments (PEX), which optimizes treatment group assignment at the user level via HTE modeling and sequential decision policy optimization to optimize multiple short-term and long-term outcomes simultaneously. We describe an end-to-end workflow that has proven to be successful in practice and can be readily implemented using open-source software.

Submitted 30 March, 2023; originally announced March 2023.

Comments: 5 pages, 2 figures

arXiv:2302.14139

Scalable End-to-End ML Platforms: from AutoML to Self-serve

Authors: Igor L. Markov, Pavlos A. Apostolopoulos, Mia R. Garrard, Tanya Qie, Yin Huang, Tanvi Gupta, Anika Li, Cesar Cardoso, George Han, Ryan Maghsoudian, Norm Zhou

Abstract: ML platforms help enable intelligent data-driven applications and maintain them with limited engineering effort. Upon sufficiently broad adoption, such platforms reach economies of scale that bring greater component reuse while improving efficiency of system development and maintenance. For an end-to-end ML platform with broad adoption, scaling relies on pervasive ML automation and system integration to reach the quality we term self-serve that we define with ten requirements and six optional capabilities. With this in mind, we identify long-term goals for platform development, discuss related tradeoffs and future work. Our reasoning is illustrated on two commercially-deployed end-to-end ML platforms that host hundreds of real-time use cases -- one general-purpose and one specialized.

Submitted 3 March, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

Comments: 10 pages, 1 figure, 2 tables

arXiv:2111.03267

Interpretable Personalized Experimentation

Authors: Han Wu, Sarah Tan, Weiwei Li, Mia Garrard, Adam Obeng, Drew Dimmery, Shaun Singh, Hanson Wang, Daniel Jiang, Eytan Bakshy

Abstract: Black-box heterogeneous treatment effect (HTE) models are increasingly being used to create personalized policies that assign individuals to their optimal treatments. However, they are difficult to understand, and can be burdensome to maintain in a production environment. In this paper, we present a scalable, interpretable personalized experimentation system, implemented and deployed in production at Meta. The system works in a multiple treatment, multiple outcome setting typical at Meta to: (1) learn explanations for black-box HTE models; (2) generate interpretable personalized policies. We evaluate the methods used in the system on publicly available data and Meta use cases, and discuss lessons learnt during the development of the system.

Submitted 5 August, 2022; v1 submitted 5 November, 2021; originally announced November 2021.

Comments: Camera-ready version for KDD 2022. Previously titled "Distilling Heterogeneity: From Explanations of Heterogeneous Treatment Effect Models to Interpretable Policies". A short version was presented at MIT CODE 2021

arXiv:2110.07554

Looper: An end-to-end ML platform for product decisions

Authors: Igor L. Markov, Hanson Wang, Nitya Kasturi, Shaun Singh, Sze Wai Yuen, Mia Garrard, Sarah Tran, Yin Huang, Zehui Wang, Igor Glotov, Tanvi Gupta, Boshuang Huang, Peng Chen, Xiaowen Xie, Michael Belkin, Sal Uryasev, Sam Howie, Eytan Bakshy, Norm Zhou

Abstract: Modern software systems and products increasingly rely on machine learning models to make data-driven decisions based on interactions with users, infrastructure and other systems. For broader adoption, this practice must (i) accommodate product engineers without ML backgrounds, (ii) support fine-grained product-metric evaluation and (iii) optimize for product goals. To address shortcomings of prior platforms, we introduce general principles for and the architecture of an ML platform, Looper, with simple APIs for decision-making and feedback collection. Looper covers the end-to-end ML lifecycle from collecting training data and model training to deployment and inference, and extends support to personalization, causal evaluation with heterogeneous treatment effects, and Bayesian tuning for product goals. During the 2021 production deployment, Looper simultaneously hosted 440-1,000 ML models that made 4-6 million real-time decisions per second. We sum up experiences of platform adopters and describe their learning curve.

Submitted 21 June, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

Comments: 11 pages + references, 7 figures; to appear in KDD 2022

Showing 1–4 of 4 results for author: Garrard, M