-
Best of Three Worlds: Adaptive Experimentation for Digital Marketing in Practice
Authors:
Tanner Fiez,
Houssam Nassif,
Yu-Cheng Chen,
Sergio Gamez,
Lalit Jain
Abstract:
Adaptive experimental design (AED) methods are increasingly being used in industry as a tool to boost testing throughput or reduce experimentation cost relative to traditional A/B/N testing methods. However, the behavior and guarantees of such methods are not well-understood beyond idealized stationary settings. This paper shares lessons learned regarding the challenges of naively using AED systems in industrial settings where non-stationarity is prevalent, while also providing perspectives on the proper objectives and system specifications in such settings. We developed an AED framework for counterfactual inference based on these experiences, and tested it in a commercial environment.
Submitted 26 February, 2024; v1 submitted 16 February, 2024;
originally announced February 2024.
-
A Data-Driven State Aggregation Approach for Dynamic Discrete Choice Models
Authors:
Sinong Geng,
Houssam Nassif,
Carlos A. Manzanares
Abstract:
We study dynamic discrete choice models, where a commonly studied problem involves estimating parameters of agent reward functions (also known as "structural" parameters), using agent behavioral data. Maximum likelihood estimation for such models requires dynamic programming, which is limited by the curse of dimensionality. In this work, we present a novel algorithm that provides a data-driven method for selecting and aggregating states, which lowers the computational and sample complexity of estimation. Our method works in two stages. In the first stage, we use a flexible inverse reinforcement learning approach to estimate agent Q-functions. We use these estimated Q-functions, along with a clustering algorithm, to select a subset of states that are the most pivotal for driving changes in Q-functions. In the second stage, with these selected "aggregated" states, we conduct maximum likelihood estimation using a commonly used nested fixed-point algorithm. The proposed two-stage approach mitigates the curse of dimensionality by reducing the problem dimension. Theoretically, we derive finite-sample bounds on the associated estimation error, which also characterize the trade-off of computational complexity, estimation error, and sample complexity. We demonstrate the empirical performance of the algorithm in two classic dynamic discrete choice estimation applications.
Submitted 31 May, 2023; v1 submitted 10 April, 2023;
originally announced April 2023.
-
Neural Insights for Digital Marketing Content Design
Authors:
Fanjie Kong,
Yuan Li,
Houssam Nassif,
Tanner Fiez,
Ricardo Henao,
Shreya Chakrabarti
Abstract:
In digital marketing, experimenting with new website content is one of the key levers to improve customer engagement. However, creating successful marketing content is a manual and time-consuming process that lacks clear guiding principles. This paper seeks to close the loop between content creation and online experimentation by offering marketers AI-driven actionable insights based on historical data to improve their creative process. We present a neural-network-based system that scores and extracts insights from a marketing content design, namely, a multimodal neural network predicts the attractiveness of marketing contents, and a post-hoc attribution method generates actionable insights for marketers to improve their content in specific marketing locations. Our insights not only point out the advantages and drawbacks of a given current content, but also provide design recommendations based on historical data. We show that our scoring model and insights work well both quantitatively and qualitatively.
Submitted 7 June, 2023; v1 submitted 2 February, 2023;
originally announced February 2023.
-
Adaptive Experimental Design and Counterfactual Inference
Authors:
Tanner Fiez,
Sergio Gamez,
Arick Chen,
Houssam Nassif,
Lalit Jain
Abstract:
Adaptive experimental design methods are increasingly being used in industry as a tool to boost testing throughput or reduce experimentation cost relative to traditional A/B/N testing methods. This paper shares lessons learned regarding the challenges and pitfalls of naively using adaptive experimentation systems in industrial settings where non-stationarity is prevalent, while also providing perspectives on the proper objectives and system specifications in these settings. We developed an adaptive experimental design framework for counterfactual inference based on these experiences, and tested it in a commercial environment.
Submitted 25 October, 2022;
originally announced October 2022.
-
Instance-optimal PAC Algorithms for Contextual Bandits
Authors:
Zhaoqi Li,
Lillian Ratliff,
Houssam Nassif,
Kevin Jamieson,
Lalit Jain
Abstract:
In the stochastic contextual bandit setting, regret-minimizing algorithms have been extensively researched, but their instance-minimizing best-arm identification counterparts remain seldom studied. In this work, we focus on the stochastic bandit problem in the $(ε,δ)$-$\textit{PAC}$ setting: given a policy class $Π$ the goal of the learner is to return a policy $π\in Π$ whose expected reward is within $ε$ of the optimal policy with probability greater than $1-δ$. We characterize the first $\textit{instance-dependent}$ PAC sample complexity of contextual bandits through a quantity $ρ_Π$, and provide matching upper and lower bounds in terms of $ρ_Π$ for the agnostic and linear contextual best-arm identification settings. We show that no algorithm can be simultaneously minimax-optimal for regret minimization and instance-dependent PAC for best-arm identification. Our main result is a new instance-optimal and computationally efficient algorithm that relies on a polynomial number of calls to an argmax oracle.
Submitted 5 July, 2022;
originally announced July 2022.
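The $(ε,δ)$-PAC requirement stated in the abstract can be written out as follows. This is only a restatement under assumed notation: $V(π)$ for the expected reward of policy $π$ and $\hat{π}$ for the returned policy are symbols introduced here, not taken from the paper.

$$\Pr\left( V(\hat{π}) \ge \max_{π \in Π} V(π) - ε \right) > 1 - δ, \qquad \hat{π} \in Π.$$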
-
Improved Confidence Bounds for the Linear Logistic Model and Applications to Linear Bandits
Authors:
Kwang-Sung Jun,
Lalit Jain,
Blake Mason,
Houssam Nassif
Abstract:
We propose improved fixed-design confidence bounds for the linear logistic model. Our bounds significantly improve upon the state-of-the-art bound by Li et al. (2017) via recent developments of the self-concordant analysis of the logistic loss (Faury et al., 2020). Specifically, our confidence bound avoids a direct dependence on $1/κ$, where $κ$ is the minimal variance over all arms' reward distributions. In general, $1/κ$ scales exponentially with the norm of the unknown linear parameter $θ^*$. Instead of relying on this worst-case quantity, our confidence bound for the reward of any given arm depends directly on the variance of that arm's reward distribution. We present two applications of our novel bounds to pure exploration and regret minimization logistic bandits improving upon state-of-the-art performance guarantees. For pure exploration, we also provide a lower bound highlighting a dependence on $1/κ$ for a family of instances.
Submitted 18 March, 2021; v1 submitted 23 November, 2020;
originally announced November 2020.
-
Agent-based Simulation Model and Deep Learning Techniques to Evaluate and Predict Transportation Trends around COVID-19
Authors:
Ding Wang,
Fan Zuo,
Jingqin Gao,
Yueshuai He,
Zilin Bian,
Suzana Duran Bernardes,
Chaekuk Na,
Jingxing Wang,
John Petinos,
Kaan Ozbay,
Joseph Y. J. Chow,
Shri Iyer,
Hani Nassif,
Xuegang Jeff Ban
Abstract:
The COVID-19 pandemic has affected travel behaviors and transportation system operations, and cities are grappling with what policies can be effective for a phased reopening shaped by social distancing. This edition of the white paper updates travel trends and highlights an agent-based simulation model's results to predict the impact of proposed phased reopening strategies. It also introduces a real-time video processing method to measure social distancing through cameras on city streets.
Submitted 23 September, 2020;
originally announced October 2020.
-
Toward the "New Normal": A Surge in Speeding, New Volume Patterns, and Recent Trends in Taxis/For-Hire Vehicles
Authors:
Jingqin Gao,
Abhinav Bhattacharyya,
Ding Wang,
Nick Hudanich,
Siva Sooryaa,
Muruga Thambiran,
Suzana Duran Bernardes,
Chaekuk Na,
Fan Zuo,
Zilin Bian,
Kaan Ozbay,
Shri Iyer,
Hani Nassif,
Joseph Y. J. Chow
Abstract:
Six months into the pandemic and one month after the phase four reopening in New York City (NYC), restrictions are lifting, businesses and schools are reopening, but global infections are still rising. This white paper updates travel trends observed in the aftermath of the COVID-19 outbreak in NYC and highlights some findings toward the "new normal."
Submitted 23 September, 2020;
originally announced September 2020.
-
Deep PQR: Solving Inverse Reinforcement Learning using Anchor Actions
Authors:
Sinong Geng,
Houssam Nassif,
Carlos A. Manzanares,
A. Max Reppen,
Ronnie Sircar
Abstract:
We propose a reward function estimation framework for inverse reinforcement learning with deep energy-based policies. We name our method PQR, as it sequentially estimates the Policy, the $Q$-function, and the Reward function by deep learning. PQR does not assume that the reward solely depends on the state; instead, it allows for a dependency on the choice of action. Moreover, PQR allows for stochastic state transitions. To accomplish this, we assume the existence of one anchor action whose reward is known, typically the action of doing nothing, yielding no reward. We present both estimators and algorithms for the PQR method. When the environment transition is known, we prove that the PQR reward estimator uniquely recovers the true reward. With unknown transitions, we bound the estimation error of PQR. Finally, the performance of PQR is demonstrated by synthetic and real-world datasets.
Submitted 14 August, 2020; v1 submitted 14 July, 2020;
originally announced July 2020.
-
Bayesian Meta-Prior Learning Using Empirical Bayes
Authors:
Sareh Nabi,
Houssam Nassif,
Joseph Hong,
Hamed Mamani,
Guido Imbens
Abstract:
Adding domain knowledge to a learning system is known to improve results. In multi-parameter Bayesian frameworks, such knowledge is incorporated as a prior. On the other hand, various model parameters can have different learning rates in real-world problems, especially with skewed data. Two often-faced challenges in Operation Management and Management Science applications are the absence of informative priors, and the inability to control parameter learning rates. In this study, we propose a hierarchical Empirical Bayes approach that addresses both challenges, and that can generalize to any Bayesian framework. Our method learns empirical meta-priors from the data itself and uses them to decouple the learning rates of first-order and second-order features (or any other given feature grouping) in a Generalized Linear Model. As the first-order features are likely to have a more pronounced effect on the outcome, focusing on learning first-order weights first is likely to improve performance and convergence time. Our Empirical Bayes method clamps features in each group together and uses the deployed model's observed data to empirically compute a hierarchical prior in hindsight. We report theoretical results for the unbiasedness, strong consistency, and optimal frequentist cumulative regret properties of our meta-prior variance estimator. We apply our method to a standard supervised learning optimization problem, as well as an online combinatorial optimization problem in a contextual bandit setting implemented in an Amazon production system. Both during simulations and live experiments, our method shows marked improvements, especially in cases of small traffic. Our findings are promising, as optimizing over sparse data is often a challenge.
Submitted 12 July, 2021; v1 submitted 4 February, 2020;
originally announced February 2020.
-
Seeker: Real-Time Interactive Search
Authors:
Ari Biswas,
Thai T Pham,
Michael Vogelsong,
Benjamin Snyder,
Houssam Nassif
Abstract:
This paper introduces Seeker, a system that allows users to interactively refine search rankings in real time, through feedback in the form of likes and dislikes. When searching online, users may not know how to accurately describe their product of choice in words. An alternative approach is to search an embedding space, allowing the user to query using a representation of the item (like a tune for a song, or a picture for an object). However, this approach requires the user to possess an example representation of their desired item. Additionally, most current search systems do not allow the user to dynamically adapt the results with further feedback. On the other hand, users often have a mental picture of the desired item and are able to answer ordinal questions of the form: "Is this item similar to what you have in mind?" With this assumption, our algorithm allows for users to provide sequential feedback on search results to adapt the search feed. We show that our proposed approach works well both qualitatively and quantitatively. Unlike most previous representation-based search systems, we can quantify the quality of our algorithm by evaluating humans-in-the-loop experiments.
Submitted 17 May, 2019;
originally announced May 2019.
-
An Efficient Bandit Algorithm for Realtime Multivariate Optimization
Authors:
Daniel N Hill,
Houssam Nassif,
Yi Liu,
Anand Iyer,
S V N Vishwanathan
Abstract:
Optimization is commonly employed to determine the content of web pages, such as to maximize conversions on landing pages or click-through rates on search engine result pages. Often the layout of these pages can be decoupled into several separate decisions. For example, the composition of a landing page may involve deciding which image to show, which wording to use, what color background to display, etc. Such optimization is a combinatorial problem over an exponentially large decision space. Randomized experiments do not scale well to this setting, and therefore, in practice, one is typically limited to optimizing a single aspect of a web page at a time. This represents a missed opportunity in both the speed of experimentation and the exploitation of possible interactions between layout decisions.
Here we focus on multivariate optimization of interactive web pages. We formulate an approach where the possible interactions between different components of the page are modeled explicitly. We apply bandit methodology to explore the layout space efficiently and use hill-climbing to select optimal content in realtime. Our algorithm also extends to contextualization and personalization of layout selection. Simulation results show the suitability of our approach to large decision spaces with strong interactions between content. We further apply our algorithm to optimize a message that promotes adoption of an Amazon service. After only a single week of online optimization, we saw a 21% conversion increase compared to the median layout. Our technique is currently being deployed to optimize content across several locations at Amazon.com.
Submitted 22 October, 2018;
originally announced October 2018.
-
An Inductive Logic Programming Approach to Validate Hexose Binding Biochemical Knowledge
Authors:
Houssam Nassif,
Hassan Al-Ali,
Sawsan Khuri,
Walid Keirouz,
David Page
Abstract:
Hexoses are simple sugars that play a key role in many cellular pathways, and in the regulation of development and disease mechanisms. Current protein-sugar computational models are based, at least partially, on prior biochemical findings and knowledge. They incorporate different parts of these findings in predictive black-box models. We investigate the empirical support for biochemical findings by comparing Inductive Logic Programming (ILP) induced rules to actual biochemical results. We mine the Protein Data Bank for a representative data set of hexose binding sites, non-hexose binding sites and surface grooves. We build an ILP model of hexose-binding sites and evaluate our results against several baseline machine learning classifiers. Our method achieves an accuracy similar to that of other black-box classifiers while providing insight into the discriminating process. In addition, it confirms wet-lab findings and reveals a previously unreported Trp-Glu amino acids dependency.
Submitted 2 October, 2018;
originally announced October 2018.
-
Contextual Multi-Armed Bandits for Causal Marketing
Authors:
Neela Sawant,
Chitti Babu Namballa,
Narayanan Sadagopan,
Houssam Nassif
Abstract:
This work explores the idea of a causal contextual multi-armed bandit approach to automated marketing, where we estimate and optimize the causal (incremental) effects. Focusing on causal effect leads to better return on investment (ROI) by targeting only the persuadable customers who wouldn't have taken the action organically. Our approach draws on strengths of causal inference, uplift modeling, and multi-armed bandits. It optimizes on causal treatment effects rather than pure outcome, and incorporates counterfactual generation within data collection. Following uplift modeling results, we optimize over the incremental business metric. Multi-armed bandit methods allow us to scale to multiple treatments and to perform off-policy policy evaluation on logged data. The Thompson sampling strategy in particular enables exploration of treatments on similar customer contexts and materialization of counterfactual outcomes. Preliminary offline experiments on a retail Fashion marketing dataset show merits of our proposal.
Submitted 2 October, 2018;
originally announced October 2018.
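For readers unfamiliar with the Thompson sampling strategy mentioned in this abstract, a minimal Beta-Bernoulli sketch is given below. It is a deliberately simplified illustration, not the authors' method: it ignores customer context and causal (uplift) estimation, and the function name, the conversion rates, and the uniform Beta(1, 1) prior are all assumptions made for this example.

```python
import random


def thompson_sampling(true_rates, rounds, seed=0):
    """Beta-Bernoulli Thompson sampling over a fixed set of treatments.

    true_rates are hypothetical conversion probabilities used only to
    simulate rewards; a real system would observe rewards instead.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms
    for _ in range(rounds):
        # Draw one plausible conversion rate per arm from its Beta posterior
        # (Beta(1, 1) prior, an assumption of this sketch).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Play the arm whose sampled rate is highest.
        arm = max(range(n_arms), key=lambda a: samples[a])
        reward = 1 if rng.random() < true_rates[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


pulls = thompson_sampling([0.1, 0.5], rounds=2000)
```

Because each arm is chosen in proportion to the posterior probability that it is best, exploration of weaker treatments tapers off as evidence accumulates, which is the behavior the abstract relies on for collecting counterfactual outcomes.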
-
Diversifying Music Recommendations
Authors:
Houssam Nassif,
Kemal Oral Cansizlar,
Mitchell Goodman,
SVN Vishwanathan
Abstract:
We compare submodular and Jaccard methods to diversify Amazon Music recommendations. Submodularity significantly improves recommendation quality and user engagement. Unlike the Jaccard method, our submodular approach incorporates item relevance score within its optimization function, and produces a relevant and uniformly diverse set.
Submitted 2 October, 2018;
originally announced October 2018.
-
Adaptive, Personalized Diversity for Visual Discovery
Authors:
Choon Hui Teo,
Houssam Nassif,
Daniel Hill,
Sriram Srinavasan,
Mitchell Goodman,
Vijai Mohan,
SVN Vishwanathan
Abstract:
Search queries are appropriate when users have explicit intent, but they perform poorly when the intent is difficult to express or if the user is simply looking to be inspired. Visual browsing systems allow e-commerce platforms to address these scenarios while offering the user an engaging shopping experience. Here we explore extensions in the direction of adaptive personalization and item diversification within Stream, a new form of visual browsing and discovery by Amazon. Our system presents the user with a diverse set of interesting items while adapting to user interactions. Our solution consists of three components: (1) a Bayesian regression model for scoring the relevance of items while leveraging uncertainty, (2) a submodular diversification framework that re-ranks the top scoring items based on category, and (3) personalized category preferences learned from the user's behavior. When tested on live traffic, our algorithms show a strong lift in click-through-rate and session duration.
Submitted 2 October, 2018;
originally announced October 2018.
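The category-based submodular re-ranking described in this abstract can be illustrated with a small greedy sketch. The objective used here (total relevance plus a square-root bonus on the relevance accumulated per category) is an assumed stand-in chosen because it is monotone submodular; the abstract does not specify the paper's actual objective, and the function name, `weight` parameter, and sample items are hypothetical.

```python
import math


def diversify(items, k, weight=1.0):
    """Greedily pick k items by marginal gain of an assumed objective:
    sum of relevance + weight * sum over categories of sqrt(category relevance).

    items: list of (item_id, category, relevance) tuples, relevance >= 0.
    """
    chosen = []
    category_mass = {}  # relevance already selected per category
    remaining = list(items)
    for _ in range(min(k, len(remaining))):

        def gain(item):
            _, category, relevance = item
            mass = category_mass.get(category, 0.0)
            # The concave sqrt gives diminishing returns within a category.
            bonus = math.sqrt(mass + relevance) - math.sqrt(mass)
            return relevance + weight * bonus

        best = max(remaining, key=gain)
        remaining.remove(best)
        chosen.append(best[0])
        category_mass[best[1]] = category_mass.get(best[1], 0.0) + best[2]
    return chosen


items = [
    ("a", "shoes", 0.9),
    ("b", "shoes", 0.85),
    ("c", "hats", 0.6),
    ("d", "bags", 0.5),
]
diverse = diversify(items, k=3, weight=2.0)    # ['a', 'c', 'd']
by_score = diversify(items, k=3, weight=0.0)   # ['a', 'b', 'c']
```

With `weight=0.0` the procedure reduces to ranking by relevance alone, so the second "shoes" item is kept; with a positive weight its marginal gain shrinks once "shoes" is already represented, and items from other categories move up.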