-
Truthful Online Scheduling of Cloud Workloads under Uncertainty
Authors:
Moshe Babaioff,
Ronny Lempel,
Brendan Lucier,
Ishai Menache,
Aleksandrs Slivkins,
Sam Chiu-Wai Wong
Abstract:
Cloud computing customers often submit repeating jobs and computation pipelines on \emph{approximately} regular schedules, with arrival and running times that exhibit variance. This pattern, typical of training tasks in machine learning, allows customers to partially predict future job requirements. We develop a model of cloud computing platforms that receive statements of work (SoWs) in an online…
▽ More
Cloud computing customers often submit repeating jobs and computation pipelines on \emph{approximately} regular schedules, with arrival and running times that exhibit variance. This pattern, typical of training tasks in machine learning, allows customers to partially predict future job requirements. We develop a model of cloud computing platforms that receive statements of work (SoWs) in an online fashion. The SoWs describe future jobs whose arrival times and durations are probabilistic, and whose utility to the submitting agents declines with completion time. The arrival and duration distributions, as well as the utility functions, are considered private customer information and are reported by strategic agents to a scheduler that is optimizing for social welfare.
We design pricing, scheduling, and eviction mechanisms that incentivize truthful reporting of SoWs. An important challenge is maintaining incentives despite the possibility of the platform becoming saturated. We introduce a framework to reduce scheduling under uncertainty to a relaxed scheduling problem without uncertainty. Using this framework, we tackle both adversarial and stochastic submissions of statements of work, and obtain logarithmic and constant competitive mechanisms, respectively.
△ Less
Submitted 2 March, 2022;
originally announced March 2022.
-
Productization Challenges of Contextual Multi-Armed Bandits
Authors:
David Abensur,
Ivan Balashov,
Shaked Bar,
Ronny Lempel,
Nurit Moscovici,
Ilan Orlov,
Danny Rosenstein,
Ido Tamir
Abstract:
Contextual Multi-Armed Bandits is a well-known and accepted online optimization algorithm, that is used in many Web experiences to tailor content or presentation to users' traffic. Much has been published on theoretical guarantees (e.g. regret bounds) of proposed algorithmic variants, but relatively little attention has been devoted to the challenges encountered while productizing contextual bandi…
▽ More
Contextual Multi-Armed Bandits is a well-known and accepted online optimization algorithm, that is used in many Web experiences to tailor content or presentation to users' traffic. Much has been published on theoretical guarantees (e.g. regret bounds) of proposed algorithmic variants, but relatively little attention has been devoted to the challenges encountered while productizing contextual bandits schemes in large scale settings. This work enumerates several productization challenges we encountered while leveraging contextual bandits for two concrete use cases at scale. We discuss how to (1) determine the context (engineer the features) that model the bandit arms; (2) sanity check the health of the optimization process; (3) evaluate the process in an offline manner; (4) add potential actions (arms) on the fly to a running process; (5) subject the decision process to constraints; and (6) iteratively improve the online learning algorithm. For each such challenge, we explain the issue, provide our approach, and relate to prior art where applicable.
△ Less
Submitted 10 July, 2019;
originally announced July 2019.
-
Search-Based Serving Architecture of Embeddings-Based Recommendations
Authors:
Sonya Liberman,
Shaked Bar,
Raphael Vannerom,
Danny Rosenstein,
Ronny Lempel
Abstract:
Over the past 10 years, many recommendation techniques have been based on embedding users and items in latent vector spaces, where the inner product of a (user,item) pair of vectors represents the predicted affinity of the user to the item. A wealth of literature has focused on the various modeling approaches that result in embeddings, and has compared their quality metrics, learning complexity, e…
▽ More
Over the past 10 years, many recommendation techniques have been based on embedding users and items in latent vector spaces, where the inner product of a (user,item) pair of vectors represents the predicted affinity of the user to the item. A wealth of literature has focused on the various modeling approaches that result in embeddings, and has compared their quality metrics, learning complexity, etc. However, much less attention has been devoted to the issues surrounding productization of an embeddings-based high throughput, low latency recommender system. In particular, how the system might keep up with the changing embeddings as new models are learnt. This paper describes a reference architecture of a high-throughput, large scale recommendation service which leverages a search engine as its runtime core. We describe how the search index and the query builder adapt to changes in the embeddings, which often happen at a different cadence than index builds. We provide solutions for both id-based and feature-based embeddings, as well as for batch indexing and incremental indexing setups. The described system is at the core of a Web content discovery service that serves tens of billions recommendations per day in response to billions of user requests.
△ Less
Submitted 7 July, 2019;
originally announced July 2019.
-
Budget-Constrained Item Cold-Start Handling in Collaborative Filtering Recommenders via Optimal Design
Authors:
Oren Anava,
Shahar Golan,
Nadav Golbandi,
Zohar Karnin,
Ronny Lempel,
Oleg Rokhlenko,
Oren Somekh
Abstract:
It is well known that collaborative filtering (CF) based recommender systems provide better modeling of users and items associated with considerable rating history. The lack of historical ratings results in the user and the item cold-start problems. The latter is the main focus of this work. Most of the current literature addresses this problem by integrating content-based recommendation technique…
▽ More
It is well known that collaborative filtering (CF) based recommender systems provide better modeling of users and items associated with considerable rating history. The lack of historical ratings results in the user and the item cold-start problems. The latter is the main focus of this work. Most of the current literature addresses this problem by integrating content-based recommendation techniques to model the new item. However, in many cases such content is not available, and the question arises is whether this problem can be mitigated using CF techniques only. We formalize this problem as an optimization problem: given a new item, a pool of available users, and a budget constraint, select which users to assign with the task of rating the new item in order to minimize the prediction error of our model. We show that the objective function is monotone-supermodular, and propose efficient optimal design based algorithms that attain an approximation to its optimum. Our findings are verified by an empirical study using the Netflix dataset, where the proposed algorithms outperform several baselines for the problem at hand.
△ Less
Submitted 20 September, 2016; v1 submitted 10 June, 2014;
originally announced June 2014.
-
Distributed Exploration in Multi-Armed Bandits
Authors:
Eshcar Hillel,
Zohar Karnin,
Tomer Koren,
Ronny Lempel,
Oren Somekh
Abstract:
We study exploration in Multi-Armed Bandits in a setting where $k$ players collaborate in order to identify an $ε$-optimal arm. Our motivation comes from recent employment of bandit algorithms in computationally intensive, large-scale applications. Our results demonstrate a non-trivial tradeoff between the number of arm pulls required by each of the players, and the amount of communication between…
▽ More
We study exploration in Multi-Armed Bandits in a setting where $k$ players collaborate in order to identify an $ε$-optimal arm. Our motivation comes from recent employment of bandit algorithms in computationally intensive, large-scale applications. Our results demonstrate a non-trivial tradeoff between the number of arm pulls required by each of the players, and the amount of communication between them. In particular, our main result shows that by allowing the $k$ players to communicate only once, they are able to learn $\sqrt{k}$ times faster than a single player. That is, distributing learning to $k$ players gives rise to a factor $\sqrt{k}$ parallel speed-up. We complement this result with a lower bound showing this is in general the best possible. On the other extreme, we present an algorithm that achieves the ideal factor $k$ speed-up in learning performance, with communication only logarithmic in $1/ε$.
△ Less
Submitted 4 November, 2013;
originally announced November 2013.
-
OFF-Set: One-pass Factorization of Feature Sets for Online Recommendation in Persistent Cold Start Settings
Authors:
Michal Aharon,
Natalie Aizenberg,
Edward Bortnikov,
Ronny Lempel,
Roi Adadi,
Tomer Benyamini,
Liron Levin,
Ran Roth,
Ohad Serfaty
Abstract:
One of the most challenging recommendation tasks is recommending to a new, previously unseen user. This is known as the 'user cold start' problem. Assuming certain features or attributes of users are known, one approach for handling new users is to initially model them based on their features.
Motivated by an ad targeting application, this paper describes an extreme online recommendation setting…
▽ More
One of the most challenging recommendation tasks is recommending to a new, previously unseen user. This is known as the 'user cold start' problem. Assuming certain features or attributes of users are known, one approach for handling new users is to initially model them based on their features.
Motivated by an ad targeting application, this paper describes an extreme online recommendation setting where the cold start problem is perpetual. Every user is encountered by the system just once, receives a recommendation, and either consumes or ignores it, registering a binary reward.
We introduce One-pass Factorization of Feature Sets, OFF-Set, a novel recommendation algorithm based on Latent Factor analysis, which models users by map** their features to a latent space. Furthermore, OFF-Set is able to model non-linear interactions between pairs of features. OFF-Set is designed for purely online recommendation, performing lightweight updates of its model per each recommendation-reward observation. We evaluate OFF-Set against several state of the art baselines, and demonstrate its superiority on real ad-targeting data.
△ Less
Submitted 8 August, 2013;
originally announced August 2013.
-
Hierarchical Composable Optimization of Web Pages
Authors:
Ronen Barenboim,
Edward Bortnikov,
Nadav Golbandi,
Amit Kagian,
Liran Katzir,
Ronny Lempel,
Hayim Makabee,
Scott Roy,
Oren Somekh
Abstract:
The process of creating modern Web media experiences is challenged by the need to adapt the content and presentation choices to dynamic real-time fluctuations of user interest across multiple audiences. We introduce FAME - a Framework for Agile Media Experiences - which addresses this scalability problem. FAME allows media creators to define abstract page models that are subsequently transformed i…
▽ More
The process of creating modern Web media experiences is challenged by the need to adapt the content and presentation choices to dynamic real-time fluctuations of user interest across multiple audiences. We introduce FAME - a Framework for Agile Media Experiences - which addresses this scalability problem. FAME allows media creators to define abstract page models that are subsequently transformed into real experiences through algorithmic experimentation. FAME's page models are hierarchically composed of simple building blocks, mirroring the structure of most Web pages. They are resolved into concrete page instances by pluggable algorithms which optimize the pages for specific business goals. Our framework allows retrieving dynamic content from multiple sources, defining the experimentation's degrees of freedom, and constraining the algorithmic choices. It offers an effective separation of concerns in the media creation process, enabling multiple stakeholders with profoundly different skills to apply their crafts and perform their duties independently, composing and reusing each other's work in modular ways.
△ Less
Submitted 4 October, 2011;
originally announced October 2011.
-
On the Impact of Random Index-Partitioning on Index Compression
Authors:
M. Feldman,
R. Lempel,
O. Somekh,
K. Vornovitsky
Abstract:
The performance of processing search queries depends heavily on the stored index size. Accordingly, considerable research efforts have been devoted to the development of efficient compression techniques for inverted indexes. Roughly, index compression relies on two factors: the ordering of the indexed documents, which strives to position similar documents in proximity, and the encoding of the inve…
▽ More
The performance of processing search queries depends heavily on the stored index size. Accordingly, considerable research efforts have been devoted to the development of efficient compression techniques for inverted indexes. Roughly, index compression relies on two factors: the ordering of the indexed documents, which strives to position similar documents in proximity, and the encoding of the inverted lists that result from the ordered stream of documents. Large commercial search engines index tens of billions of pages of the ever growing Web. The sheer size of their indexes dictates the distribution of documents among thousands of servers in a scheme called local index-partitioning, such that each server indexes only several millions pages. Due to engineering and runtime performance considerations, random distribution of documents to servers is common. However, random index-partitioning among many servers adversely impacts the resulting index sizes, as it decreases the effectiveness of document ordering schemes. We study the impact of random index-partitioning on document ordering schemes. We show that index-partitioning decreases the aggregated size of the inverted lists logarithmically with the number of servers, when documents within each server are randomly reordered. On the other hand, the aggregated partitioned index size increases logarithmically with the number of servers, when state-of-the-art document ordering schemes, such as lexical URL sorting and clustering with TSP, are applied. Finally, we justify the common practice of randomly distributing documents to servers, as we qualitatively show that despite its ill-effects on the ensuing compression, it decreases key factors in distributed query evaluation time by an order of magnitude as compared with partitioning techniques that compress better.
△ Less
Submitted 28 July, 2011;
originally announced July 2011.