Search | arXiv e-print repository

Contextual Bandits for Advertising Campaigns: A Diffusion-Model Independent Approach (Extended Version)

Authors: Alexandra Iacob, Bogdan Cautis, Silviu Maniu

Abstract: Motivated by scenarios of information diffusion and advertising in social media, we study an influence maximization problem in which little is assumed to be known about the diffusion network or about the model that determines how information may propagate. In such a highly uncertain environment, one can focus on multi-round diffusion campaigns, with the objective to maximize the number of distinct… ▽ More Motivated by scenarios of information diffusion and advertising in social media, we study an influence maximization problem in which little is assumed to be known about the diffusion network or about the model that determines how information may propagate. In such a highly uncertain environment, one can focus on multi-round diffusion campaigns, with the objective to maximize the number of distinct users that are influenced or activated, starting from a known base of few influential nodes. During a campaign, spread seeds are selected sequentially at consecutive rounds, and feedback is collected in the form of the activated nodes at each round. A round's impact (reward) is then quantified as the number of newly activated nodes. Overall, one must maximize the campaign's total spread, as the sum of rounds' rewards. In this setting, an explore-exploit approach could be used to learn the key underlying diffusion parameters, while running the campaign. We describe and compare two methods of contextual multi-armed bandits, with upper-confidence bounds on the remaining potential of influencers, one using a generalized linear model and the Good-Turing estimator for remaining potential (GLM-GT-UCB), and another one that directly adapts the LinUCB algorithm to our setting (LogNorm-LinUCB). We show that they outperform baseline methods using state-of-the-art ideas, on synthetic and real-world data, while at the same time exhibiting different and complementary behavior, depending on the scenarios in which they are deployed. △ Less

Submitted 13 January, 2022; originally announced January 2022.

Comments: Extended version of conference article in SIAM International Conference on Data Mining (SDM) 2022. 14 pages, 2 figures, 4 tables

arXiv:2112.01132 [pdf, other]

A Practical Dynamic Programming Approach to Datalog Provenance Computation

Authors: Yann Ramusat, Silviu Maniu, Pierre Senellart

Abstract: We establish a translation between a formalism for dynamic programming over hypergraphs and the computation of semiring-based provenance for Datalog programs. The benefit of this translation is a new method for computing provenance for a specific class of semirings. Theoretical and practical optimizations lead to an efficient implementation using \textsc{Soufflé}, a state-of-the-art Datalog interp… ▽ More We establish a translation between a formalism for dynamic programming over hypergraphs and the computation of semiring-based provenance for Datalog programs. The benefit of this translation is a new method for computing provenance for a specific class of semirings. Theoretical and practical optimizations lead to an efficient implementation using \textsc{Soufflé}, a state-of-the-art Datalog interpreter. Experimental results on real-world data suggest this approach to be efficient in practical contexts, even competing with our previous dedicated solutions for computing provenance in annotated graph databases. The cost overhead compared to plain Datalog evaluation is fairly moderate in many cases of interest. △ Less

Submitted 2 December, 2021; originally announced December 2021.

arXiv:2009.10135 [pdf, other]

Bandits Under The Influence (Extended Version)

Authors: Silviu Maniu, Stratis Ioannidis, Bogdan Cautis

Abstract: Recommender systems should adapt to user interests as the latter evolve. A prevalent cause for the evolution of user interests is the influence of their social circle. In general, when the interests are not known, online algorithms that explore the recommendation space while also exploiting observed preferences are preferable. We present online recommendation algorithms rooted in the linear multi-… ▽ More Recommender systems should adapt to user interests as the latter evolve. A prevalent cause for the evolution of user interests is the influence of their social circle. In general, when the interests are not known, online algorithms that explore the recommendation space while also exploiting observed preferences are preferable. We present online recommendation algorithms rooted in the linear multi-armed bandit literature. Our bandit algorithms are tailored precisely to recommendation scenarios where user interests evolve under social influence. In particular, we show that our adaptations of the classic LinREL and Thompson Sampling algorithms maintain the same asymptotic regret bounds as in the non-social case. We validate our approach experimentally using both synthetic and real datasets. △ Less

Submitted 21 September, 2020; originally announced September 2020.

Comments: 27 pages, 4 figures, 6 tables. Extended version of accepted ICDM 2020 conference article

arXiv:1901.06862 [pdf, other]

An Experimental Study of the Treewidth of Real-World Graph Data (Extended Version)

Authors: Silviu Maniu, Pierre Senellart, Suraj Jog

Abstract: Treewidth is a parameter that measures how tree-like a relational instance is, and whether it can reasonably be decomposed into a tree. Many computation tasks are known to be tractable on databases of small treewidth, but computing the treewidth of a given instance is intractable. This article is the first large-scale experimental study of treewidth and tree decompositions of real-world database i… ▽ More Treewidth is a parameter that measures how tree-like a relational instance is, and whether it can reasonably be decomposed into a tree. Many computation tasks are known to be tractable on databases of small treewidth, but computing the treewidth of a given instance is intractable. This article is the first large-scale experimental study of treewidth and tree decompositions of real-world database instances (25 datasets from 8 different domains, with sizes ranging from a few thousand to a few million vertices). The goal is to determine which data, if any, can benefit of the wealth of algorithms for databases of small treewidth. For each dataset, we obtain upper and lower bound estimations of their treewidth, and study the properties of their tree decompositions. We show in particular that, even when treewidth is high, using partial tree decompositions can result in data structures that can assist algorithms. △ Less

Submitted 21 January, 2019; originally announced January 2019.

Comments: Extended version of an article published in the proceedings of ICDT 2019

arXiv:1702.05354 [pdf, other]

Algorithms for Online Influencer Marketing

Authors: Paul Lagrée, Olivier Cappé, Bogdan Cautis, Silviu Maniu

Abstract: Influence maximization is the problem of finding influential users, or nodes, in a graph so as to maximize the spread of information. It has many applications in advertising and marketing on social networks. In this paper, we study a highly generic version of influence maximization, one of optimizing influence campaigns by sequentially selecting "spread seeds" from a set of influencers, a small su… ▽ More Influence maximization is the problem of finding influential users, or nodes, in a graph so as to maximize the spread of information. It has many applications in advertising and marketing on social networks. In this paper, we study a highly generic version of influence maximization, one of optimizing influence campaigns by sequentially selecting "spread seeds" from a set of influencers, a small subset of the node population, under the hypothesis that, in a given campaign, previously activated nodes remain "persistently" active throughout and thus do not yield further rewards. This problem is in particular relevant for an important form of online marketing, known as influencer marketing, in which the marketers target a sub-population of influential people, instead of the entire base of potential buyers. Importantly, we make no assumptions on the underlying diffusion model and we work in a setting where neither a diffusion network nor historical activation data are available. We call this problem online influencer marketing with persistence (in short, OIMP). We first discuss motivating scenarios and present our general approach. We introduce an estimator on the influencers' remaining potential -- the expected number of nodes that can still be reached from a given influencer -- and justify its strength to rapidly estimate the desired value, relying on real data gathered from Twitter. We then describe a novel algorithm, GT-UCB, relying on upper confidence bounds on the remaining potential. We show that our approach leads to high-quality spreads on both simulated and real datasets, even though it makes almost no assumptions on the diffusion medium. Importantly, it is orders of magnitude faster than state-of-the-art influence maximization methods, making it possible to deal with large-scale online scenarios. △ Less

Submitted 23 October, 2017; v1 submitted 17 February, 2017; originally announced February 2017.

Comments: 29 pages

arXiv:1607.05538 [pdf, other]

doi 10.1007/978-3-319-45856-4_22

Challenges for Efficient Query Evaluation on Structured Probabilistic Data

Authors: Antoine Amarilli, Silviu Maniu, Mikaël Monet

Abstract: Query answering over probabilistic data is an important task but is generally intractable. However, a new approach for this problem has recently been proposed, based on structural decompositions of input databases, following, e.g., tree decompositions. This paper presents a vision for a database management system for probabilistic data built following this structural approach. We review our existi… ▽ More Query answering over probabilistic data is an important task but is generally intractable. However, a new approach for this problem has recently been proposed, based on structural decompositions of input databases, following, e.g., tree decompositions. This paper presents a vision for a database management system for probabilistic data built following this structural approach. We review our existing and ongoing work on this topic and highlight many theoretical and practical challenges that remain to be addressed. △ Less

Submitted 19 July, 2016; originally announced July 2016.

Comments: 9 pages, 1 figure, 23 references. Accepted for publication at SUM 2016

arXiv:1506.01188 [pdf, other]

doi 10.1145/2783258.2783271

Online Influence Maximization (Extended Version)

Authors: Siyu Lei, Silviu Maniu, Luyi Mo, Reynold Cheng, Pierre Senellart

Abstract: Social networks are commonly used for marketing purposes. For example, free samples of a product can be given to a few influential social network users (or "seed nodes"), with the hope that they will convince their friends to buy it. One way to formalize marketers' objective is through influence maximization (or IM), whose goal is to find the best seed nodes to activate under a fixed budget, so th… ▽ More Social networks are commonly used for marketing purposes. For example, free samples of a product can be given to a few influential social network users (or "seed nodes"), with the hope that they will convince their friends to buy it. One way to formalize marketers' objective is through influence maximization (or IM), whose goal is to find the best seed nodes to activate under a fixed budget, so that the number of people who get influenced in the end is maximized. Recent solutions to IM rely on the influence probability that a user influences another one. However, this probability information may be unavailable or incomplete. In this paper, we study IM in the absence of complete information on influence probability. We call this problem Online Influence Maximization (OIM) since we learn influence probabilities at the same time we run influence campaigns. To solve OIM, we propose a multiple-trial approach, where (1) some seed nodes are selected based on existing influence information; (2) an influence campaign is started with these seed nodes; and (3) users' feedback is used to update influence information. We adopt the Explore-Exploit strategy, which can select seed nodes using either the current influence probability estimation (exploit), or the confidence bound on the estimation (explore). Any existing IM algorithm can be used in this framework. We also develop an incremental algorithm that can significantly reduce the overhead of handling users' feedback information. Our experiments show that our solution is more effective than traditional IM methods on the partial information. △ Less

Submitted 3 June, 2015; originally announced June 2015.

Comments: 13 pages. To appear in KDD 2015. Extended version

arXiv:1104.1605 [pdf, ps, other]

Efficient Top-K Retrieval in Online Social Tagging Networks

Authors: Silviu Maniu, Bogdan Cautis

Abstract: We consider in this paper top-k query answering in social tagging systems, also known as folksonomies. This problem requires a significant departure from existing, socially agnostic techniques. In a network-aware context, one can (and should) exploit the social links, which can indicate how users relate to the seeker and how much weight their tagging actions should have in the result build-up. We… ▽ More We consider in this paper top-k query answering in social tagging systems, also known as folksonomies. This problem requires a significant departure from existing, socially agnostic techniques. In a network-aware context, one can (and should) exploit the social links, which can indicate how users relate to the seeker and how much weight their tagging actions should have in the result build-up. We propose an algorithm that has the potential to scale to current applications. While the problem has already been considered in previous literature, this was done either under strong simplifying assumptions or under choices that cannot scale to even moderate-size real world applications. We first consider a key aspect of the problem, which is accessing the closest or most relevant users for a given seeker. We describe how this can be done on the fly (without any pre-computations) for several possible choices - arguably the most natural ones - of proximity computation in a user network. Based on this, our top-k algorithm is sound and complete, while addressing the scalability issues of the existing ones. Importantly, our technique is instance optimal in the case when the search relies exclusively on the social weight of tagging actions. To further reduce response times, we then consider directions for efficiency by approximation. Extensive experiments on real world data show that our techniques can drastically improve the response time, without sacrificing precision. △ Less

Submitted 5 October, 2012; v1 submitted 8 April, 2011; originally announced April 2011.

Showing 1–8 of 8 results for author: Maniu, S