Showing 1–50 of 82 results for author: Perchet, V

Searching in archive cs.
  1. arXiv:2406.15680  [pdf, other]

    cs.GT econ.TH

    Calibrated Forecasting and Persuasion

    Authors: Atulya Jain, Vianney Perchet

    Abstract: How should an expert send forecasts to maximize her utility subject to passing a calibration test? We consider a dynamic game where an expert sends probabilistic forecasts to a decision maker. The decision maker uses a calibration test based on past outcomes to verify the expert's forecasts. We characterize the optimal forecasting strategy by reducing the dynamic game to a static persuasion proble…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: The conference version of this work has been accepted to the Twenty-Fifth ACM Conference on Economics and Computation (EC'24)

  2. arXiv:2406.11316  [pdf, ps, other]

    stat.ML cs.DS cs.GT cs.LG

    Improved Algorithms for Contextual Dynamic Pricing

    Authors: Matilde Tullii, Solenne Gaucher, Nadav Merlis, Vianney Perchet

    Abstract: In contextual dynamic pricing, a seller sequentially prices goods based on contextual information. Buyers will purchase products only if the prices are below their valuations. The goal of the seller is to design a pricing strategy that collects as much revenue as possible. We focus on two different valuation models. The first assumes that valuations linearly depend on the context and are further d…

    Submitted 17 June, 2024; originally announced June 2024.

  3. arXiv:2406.06805  [pdf, other]

    cs.DS

    Lookback Prophet Inequalities

    Authors: Ziyad Benomar, Dorian Baudry, Vianney Perchet

    Abstract: Prophet inequalities are fundamental optimal stopping problems, where a decision-maker sequentially observes items with values sampled independently from known distributions, and must decide at each new observation to either stop and gain the current value or reject it irrevocably and move to the next step. This model is often too pessimistic and does not adequately represent real-world online sel…

    Submitted 10 June, 2024; originally announced June 2024.

  4. arXiv:2405.18183  [pdf, other]

    cs.GT

    Feature-Based Online Bilateral Trade

    Authors: Solenne Gaucher, Martino Bernasconi, Matteo Castiglioni, Andrea Celli, Vianney Perchet

    Abstract: Bilateral trade models the problem of facilitating trades between a seller and a buyer having private valuations for the item being sold. In the online version of the problem, the learner faces a new seller and buyer at each time step, and has to post a price for each of the two parties without any knowledge of their valuations. We consider a scenario where, at each time step, before posting price…

    Submitted 28 May, 2024; originally announced May 2024.

  5. arXiv:2405.09920  [pdf, ps, other]

    cs.DS

    Dynamic online matching with budget refills

    Authors: Maria Cherifa, Clément Calauzènes, Vianney Perchet

    Abstract: Inspired by sequential budgeted allocation problems, we study the online matching problem with budget refills. In this context, we consider an online bipartite graph $G=(U,V,E)$, where the nodes in $V$ are discovered sequentially and nodes in $U$ are known beforehand. Each $u\in U$ is endowed with a budget $b_{u,t}\in \mathbb{N}$ that dynamically evolves over time. Unlike the canonical setting, in…

    Submitted 16 May, 2024; originally announced May 2024.

  6. arXiv:2405.01013  [pdf, other]

    cs.LG cs.AI cs.DS

    Non-clairvoyant Scheduling with Partial Predictions

    Authors: Ziyad Benomar, Vianney Perchet

    Abstract: The non-clairvoyant scheduling problem has gained new interest within learning-augmented algorithms, where the decision-maker is equipped with predictions without any quality guarantees. In practical settings, access to predictions may be reduced to specific instances, due to cost or data limitations. Our investigation focuses on scenarios where predictions for only $B$ job sizes out of $n$ are av…

    Submitted 2 May, 2024; originally announced May 2024.

  7. arXiv:2403.11637  [pdf, other]

    cs.LG stat.ML

    The Value of Reward Lookahead in Reinforcement Learning

    Authors: Nadav Merlis, Dorian Baudry, Vianney Perchet

    Abstract: In reinforcement learning (RL), agents sequentially interact with changing environments while aiming to maximize the obtained rewards. Usually, rewards are observed only after acting, and so the goal is to maximize the expected cumulative reward. Yet, in many practical settings, reward information is observed in advance -- prices are observed before performing transactions; nearby traffic informat…

    Submitted 18 March, 2024; originally announced March 2024.

  8. arXiv:2403.00397  [pdf, other]

    cs.GT

    The Price of Fairness in Bipartite Matching

    Authors: Rémi Castera, Felipe Garrido-Lucero, Mathieu Molina, Simon Mauras, Patrick Loiseau, Vianney Perchet

    Abstract: We investigate notions of group fairness in bipartite matching markets involving agents and jobs, where agents are grouped based on sensitive attributes. Employing a geometric approach, we characterize how many agents can be matched in each group, showing that the set of feasible matchings forms a (discrete) polymatroid. We show how we can define weakly-fair matchings geometrically, for which poly…

    Submitted 1 March, 2024; originally announced March 2024.

  9. arXiv:2402.13079  [pdf, ps, other]

    stat.ML cs.IR cs.IT cs.LG

    Mode Estimation with Partial Feedback

    Authors: Charles Arnal, Vivien Cabannes, Vianney Perchet

    Abstract: The combination of lightly supervised pre-training and online fine-tuning has played a key role in recent AI developments. These new learning pipelines call for new theoretical frameworks. In this paper, we formalize core aspects of weakly supervised and active learning with a simple problem: the estimation of the mode of a distribution using partial feedback. We show how entropy coding allows for…

    Submitted 20 February, 2024; originally announced February 2024.

    MSC Class: 62L05; 62B86; 62D10; 62B10

  10. arXiv:2309.00656  [pdf, other]

    cs.GT cs.LG stat.ML

    Local and adaptive mirror descents in extensive-form games

    Authors: Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko

    Abstract: We study how to learn $ε$-optimal strategies in zero-sum imperfect information games (IIG) with trajectory feedback. In this setting, players update their policies sequentially based on their observations over a fixed number of episodes, denoted by $T$. Existing procedures suffer from high variance due to the use of importance sampling over sequences of actions (Steinberger et al., 2020; McAleer e…

    Submitted 1 September, 2023; originally announced September 2023.

  11. arXiv:2306.13440  [pdf, other]

    cs.LG

    Trading-off price for data quality to achieve fair online allocation

    Authors: Mathieu Molina, Nicolas Gast, Patrick Loiseau, Vianney Perchet

    Abstract: We consider the problem of online allocation subject to a long-term fairness penalty. Contrary to existing works, however, we do not assume that the decision-maker observes the protected attributes -- which is often unrealistic in practice. Instead they can purchase data that help estimate them from sources of different quality; and hence reduce the fairness penalty at some cost. We model this pro…

    Submitted 4 December, 2023; v1 submitted 23 June, 2023; originally announced June 2023.

  12. arXiv:2306.07891  [pdf, other]

    cs.DS

    Online Matching in Geometric Random Graphs

    Authors: Flore Sentenac, Nathan Noiry, Matthieu Lerasle, Laurent Ménard, Vianney Perchet

    Abstract: We investigate online maximum cardinality matching, a central problem in ad allocation. In this problem, users are revealed sequentially, and each new user can be paired with any previously unmatched campaign that it is compatible with. Despite the limited theoretical guarantees, the greedy algorithm, which matches incoming users with any available campaign, exhibits outstanding performance in pra…

    Submitted 5 October, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

  13. arXiv:2306.02071  [pdf, other]

    cs.AI cs.GT stat.CO stat.ML

    DU-Shapley: A Shapley Value Proxy for Efficient Dataset Valuation

    Authors: Felipe Garrido-Lucero, Benjamin Heymann, Maxime Vono, Patrick Loiseau, Vianney Perchet

    Abstract: We consider the dataset valuation problem, that is, the problem of quantifying the incremental gain, to some relevant pre-defined utility of a machine learning task, of aggregating an individual dataset to others. The Shapley value is a natural tool to perform dataset valuation due to its formal axiomatic justification, which can be combined with Monte Carlo integration to overcome the computation…

    Submitted 17 June, 2024; v1 submitted 3 June, 2023; originally announced June 2023.

    Comments: 22 pages

  14. arXiv:2305.19691  [pdf, other]

    cs.LG stat.ML

    Constant or logarithmic regret in asynchronous multiplayer bandits

    Authors: Hugo Richard, Etienne Boursier, Vianney Perchet

    Abstract: Multiplayer bandits have recently been extensively studied because of their application to cognitive radio networks. While the literature mostly considers synchronous players, radio networks (e.g. for IoT) tend to have asynchronous devices. This motivates the harder, asynchronous multiplayer bandits problem, which was first tackled with an explore-then-commit (ETC) algorithm (see Dakdouk, 2022),…

    Submitted 31 May, 2023; originally announced May 2023.

  15. arXiv:2303.09205  [pdf, other]

    cs.GT cs.DS

    Addressing bias in online selection with limited budget of comparisons

    Authors: Ziyad Benomar, Evgenii Chzhen, Nicolas Schreuder, Vianney Perchet

    Abstract: Consider a hiring process with candidates coming from different universities. It is easy to order candidates with the same background, yet it can be challenging to compare them otherwise. The latter case requires additional costly assessments, leading to a potentially high total cost for the hiring organization. Given an assigned budget, what would be an optimal strategy to select the most qualifi…

    Submitted 20 February, 2024; v1 submitted 16 March, 2023; originally announced March 2023.

  16. arXiv:2212.12567  [pdf, other]

    stat.ML cs.LG

    Adapting to game trees in zero-sum imperfect information games

    Authors: Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko

    Abstract: Imperfect information games (IIG) are games in which each player only partially observes the current game state. We study how to learn $ε$-optimal strategies in a zero-sum IIG through self-play with trajectory feedback. We give a problem-independent lower bound $\widetilde{\mathcal{O}}(H(A_{\mathcal{X}}+B_{\mathcal{Y}})/ε^2)$ on the required number of realizations to learn these strategies with hi…

    Submitted 15 February, 2023; v1 submitted 23 December, 2022; originally announced December 2022.

  17. arXiv:2211.16275  [pdf, ps, other]

    stat.ML cs.GT cs.LG

    A survey on multi-player bandits

    Authors: Etienne Boursier, Vianney Perchet

    Abstract: Due mostly to its application to cognitive radio networks, the multiplayer bandits problem has gained a lot of interest in the last decade. Considerable progress has been made on its theoretical aspects. However, the current algorithms are far from applicable and many obstacles remain between these theoretical results and a possible implementation of multiplayer bandits algorithms in real cognitive radio network…

    Submitted 3 June, 2024; v1 submitted 29 November, 2022; originally announced November 2022.

    Comments: final version, accepted at JMLR

  18. arXiv:2210.12882  [pdf, other]

    stat.ML cs.LG math.OC math.ST

    Stochastic Mirror Descent for Large-Scale Sparse Recovery

    Authors: Sasila Ilandarideva, Yannis Bekri, Anatoli Juditsky, Vianney Perchet

    Abstract: In this paper we discuss an application of Stochastic Approximation to statistical estimation of high-dimensional sparse parameters. The proposed solution reduces to resolving a penalized stochastic optimization problem on each stage of a multistage algorithm; each problem being solved to a prescribed accuracy by the non-Euclidean Composite Stochastic Mirror Descent (CSMD) algorithm. Assuming that…

    Submitted 23 October, 2022; originally announced October 2022.

  19. arXiv:2205.15695  [pdf, other]

    cs.LG stat.ML

    On Preemption and Learning in Stochastic Scheduling

    Authors: Nadav Merlis, Hugo Richard, Flore Sentenac, Corentin Odic, Mathieu Molina, Vianney Perchet

    Abstract: We study single-machine scheduling of jobs, each belonging to a job type that determines its duration distribution. We start by analyzing the scenario where the type characteristics are known and then move to two learning scenarios where the types are unknown: non-preemptive problems, where each started job must be completed before moving to another job; and preemptive problems, where job executio…

    Submitted 1 June, 2023; v1 submitted 31 May, 2022; originally announced May 2022.

    Comments: Accepted to ICML 2023

  20. arXiv:2205.13255  [pdf, other]

    cs.LG cs.AI cs.IR stat.ML

    Active Labeling: Streaming Stochastic Gradients

    Authors: Vivien Cabannes, Francis Bach, Vianney Perchet, Alessandro Rudi

    Abstract: The workhorse of machine learning is stochastic gradient descent. To access stochastic gradients, it is common to consider iteratively input/output pairs of a training dataset. Interestingly, it appears that one does not need full supervision to access stochastic gradients, which is the main motivation of this paper. After formalizing the "active labeling" problem, which focuses on active learning…

    Submitted 7 December, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: 38 pages (9 main pages), 9 figures

    MSC Class: 68T37 ACM Class: G.3

  21. arXiv:2202.07318  [pdf, ps, other]

    cs.GT cs.LG

    An algorithmic solution to the Blotto game using multi-marginal couplings

    Authors: Vianney Perchet, Philippe Rigollet, Thibaut Le Gouic

    Abstract: We describe an efficient algorithm to compute solutions for the general two-player Blotto game on $n$ battlefields with heterogeneous values. While explicit constructions for such solutions have been limited to specific, largely symmetric or homogeneous, setups, this algorithmic resolution covers the most general situation to date: a value-asymmetric game with asymmetric budgets. The proposed algorithm…

    Submitted 31 May, 2022; v1 submitted 15 February, 2022; originally announced February 2022.

  22. arXiv:2112.06008  [pdf, ps, other]

    cs.LG

    Privacy Amplification via Shuffling for Linear Contextual Bandits

    Authors: Evrard Garcelon, Kamalika Chaudhuri, Vianney Perchet, Matteo Pirotta

    Abstract: Contextual bandit algorithms are widely used in domains where it is desirable to provide a personalized service by leveraging contextual information that may contain sensitive information that needs to be protected. Inspired by this scenario, we study the contextual linear bandit problem with differential privacy (DP) constraints. While the literature has focused on either centralized (joint DP)…

    Submitted 11 December, 2021; originally announced December 2021.

  23. arXiv:2111.01602  [pdf, other]

    cs.LG

    Stochastic Online Linear Regression: the Forward Algorithm to Replace Ridge

    Authors: Reda Ouhamma, Odalric Maillard, Vianney Perchet

    Abstract: We consider the problem of online linear regression in the stochastic setting. We derive high probability regret bounds for online ridge regression and the forward algorithm. This enables us to compare online regression algorithms more accurately and eliminate assumptions of bounded observations and predictions. Our study advocates for the use of the forward algorithm in lieu of ridge due to its e…

    Submitted 2 November, 2021; originally announced November 2021.

    Comments: 11+12 pages. To be published in the proceedings of NeurIPS 2021

  24. arXiv:2110.09133  [pdf, other]

    cs.LG

    Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits

    Authors: Reda Ouhamma, Rémy Degenne, Pierre Gaillard, Vianney Perchet

    Abstract: In the fixed budget thresholding bandit problem, an algorithm sequentially allocates a budgeted number of samples to different distributions. It then predicts whether the mean of each distribution is larger or lower than a given threshold. We introduce a large family of algorithms (containing most existing relevant ones), inspired by the Frank-Wolfe algorithm, and provide a thorough yet generic an…

    Submitted 18 October, 2021; originally announced October 2021.

    Comments: 10+15 pages. To be published in the proceedings of NeurIPS 2021

  25. arXiv:2108.00230  [pdf, other]

    stat.ML cs.LG

    Pure Exploration and Regret Minimization in Matching Bandits

    Authors: Flore Sentenac, Jialin Yi, Clément Calauzènes, Vianney Perchet, Milan Vojnovic

    Abstract: Finding an optimal matching in a weighted graph is a standard combinatorial problem. We consider its semi-bandit version where either a pair or a full matching is sampled sequentially. We prove that it is possible to leverage a rank-1 assumption on the adjacency matrix to reduce the sample complexity and the regret of off-the-shelf algorithms up to reaching a linear dependency in the number of ver…

    Submitted 31 July, 2021; originally announced August 2021.

    Journal ref: Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 2021

  26. arXiv:2107.00995  [pdf, other]

    cs.DS stat.ML

    Online Matching in Sparse Random Graphs: Non-Asymptotic Performances of Greedy Algorithm

    Authors: Nathan Noiry, Flore Sentenac, Vianney Perchet

    Abstract: Motivated by sequential budgeted allocation problems, we investigate online matching problems where connections between vertices are not i.i.d., but they have fixed degree distributions -- the so-called configuration model. We estimate the competitive ratio of the simplest algorithm, GREEDY, by approximating some relevant stochastic discrete processes by their continuous counterparts, that are sol…

    Submitted 2 July, 2021; originally announced July 2021.

  27. arXiv:2106.06536  [pdf, other]

    cs.LG

    Unsupervised Neural Hidden Markov Models with a Continuous latent state space

    Authors: Firas Jarboui, Vianney Perchet

    Abstract: We introduce a new procedure to neuralize unsupervised Hidden Markov Models in the continuous case. This provides higher flexibility to solve problems with underlying latent variables. This approach is evaluated on both synthetic and real data. On top of generating likely model parameters with comparable performance to off-the-shelf neural architectures (LSTMs, GRUs, ...), the obtained results are e…

    Submitted 10 June, 2021; originally announced June 2021.

  28. arXiv:2106.05068  [pdf, other]

    cs.LG

    Offline Inverse Reinforcement Learning

    Authors: Firas Jarboui, Vianney Perchet

    Abstract: The objective of offline RL is to learn optimal policies when a fixed exploratory demonstrations data-set is available and sampling additional observations is impossible (typically if this operation is either costly or raises ethical questions). In order to solve this problem, off-the-shelf approaches require a properly defined cost function (or its evaluation on the provided data-set), which are s…

    Submitted 9 June, 2021; originally announced June 2021.

  29. arXiv:2106.05061  [pdf, other]

    cs.LG stat.ML

    Quickest change detection with unknown parameters: Constant complexity and near optimality

    Authors: Firas Jarboui, Vianney Perchet

    Abstract: We consider the quickest change detection problem where both the parameters of pre- and post- change distributions are unknown, which prevents the use of classical simple hypothesis testing. Without additional assumptions, optimal solutions are not tractable as they rely on some minimax and robust variant of the objective. As a consequence, change points might be detected too late for practical ap…

    Submitted 9 June, 2021; originally announced June 2021.

  30. arXiv:2106.04228  [pdf, ps, other]

    stat.ML cs.GT cs.LG cs.NI

    Decentralized Learning in Online Queuing Systems

    Authors: Flore Sentenac, Etienne Boursier, Vianney Perchet

    Abstract: Motivated by packet routing in computer networks, online queuing systems are composed of queues receiving packets at different rates. Repeatedly, they send packets to servers, each of them treating only at most one packet at a time. In the centralized case, the number of accumulated packets remains bounded (i.e., the system is stable) as long as the ratio between service rates and arrival…

    Submitted 4 November, 2021; v1 submitted 8 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021 camera ready

  31. arXiv:2105.11812  [pdf, other]

    cs.LG

    A Generalised Inverse Reinforcement Learning Framework

    Authors: Firas Jarboui, Vianney Perchet

    Abstract: The global objective of inverse Reinforcement Learning (IRL) is to estimate the unknown cost function of some MDP based on observed trajectories generated by (approximate) optimal policies. The classical approach consists in tuning this cost function so that associated optimal trajectories (that minimise the cumulative discounted cost, i.e. the classical RL loss) are 'similar' to the observed ones…

    Submitted 25 May, 2021; originally announced May 2021.

  32. arXiv:2103.09927  [pdf, other]

    cs.LG

    Encrypted Linear Contextual Bandit

    Authors: Evrard Garcelon, Vianney Perchet, Matteo Pirotta

    Abstract: Contextual bandit is a general framework for online learning in sequential decision-making problems that has found application in a wide range of domains, including recommendation systems, online advertising, and clinical trials. A critical aspect of bandit methods is that they require observing the contexts (i.e., individual or group-level data) and rewards in order to solve the sequential p…

    Submitted 23 March, 2022; v1 submitted 17 March, 2021; originally announced March 2021.

  33. arXiv:2102.08087  [pdf, other]

    stat.ML cs.LG math.OC stat.OT

    Making the most of your day: online learning for optimal allocation of time

    Authors: Etienne Boursier, Tristan Garrec, Vianney Perchet, Marco Scarsini

    Abstract: We study online learning for optimal allocation when the resource to be allocated is time. Examples of possible applications include job scheduling for a computing server, a driver filling a day with rides, a landlord renting an estate, etc. An agent receives task proposals sequentially according to a Poisson process and can either accept or reject a proposed task. If she accepts the proposal, sh…

    Submitted 4 November, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

    Comments: NeurIPS 2021 camera ready

  34. arXiv:2101.01086  [pdf, other]

    cs.LG

    Be Greedy in Multi-Armed Bandits

    Authors: Matthieu Jedor, Jonathan Louëdec, Vianney Perchet

    Abstract: The Greedy algorithm is the simplest heuristic in sequential decision problems that carelessly takes the locally optimal choice at each round, disregarding any advantages of exploring and/or information gathering. Theoretically, it is known to sometimes have poor performance, for instance even a linear regret (with respect to the time horizon) in the standard multi-armed bandit problem. On the oth…

    Submitted 4 January, 2021; originally announced January 2021.

  35. arXiv:2012.14264  [pdf, other]

    cs.LG stat.ML

    Lifelong Learning in Multi-Armed Bandits

    Authors: Matthieu Jedor, Jonathan Louëdec, Vianney Perchet

    Abstract: Continuously learning and leveraging the knowledge accumulated from prior tasks in order to improve future performance is a long-standing machine learning problem. In this paper, we study the problem in the multi-armed bandit framework with the objective to minimize the total regret incurred over a series of tasks. While most bandit algorithms are designed to have a low worst-case regret, we exami…

    Submitted 28 December, 2020; originally announced December 2020.

  36. arXiv:2011.09365  [pdf, other]

    cs.GT

    Learning in repeated auctions

    Authors: Thomas Nedelec, Clément Calauzènes, Noureddine El Karoui, Vianney Perchet

    Abstract: Online auctions are one of the most fundamental facets of the modern economy and power an industry generating hundreds of billions of dollars a year in revenue. Auction theory has historically focused on the question of designing the best way to sell a single item to potential buyers, with the concurrent objectives of maximizing revenue generated or welfare created. Theoretical results in this are…

    Submitted 22 September, 2021; v1 submitted 18 November, 2020; originally announced November 2020.

  37. arXiv:2011.04298  [pdf, other]

    cs.LG

    Robustness of Community Detection to Random Geometric Perturbations

    Authors: Sandrine Peche, Vianney Perchet

    Abstract: We consider the stochastic block model where connection between vertices is perturbed by some latent (and unobserved) random geometric graph. The objective is to prove that spectral methods are robust to this type of noise, even if they are agnostic to the presence (or not) of the random graph. We provide explicit regimes where the second eigenvector of the adjacency matrix is highly correlated to…

    Submitted 9 November, 2020; originally announced November 2020.

    Comments: NeurIPS-2020

  38. arXiv:2010.07778  [pdf, other]

    cs.LG

    Local Differential Privacy for Regret Minimization in Reinforcement Learning

    Authors: Evrard Garcelon, Vianney Perchet, Ciara Pike-Burke, Matteo Pirotta

    Abstract: Reinforcement learning algorithms are widely used in domains where it is desirable to provide a personalized service. In these domains it is common that user data contains sensitive information that needs to be protected from third parties. Motivated by this, we study privacy in the context of finite-horizon Markov Decision Processes (MDPs) by requiring information to be obfuscated on the user sid…

    Submitted 27 October, 2021; v1 submitted 15 October, 2020; originally announced October 2020.

  39. arXiv:2007.09996  [pdf, ps, other]

    math.OC cs.LG stat.OT

    Social Learning in Non-Stationary Environments

    Authors: Etienne Boursier, Vianney Perchet, Marco Scarsini

    Abstract: Potential buyers of a product or service, before making their decisions, tend to read reviews written by previous consumers. We consider Bayesian consumers with heterogeneous preferences, who sequentially decide whether to buy an item of unknown quality, based on previous buyers' reviews. The quality is multi-dimensional and may occasionally vary over time; the reviews are also multi-dimensional.…

    Submitted 23 February, 2022; v1 submitted 20 July, 2020; originally announced July 2020.

  40. arXiv:2006.06613  [pdf, ps, other]

    stat.ML cs.LG

    Statistical Efficiency of Thompson Sampling for Combinatorial Semi-Bandits

    Authors: Pierre Perrault, Etienne Boursier, Vianney Perchet, Michal Valko

    Abstract: We investigate stochastic combinatorial multi-armed bandit with semi-bandit feedback (CMAB). In CMAB, the question of the existence of an efficient policy with an optimal asymptotic regret (up to a factor poly-logarithmic in the action size) is still open for many families of distributions, including mutually independent outcomes, and more generally the multivariate sub-Gaussian family. We propo…

    Submitted 3 January, 2021; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: accepted to NeurIPS 2020

  41. arXiv:2005.01656  [pdf, ps, other]

    cs.LG stat.ML

    Categorized Bandits

    Authors: Matthieu Jedor, Jonathan Louedec, Vianney Perchet

    Abstract: We introduce a new stochastic multi-armed bandit setting where arms are grouped inside "ordered" categories. The motivating example comes from e-commerce, where a customer typically has a greater appetence for items of a specific well-identified but unknown category than any other one. We introduce three concepts of ordering between categories, inspired by stochastic dominance between random var…

    Submitted 4 May, 2020; originally announced May 2020.

  42. arXiv:2002.01197  [pdf, ps, other]

    cs.LG stat.ML

    Selfish Robustness and Equilibria in Multi-Player Bandits

    Authors: Etienne Boursier, Vianney Perchet

    Abstract: Motivated by cognitive radios, stochastic multi-player multi-armed bandits have gained a lot of interest recently. In this class of problems, several players simultaneously pull arms and encounter a collision - with 0 reward - if some of them pull the same arm at the same time. While the cooperative case where players maximize the collective reward (obediently following some fixed protocol) has been mo…

    Submitted 19 June, 2020; v1 submitted 4 February, 2020; originally announced February 2020.

  43. arXiv:1909.06806  [pdf, other]

    cs.GT

    Adversarial learning for revenue-maximizing auctions

    Authors: Thomas Nedelec, Jules Baudet, Vianney Perchet, Noureddine El Karoui

    Abstract: We introduce a new numerical framework to learn optimal bidding strategies in repeated auctions when the seller uses past bids to optimize her mechanism. Crucially, we do not assume that the bidders know what optimization mechanism is used by the seller. We recover essentially all state-of-the-art analytical results for the single-item framework derived previously in the setup where the bidder kno…

    Submitted 8 February, 2021; v1 submitted 15 September, 2019; originally announced September 2019.

  44. Markov Decision Process for MOOC users behavioral inference

    Authors: Firas Jarboui, Célya Gruson-daniel, Pierre Chanial, Alain Durmus, Vincent Rocchisani, Sophie-helene Goulet Ebongue, Anneliese Depoux, Wilfried Kirschenmann, Vianney Perchet

    Abstract: Studies on massive open online courses (MOOCs) users discuss the existence of typical profiles and their impact on the learning process of the students. However, defining the typical behaviors as well as classifying the users accordingly is a difficult task. In this paper we suggest two methods to model MOOC users' behavior given their log data. We mold their behavior into a Markov Decision Process…

    Submitted 10 March, 2021; v1 submitted 10 July, 2019; originally announced July 2019.

  45. arXiv:1906.08509  [pdf, other]

    stat.ML cs.LG math.OC

    Online A-Optimal Design and Active Linear Regression

    Authors: Xavier Fontaine, Pierre Perrault, Michal Valko, Vianney Perchet

    Abstract: We consider in this paper the problem of optimal experiment design where a decision maker can choose which points to sample to obtain an estimate $\hat{β}$ of the hidden parameter $β^{\star}$ of an underlying linear model. The key challenge of this work lies in the heteroscedasticity assumption that we make, meaning that each covariate has a different and unknown variance. The goal of the decision m…

    Submitted 30 December, 2020; v1 submitted 20 June, 2019; originally announced June 2019.

    Comments: 29 pages, 5 figures

  46. arXiv:1905.13031  [pdf, other]

    cs.GT

    Robust Stackelberg buyers in repeated auctions

    Authors: Clément Calauzènes, Thomas Nedelec, Vianney Perchet, Noureddine El Karoui

    Abstract: We consider the practical and classical setting where the seller is using an exploration stage to learn the value distributions of the bidders before running a revenue-maximizing auction in an exploitation phase. In this two-stage process, we exhibit practical, simple and robust strategies with large utility uplifts for the bidders. We quantify precisely the seller revenue against non-discounted bu…

    Submitted 29 May, 2019; originally announced May 2019.

    Comments: arXiv admin note: text overlap with arXiv:1808.06979

  47. arXiv:1905.11797  [pdf, ps, other]

    cs.LG stat.ML

    ROI Maximization in Stochastic Online Decision-Making

    Authors: Nicolò Cesa-Bianchi, Tommaso Cesari, Yishay Mansour, Vianney Perchet

    Abstract: We introduce a novel theoretical framework for Return On Investment (ROI) maximization in repeated decision-making. Our setting is motivated by the use case of companies that regularly receive proposals for technological innovations and want to quickly decide whether they are worth implementing. We design an algorithm for learning ROI-maximizing decision-making policies over a sequence of innovati…

    Submitted 22 December, 2021; v1 submitted 28 May, 2019; originally announced May 2019.

  48. arXiv:1905.11148  [pdf, other]

    stat.ML cs.LG stat.AP

    Utility/Privacy Trade-off through the lens of Optimal Transport

    Authors: Etienne Boursier, Vianney Perchet

    Abstract: Strategic information is valuable either by remaining private (for instance if it is sensitive) or, on the other hand, by being used publicly to increase some utility. These two objectives are antagonistic and leaking this information might be more rewarding than concealing it. Unlike classical solutions that focus on the first point, we consider instead agents that optimize a natural trade-off be…

    Submitted 2 March, 2020; v1 submitted 27 May, 2019; originally announced May 2019.

    Comments: AISTATS 2020

  49. arXiv:1902.10427  [pdf, other]

    cs.GT

    Learning to bid in revenue-maximizing auctions

    Authors: Thomas Nedelec, Noureddine El Karoui, Vianney Perchet

    Abstract: We consider the problem of the optimization of bidding strategies in prior-dependent revenue-maximizing auctions, when the seller fixes the reserve prices based on the bid distributions. Our study is done in the setting where one bidder is strategic. Using a variational approach, we study the complexity of the original objective and we introduce a relaxation of the objective functional in order to…

    Submitted 14 May, 2019; v1 submitted 27 February, 2019; originally announced February 2019.

  50. arXiv:1902.04376  [pdf, ps, other]

    stat.ML cs.LG math.OC

    An adaptive stochastic optimization algorithm for resource allocation

    Authors: Xavier Fontaine, Shie Mannor, Vianney Perchet

    Abstract: We consider the classical problem of sequential resource allocation where a decision maker must repeatedly divide a budget between several resources, each with diminishing returns. This can be recast as a specific stochastic optimization problem where the objective is to maximize the cumulative reward, or equivalently to minimize the regret. We construct an algorithm that is adaptive to the…

    Submitted 16 January, 2020; v1 submitted 12 February, 2019; originally announced February 2019.

    Comments: ALT2020, 45 pages, 9 figures

    Journal ref: Proceedings of Machine Learning Research (PMLR), volume 117, 2020