Showing 1–50 of 83 results for author: Cesa-Bianchi, N

  1. arXiv:2406.16802  [pdf, other]

    cs.LG stat.ML

    Improved Regret Bounds for Bandits with Expert Advice

    Authors: Nicolò Cesa-Bianchi, Khaled Eldowa, Emmanuel Esposito, Julia Olkhovskaya

    Abstract: In this research note, we revisit the bandits with expert advice problem. Under a restricted feedback model, we prove a lower bound of order $\sqrt{K T \ln(N/K)}$ for the worst-case regret, where $K$ is the number of actions, $N>K$ the number of experts, and $T$ the time horizon. This matches a previously known upper bound of the same order and improves upon the best available lower bound of…

    Submitted 24 June, 2024; originally announced June 2024.

  2. arXiv:2406.10529  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    A Theory of Interpretable Approximations

    Authors: Marco Bressan, Nicolò Cesa-Bianchi, Emmanuel Esposito, Yishay Mansour, Shay Moran, Maximilian Thiessen

    Abstract: Can a deep neural network be approximated by a small decision tree based on simple features? This question and its variants are behind the growing demand for machine learning models that are *interpretable* by humans. In this work we study such questions by introducing *interpretable approximations*, a notion that captures the idea of approximating a target concept $c$ by a small aggregation of co…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: To appear at COLT 2024

  3. arXiv:2406.01192  [pdf, other]

    cs.LG stat.ML

    Sparsity-Agnostic Linear Bandits with Adaptive Adversaries

    Authors: Tianyuan Jin, Kyoungseok Jang, Nicolò Cesa-Bianchi

    Abstract: We study stochastic linear bandits where, in each round, the learner receives a set of actions (i.e., feature vectors), from which it chooses an element and obtains a stochastic reward. The expected reward is a fixed but unknown linear function of the chosen action. We study sparse regret bounds, that depend on the number $S$ of non-zero coefficients in the linear reward function. Previous works f…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 25 pages

  4. arXiv:2405.13919  [pdf, ps, other]

    cs.GT cs.LG

    Fair Online Bilateral Trade

    Authors: François Bachoc, Nicolò Cesa-Bianchi, Tommaso Cesari, Roberto Colomboni

    Abstract: In online bilateral trade, a platform posts prices to incoming pairs of buyers and sellers that have private valuations for a certain good. If the price is lower than the buyers' valuation and higher than the sellers' valuation, then a trade takes place. Previous work focused on the platform perspective, with the goal of setting prices maximizing the gain from trade (the sum of sellers' and buyers…

    Submitted 22 May, 2024; originally announced May 2024.

  5. arXiv:2402.10282  [pdf, other]

    cs.LG stat.ML

    Information Capacity Regret Bounds for Bandits with Mediator Feedback

    Authors: Khaled Eldowa, Nicolò Cesa-Bianchi, Alberto Maria Metelli, Marcello Restelli

    Abstract: This work addresses the mediator feedback problem, a bandit game where the decision set consists of a number of policies, each associated with a probability distribution over a common space of outcomes. Upon choosing a policy, the learner observes an outcome sampled from its distribution and incurs the loss assigned to this outcome in the present round. We introduce the policy set capacity as an i…

    Submitted 15 February, 2024; originally announced February 2024.

  6. arXiv:2312.15433  [pdf, ps, other]

    cs.LG stat.ML

    Best-of-Both-Worlds Algorithms for Linear Contextual Bandits

    Authors: Yuko Kuroki, Alberto Rumi, Taira Tsuchiya, Fabio Vitale, Nicolò Cesa-Bianchi

    Abstract: We study best-of-both-worlds algorithms for $K$-armed linear contextual bandits. Our algorithms deliver near-optimal regret bounds in both the adversarial and stochastic regimes, without prior knowledge about the environment. In the stochastic regime, we achieve the polylogarithmic rate $\frac{(dK)^2\mathrm{poly}\log(dKT)}{Δ_{\min}}$, where $Δ_{\min}$ is the minimum suboptimality gap over the $d$-…

    Submitted 19 February, 2024; v1 submitted 24 December, 2023; originally announced December 2023.

    Comments: Accepted at AISTATS 2024

  7. arXiv:2311.05975  [pdf, other]

    cs.LG

    Sum-max Submodular Bandits

    Authors: Stephen Pasteris, Alberto Rumi, Fabio Vitale, Nicolò Cesa-Bianchi

    Abstract: Many online decision-making problems correspond to maximizing a sequence of submodular functions. In this work, we introduce sum-max functions, a subclass of monotone submodular functions capturing several interesting problems, including best-of-$K$-bandits, combinatorial bandits, and the bandit versions on facility location, $M$-medians, and hitting sets. We show that all functions in this class…

    Submitted 10 November, 2023; originally announced November 2023.

  8. arXiv:2310.17385  [pdf, other]

    cs.LG

    Multitask Online Learning: Listen to the Neighborhood Buzz

    Authors: Juliette Achddou, Nicolò Cesa-Bianchi, Pierre Laforgue

    Abstract: We study multitask online learning in a setting where agents can only exchange information with their neighbors on an arbitrary communication network. We introduce $\texttt{MT-CO}_2\texttt{OL}$, a decentralized algorithm for this setting whose regret depends on the interplay between the task similarities and the network structure. Our analysis shows that the regret of…

    Submitted 8 April, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

  9. arXiv:2310.09597  [pdf, other]

    econ.EM cs.LG stat.ML

    Adaptive maximization of social welfare

    Authors: Nicolò Cesa-Bianchi, Roberto Colomboni, Maximilian Kasy

    Abstract: We consider the problem of repeatedly choosing policies to maximize social welfare. Welfare is a weighted sum of private utility and public revenue. Earlier outcomes inform later policies. Utility is not observed, but indirectly inferred. Response functions are learned through experimentation. We derive a lower bound on regret, and a matching adversarial upper bound for a variant of the Exp3 alg…

    Submitted 14 October, 2023; originally announced October 2023.

  10. arXiv:2308.07588  [pdf, ps, other]

    cs.LG cs.IT math.ST

    High-Probability Risk Bounds via Sequential Predictors

    Authors: Dirk van der Hoeven, Nikita Zhivotovskiy, Nicolò Cesa-Bianchi

    Abstract: Online learning methods yield sequential regret bounds under minimal assumptions and provide in-expectation risk bounds for statistical learning. However, despite the apparent advantage of online guarantees over their statistical counterparts, recent findings indicate that in many important cases, regret bounds may not guarantee tight high-probability risk bounds in the statistical setting. In thi…

    Submitted 15 August, 2023; originally announced August 2023.

    Comments: 24 pages

  11. arXiv:2308.01744  [pdf, other]

    cs.LG

    Multitask Learning with No Regret: from Improved Confidence Bounds to Active Learning

    Authors: Pier Giuseppe Sessa, Pierre Laforgue, Nicolò Cesa-Bianchi, Andreas Krause

    Abstract: Multitask learning is a powerful framework that enables one to simultaneously learn multiple related tasks by sharing information between them. Quantifying uncertainty in the estimated tasks is of pivotal importance for many downstream applications, such as online or active learning. In this work, we provide novel multitask confidence intervals in the challenging agnostic setting, i.e., when neith…

    Submitted 3 August, 2023; originally announced August 2023.

  12. arXiv:2307.09478  [pdf, other]

    cs.GT cs.DS cs.LG

    The Role of Transparency in Repeated First-Price Auctions with Unknown Valuations

    Authors: Nicolò Cesa-Bianchi, Tommaso Cesari, Roberto Colomboni, Federico Fusco, Stefano Leonardi

    Abstract: We study the problem of regret minimization for a single bidder in a sequence of first-price auctions where the bidder discovers the item's value only if the auction is won. Our main contribution is a complete characterization, up to logarithmic factors, of the minimax regret in terms of the auction's *transparency*, which controls the amount of information on competing bids disclosed by the…

    Submitted 21 March, 2024; v1 submitted 14 July, 2023; originally announced July 2023.

    Comments: Accepted at STOC 2024

  13. arXiv:2307.00836  [pdf, other]

    stat.ML cs.LG

    Trading-Off Payments and Accuracy in Online Classification with Paid Stochastic Experts

    Authors: Dirk van der Hoeven, Ciara Pike-Burke, Hao Qiu, Nicolò Cesa-Bianchi

    Abstract: We investigate online classification with paid stochastic experts. Here, before making their prediction, each expert must be paid. The amount that we pay each expert directly influences the accuracy of their prediction through some unknown Lipschitz "productivity" function. In each round, the learner must decide how much to pay each expert and then make a prediction. They incur a cost equal to a w…

    Submitted 3 July, 2023; originally announced July 2023.

    Comments: ICML 2023

  14. arXiv:2305.19036  [pdf, other]

    cs.LG

    Delayed Bandits: When Do Intermediate Observations Help?

    Authors: Emmanuel Esposito, Saeed Masoudian, Hao Qiu, Dirk van der Hoeven, Nicolò Cesa-Bianchi, Yevgeny Seldin

    Abstract: We study a $K$-armed bandit with delayed feedback and intermediate observations. We consider a model where intermediate observations have a form of a finite state, which is observed immediately after taking an action, whereas the loss is observed after an adversarially chosen delay. We show that the regime of the mapping of states to losses determines the complexity of the problem, irrespective of…

    Submitted 30 May, 2023; originally announced May 2023.

  15. arXiv:2305.15383  [pdf, ps, other]

    cs.LG

    On the Minimax Regret for Online Learning with Feedback Graphs

    Authors: Khaled Eldowa, Emmanuel Esposito, Tommaso Cesari, Nicolò Cesa-Bianchi

    Abstract: In this work, we improve on the upper and lower bounds for the regret of online learning with strongly observable undirected feedback graphs. The best known upper bound for this problem is $\mathcal{O}\bigl(\sqrt{αT\ln K}\bigr)$, where $K$ is the number of actions, $α$ is the independence number of the graph, and $T$ is the time horizon. The $\sqrt{\ln K}$ factor is known to be necessary when…

    Submitted 28 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

  16. arXiv:2305.08629  [pdf, ps, other]

    cs.LG

    A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs

    Authors: Dirk van der Hoeven, Lukas Zierahn, Tal Lancewicki, Aviv Rosenberg, Nicolò Cesa-Bianchi

    Abstract: We derive a new analysis of Follow The Regularized Leader (FTRL) for online learning with delayed bandit feedback. By separating the cost of delayed feedback from that of bandit feedback, our analysis allows us to obtain new results in three important settings. On the one hand, we derive the first optimal (up to logarithmic factors) regret bounds for combinatorial semi-bandits with delay and adver…

    Submitted 15 May, 2023; originally announced May 2023.

  17. arXiv:2303.08102  [pdf, ps, other]

    cs.LG stat.ML

    Information-Theoretic Regret Bounds for Bandits with Fixed Expert Advice

    Authors: Khaled Eldowa, Nicolò Cesa-Bianchi, Alberto Maria Metelli, Marcello Restelli

    Abstract: We investigate the problem of bandits with expert advice when the experts are fixed and known distributions over the actions. Improving on previous analyses, we show that the regret in this setting is controlled by information-theoretic quantities that measure the similarity between experts. In some natural special cases, this allows us to obtain the first regret bound for EXP4 that can get arbitr…

    Submitted 15 March, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

  18. arXiv:2302.10805  [pdf, ps, other]

    cs.LG cs.DS cs.GT

    Repeated Bilateral Trade Against a Smoothed Adversary

    Authors: Nicolò Cesa-Bianchi, Tommaso Cesari, Roberto Colomboni, Federico Fusco, Stefano Leonardi

    Abstract: We study repeated bilateral trade where an adaptive $σ$-smooth adversary generates the valuations of sellers and buyers. We provide a complete characterization of the regret regimes for fixed-price mechanisms under different feedback models in the two cases where the learner can post either the same or different prices to buyers and sellers. We begin by showing that the minimax regret after $T$ ro…

    Submitted 21 February, 2023; originally announced February 2023.

    Journal ref: Proceedings of Thirty Sixth Conference on Learning Theory, PMLR 195:1095-1130, 2023

  19. arXiv:2302.08345  [pdf, other]

    cs.LG

    Linear Bandits with Memory: from Rotting to Rising

    Authors: Giulia Clerici, Pierre Laforgue, Nicolò Cesa-Bianchi

    Abstract: Nonstationary phenomena, such as satiation effects in recommendations, have mostly been modeled using bandits with finitely many arms. However, the richer action space provided by linear bandits is often preferred in practice. In this work, we introduce a novel nonstationary linear bandit model, where current rewards are influenced by the learner's past actions in a fixed-size window. Our model, w…

    Submitted 25 May, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

  20. arXiv:2210.04229  [pdf, ps, other]

    cs.LG cs.DS

    Learning on the Edge: Online Learning with Stochastic Feedback Graphs

    Authors: Emmanuel Esposito, Federico Fusco, Dirk van der Hoeven, Nicolò Cesa-Bianchi

    Abstract: The framework of feedback graphs is a generalization of sequential decision-making with bandit or full information feedback. In this work, we study an extension where the directed feedback graph is stochastic, following a distribution similar to the classical Erdős-Rényi model. Specifically, in each round every edge in the graph is either realized or not with a distinct probability for each edge.…

    Submitted 9 October, 2022; originally announced October 2022.

    Journal ref: Advances in Neural Information Processing Systems 35 (NeurIPS 2022)

  21. arXiv:2209.03996  [pdf, ps, other]

    cs.LG

    Active Learning of Classifiers with Label and Seed Queries

    Authors: Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice, Maximilian Thiessen

    Abstract: We study exact active learning of binary and multiclass classifiers with margin. Given an $n$-point set $X \subset \mathbb{R}^m$, we want to learn any unknown classifier on $X$ whose classes have finite strong convex hull margin, a new notion extending the SVM margin. In the standard active learning setting, where only label queries are allowed, learning a classifier with strong convex hull margin…

    Submitted 8 September, 2022; originally announced September 2022.

  22. arXiv:2207.04054  [pdf, ps, other]

    cs.GT cs.LG

    Online Learning in Supply-Chain Games

    Authors: Nicolò Cesa-Bianchi, Tommaso Cesari, Takayuki Osogami, Marco Scarsini, Segev Wasserkrug

    Abstract: We study a repeated game between a supplier and a retailer who want to maximize their respective profits without full knowledge of the problem parameters. After characterizing the uniqueness of the Stackelberg equilibrium of the stage game with complete information, we show that even with partial knowledge of the joint distribution of demand and production costs, natural learning dynamics guarante…

    Submitted 8 July, 2022; originally announced July 2022.

  23. arXiv:2206.02656  [pdf, ps, other]

    cs.LG stat.ML

    A Regret-Variance Trade-Off in Online Learning

    Authors: Dirk van der Hoeven, Nikita Zhivotovskiy, Nicolò Cesa-Bianchi

    Abstract: We consider prediction with expert advice for strongly convex and bounded losses, and investigate trade-offs between regret and "variance" (i.e., squared difference of learner's predictions and best expert predictions). With $K$ experts, the Exponentially Weighted Average (EWA) algorithm is known to achieve $O(\log K)$ regret. We prove that a variant of EWA either achieves a negative regret (i.e.,…

    Submitted 6 June, 2022; originally announced June 2022.

  24. arXiv:2206.00557  [pdf, ps, other]

    cs.LG

    A Near-Optimal Best-of-Both-Worlds Algorithm for Online Learning with Feedback Graphs

    Authors: Chloé Rouyer, Dirk van der Hoeven, Nicolò Cesa-Bianchi, Yevgeny Seldin

    Abstract: We consider online learning with feedback graphs, a sequential decision-making framework where the learner's feedback is determined by a directed graph over the action set. We present a computationally efficient algorithm for learning in this framework that simultaneously achieves near-optimal regret bounds in both stochastic and adversarial environments. The bound against oblivious adversaries is…

    Submitted 1 June, 2022; originally announced June 2022.

  25. arXiv:2205.15802   

    cs.LG

    AdaTask: Adaptive Multitask Online Learning

    Authors: Pierre Laforgue, Andrea Della Vecchia, Nicolò Cesa-Bianchi, Lorenzo Rosasco

    Abstract: We introduce and analyze AdaTask, a multitask online learning algorithm that adapts to the unknown structure of the tasks. When the $N$ tasks are stochastically activated, we show that the regret of AdaTask is better, by a factor that can be as large as $\sqrt{N}$, than the regret achieved by running $N$ independent algorithms, one for each task. AdaTask can be seen as a comparator-adaptive versio…

    Submitted 27 October, 2023; v1 submitted 31 May, 2022; originally announced May 2022.

    Comments: The proof of Theorem 3 is wrong: in the display equation below Equation (22), bottom of page 15, the gradient of $φ_{t+1}$ is missing a factor $1/(αη_t)$

  26. arXiv:2112.02866  [pdf, ps, other]

    cs.LG

    Nonstochastic Bandits with Composite Anonymous Feedback

    Authors: Nicolò Cesa-Bianchi, Tommaso Cesari, Roberto Colomboni, Claudio Gentile, Yishay Mansour

    Abstract: We investigate a nonstochastic bandit setting in which the loss of an action is not immediately charged to the player, but rather spread over the subsequent rounds in an adversarial way. The instantaneous loss observed by the player at the end of each round is then a sum of many loss components of previously played actions. This setting encompasses as a special case the easier task of bandits with…

    Submitted 24 September, 2022; v1 submitted 6 December, 2021; originally announced December 2021.

  27. arXiv:2111.01589  [pdf, ps, other]

    cs.LG

    Nonstochastic Bandits and Experts with Arm-Dependent Delays

    Authors: Dirk van der Hoeven, Nicolò Cesa-Bianchi

    Abstract: We study nonstochastic bandits and experts in a delayed setting where delays depend on both time and arms. While the setting in which delays only depend on time has been extensively studied, the arm-dependent delay setting better captures real-world applications at the cost of introducing new technical challenges. In the full information (experts) setting, we design an algorithm with a first-order…

    Submitted 21 December, 2021; v1 submitted 2 November, 2021; originally announced November 2021.

  28. arXiv:2110.11819  [pdf, other]

    cs.LG

    A Last Switch Dependent Analysis of Satiation and Seasonality in Bandits

    Authors: Pierre Laforgue, Giulia Clerici, Nicolò Cesa-Bianchi, Ran Gilad-Bachrach

    Abstract: Motivated by the fact that humans like some level of unpredictability or novelty, and might therefore get quickly bored when interacting with a stationary policy, we introduce a novel non-stationary bandit problem, where the expected reward of an arm is fully determined by the time elapsed since the arm last took part in a switch of actions. Our model generalizes previous notions of delay-dependen…

    Submitted 7 March, 2022; v1 submitted 22 October, 2021; originally announced October 2021.

  29. arXiv:2109.12974  [pdf, ps, other]

    cs.GT cs.LG econ.TH

    Bilateral Trade: A Regret Minimization Perspective

    Authors: Nicolò Cesa-Bianchi, Tommaso Cesari, Roberto Colomboni, Federico Fusco, Stefano Leonardi

    Abstract: Bilateral trade, a fundamental topic in economics, models the problem of intermediating between two strategic agents, a seller and a buyer, willing to trade a good for which they hold private valuations. In this paper, we cast the bilateral trade problem in a regret minimization framework over $T$ rounds of seller/buyer interactions, with no prior knowledge on their private valuations. Our main co…

    Submitted 8 September, 2021; originally announced September 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2102.08754

  30. arXiv:2106.04982  [pdf, other]

    cs.LG stat.ML

    Cooperative Online Learning with Feedback Graphs

    Authors: Nicolò Cesa-Bianchi, Tommaso R. Cesari, Riccardo Della Vecchia

    Abstract: We study the interplay between feedback and communication in a cooperative online learning setting where a network of agents solves a task in which the learners' feedback is determined by an arbitrary graph. We characterize regret in terms of the independence number of the strong product between the feedback graph and the communication network. Our analysis recovers as special cases many previousl…

    Submitted 24 September, 2022; v1 submitted 9 June, 2021; originally announced June 2021.

  31. arXiv:2106.04913  [pdf, ps, other]

    cs.LG stat.ML

    On Margin-Based Cluster Recovery with Oracle Queries

    Authors: Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice

    Abstract: We study an active cluster recovery problem where, given a set of $n$ points and an oracle answering queries like "are these two points in the same cluster?", the task is to recover exactly all clusters using as few queries as possible. We begin by introducing a simple but general notion of margin between clusters that captures, as special cases, the margins used in previous work, the classic SVM…

    Submitted 9 June, 2021; originally announced June 2021.

  32. arXiv:2106.03596  [pdf, other]

    cs.LG

    Beyond Bandit Feedback in Online Multiclass Classification

    Authors: Dirk van der Hoeven, Federico Fusco, Nicolò Cesa-Bianchi

    Abstract: We study the problem of online multiclass classification in a setting where the learner's feedback is determined by an arbitrary directed graph. While including bandit feedback as a special case, feedback graphs allow a much richer set of applications, including filtering and label efficient classification. We introduce Gappletron, the first online multiclass algorithm that works with arbitrary fe…

    Submitted 7 June, 2021; originally announced June 2021.

    Journal ref: 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

  33. arXiv:2106.02393  [pdf, other]

    cs.LG

    Multitask Online Mirror Descent

    Authors: Nicolò Cesa-Bianchi, Pierre Laforgue, Andrea Paudice, Massimiliano Pontil

    Abstract: We introduce and analyze MT-OMD, a multitask generalization of Online Mirror Descent (OMD) which operates by sharing updates between tasks. We prove that the regret of MT-OMD is of order $\sqrt{1 + σ^2(N-1)}\sqrt{T}$, where $σ^2$ is the task variance according to the geometry induced by the regularizer, $N$ is the number of tasks, and $T$ is the time horizon. Whenever tasks are similar, that is…

    Submitted 1 November, 2022; v1 submitted 4 June, 2021; originally announced June 2021.

  34. arXiv:2102.11834  [pdf, other]

    cs.GT econ.TH math.CO

    Finding Stable Matchings in PhD Markets with Consistent Preferences and Cooperative Partners

    Authors: Maximilian Mordig, Riccardo Della Vecchia, Nicolò Cesa-Bianchi, Bernhard Schölkopf

    Abstract: We introduce a new algorithm for finding stable matchings in multi-sided matching markets. Our setting is motivated by a PhD market of students, advisors, and co-advisors, and can be generalized to supply chain networks viewed as $n$-sided markets. In the three-sided PhD market, students primarily care about advisors and then about co-advisors (consistent preferences), while advisors and co-adviso…

    Submitted 6 July, 2021; v1 submitted 23 February, 2021; originally announced February 2021.

  35. arXiv:2102.09864  [pdf, other]

    cs.LG stat.ML

    An Algorithm for Stochastic and Adversarial Bandits with Switching Costs

    Authors: Chloé Rouyer, Yevgeny Seldin, Nicolò Cesa-Bianchi

    Abstract: We propose an algorithm for stochastic and adversarial multiarmed bandits with switching costs, where the algorithm pays a price $λ$ every time it switches the arm being played. Our algorithm is based on adaptation of the Tsallis-INF algorithm of Zimmert and Seldin (2021) and requires no prior knowledge of the regime or time horizon. In the oblivious adversarial setting it achieves the minimax opt…

    Submitted 19 February, 2021; originally announced February 2021.

  36. arXiv:2102.08754  [pdf, ps, other]

    cs.LG econ.TH stat.ML

    A Regret Analysis of Bilateral Trade

    Authors: Nicolò Cesa-Bianchi, Tommaso Cesari, Roberto Colomboni, Federico Fusco, Stefano Leonardi

    Abstract: Bilateral trade, a fundamental topic in economics, models the problem of intermediating between two strategic agents, a seller and a buyer, willing to trade a good for which they hold private valuations. Despite the simplicity of this problem, a classical result by Myerson and Satterthwaite (1983) affirms the impossibility of designing a mechanism which is simultaneously efficient, incentive compa…

    Submitted 16 February, 2021; originally announced February 2021.

    Journal ref: EC '21: Proceedings of the 22nd ACM Conference on Economics and Computation (2021)

  37. arXiv:2102.00504  [pdf, other]

    cs.LG stat.ML

    Exact Recovery of Clusters in Finite Metric Spaces Using Oracle Queries

    Authors: Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice

    Abstract: We investigate the problem of exact cluster recovery using oracle queries. Previous results show that clusters in Euclidean spaces that are convex and separated with a margin can be reconstructed exactly using only $O(\log n)$ same-cluster queries, where $n$ is the number of input points. In this work, we study this problem in the more challenging non-convex setting. We introduce a structural char…

    Submitted 13 July, 2021; v1 submitted 31 January, 2021; originally announced February 2021.

    Comments: Accepted for presentation at the Conference on Learning Theory (COLT) 2021

  38. arXiv:2101.12080  [pdf, other]

    cs.GT econ.TH math.CO

    Two-Sided Matching Markets in the ELLIS 2020 PhD Program

    Authors: Maximilian Mordig, Riccardo Della Vecchia, Nicolò Cesa-Bianchi, Bernhard Schölkopf

    Abstract: The ELLIS PhD program is a European initiative that supports excellent young researchers by connecting them to leading researchers in AI. In particular, PhD students are supervised by two advisors from different countries: an advisor and a co-advisor. In this work we summarize the procedure that, in its final step, matches students to advisors in the ELLIS 2020 PhD program. The steps of the proced…

    Submitted 11 March, 2021; v1 submitted 28 January, 2021; originally announced January 2021.

  39. arXiv:2006.04675  [pdf, other]

    cs.LG stat.ML

    Exact Recovery of Mangled Clusters with Same-Cluster Queries

    Authors: Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice

    Abstract: We study the cluster recovery problem in the semi-supervised active clustering framework. Given a finite set of input points, and an oracle revealing whether any two points lie in the same cluster, our goal is to recover all clusters exactly using as few queries as possible. To this end, we relax the spherical $k$-means cluster assumption of Ashtiani et al. to allow for arbitrary ellipsoidal clus…

    Submitted 30 October, 2020; v1 submitted 8 June, 2020; originally announced June 2020.

    Comments: To appear at NeurIPS 2020 (oral)

  40. arXiv:2002.01882  [pdf, other]

    cs.LG stat.ML

    Locally-Adaptive Nonparametric Online Learning

    Authors: Ilja Kuzborskij, Nicolò Cesa-Bianchi

    Abstract: One of the main strengths of online algorithms is their ability to adapt to arbitrary data sequences. This is especially important in nonparametric settings, where performance is measured against rich classes of comparator functions that are able to fit complex environments. Although such hard comparators and complex environments may exhibit local regularities, efficient algorithms, which can prov…

    Submitted 1 November, 2020; v1 submitted 5 February, 2020; originally announced February 2020.

  41. arXiv:1910.02757  [pdf, other]

    stat.ML cs.LG

    Stochastic Bandits with Delay-Dependent Payoffs

    Authors: Leonardo Cella, Nicolò Cesa-Bianchi

    Abstract: Motivated by recommendation problems in music streaming platforms, we propose a nonstationary stochastic bandit model in which the expected reward of an arm depends on the number of rounds that have passed since the arm was last pulled. After proving that finding an optimal policy is NP-hard even when all model parameters are known, we introduce a class of ranking policies provably approximating,…

    Submitted 19 February, 2020; v1 submitted 7 October, 2019; originally announced October 2019.

  42. arXiv:1906.00670  [pdf, other]

    cs.LG stat.ML

    Nonstochastic Multiarmed Bandits with Unrestricted Delays

    Authors: Tobias Sommer Thune, Nicolò Cesa-Bianchi, Yevgeny Seldin

    Abstract: We investigate multiarmed bandits with delayed feedback, where the delays need neither be identical nor bounded. We first prove that "delayed" Exp3 achieves the $O(\sqrt{(KT + D)\ln K} )$ regret bound conjectured by Cesa-Bianchi et al. [2019] in the case of variable, but bounded delays. Here, $K$ is the number of actions and $D$ is the total delay over $T$ rounds. We then introduce a new algorithm…

    Submitted 19 November, 2019; v1 submitted 3 June, 2019; originally announced June 2019.

    Comments: 9 pages, NeurIPS camera ready

  43. arXiv:1905.11902  [pdf, other]

    cs.LG stat.ML

    Correlation Clustering with Adaptive Similarity Queries

    Authors: Marco Bressan, Nicolò Cesa-Bianchi, Andrea Paudice, Fabio Vitale

    Abstract: In correlation clustering, we are given $n$ objects together with a binary similarity score between each pair of them. The goal is to partition the objects into clusters so as to minimise the disagreements with the scores. In this work we investigate correlation clustering as an active learning problem: each similarity score can be learned by making a query, and the goal is to minimise both the disag…

    Submitted 14 January, 2020; v1 submitted 28 May, 2019; originally announced May 2019.

  44. arXiv:1905.11797  [pdf, ps, other]

    cs.LG stat.ML

    ROI Maximization in Stochastic Online Decision-Making

    Authors: Nicolò Cesa-Bianchi, Tommaso Cesari, Yishay Mansour, Vianney Perchet

    Abstract: We introduce a novel theoretical framework for Return On Investment (ROI) maximization in repeated decision-making. Our setting is motivated by the use case of companies that regularly receive proposals for technological innovations and want to quickly decide whether they are worth implementing. We design an algorithm for learning ROI-maximizing decision-making policies over a sequence of innovati…

    Submitted 22 December, 2021; v1 submitted 28 May, 2019; originally announced May 2019.

  45. arXiv:1902.01846  [pdf, other]

    cs.LG stat.ML

    Distribution-Dependent Analysis of Gibbs-ERM Principle

    Authors: Ilja Kuzborskij, Nicolò Cesa-Bianchi, Csaba Szepesvári

    Abstract: Gibbs-ERM learning is a natural idealized model of learning with stochastic optimization algorithms (such as Stochastic Gradient Langevin Dynamics and, to some extent, Stochastic Gradient Descent), while it also arises in other contexts, including PAC-Bayesian theory, and sampling mechanisms. In this work we study the excess risk suffered by a Gibbs-ERM learner that uses non-convex, regularize…

    Submitted 5 February, 2019; originally announced February 2019.

  46. arXiv:1901.08082  [pdf, ps, other]

    cs.LG stat.ML

    Cooperative Online Learning: Keeping your Neighbors Updated

    Authors: Nicolò Cesa-Bianchi, Tommaso R. Cesari, Claire Monteleoni

    Abstract: We study an asynchronous online learning setting with a network of agents. At each time step, some of the agents are activated, requested to make a prediction, and pay the corresponding loss. The loss function is then revealed to these agents and also to their neighbors in the network. Our results characterize how much knowing the network structure affects the regret as a function of the model of…

    Submitted 15 January, 2020; v1 submitted 23 January, 2019; originally announced January 2019.

  47. arXiv:1809.11033  [pdf, other]

    cs.LG stat.ML

    Efficient Linear Bandits through Matrix Sketching

    Authors: Ilja Kuzborskij, Leonardo Cella, Nicolò Cesa-Bianchi

    Abstract: We prove that two popular linear contextual bandit algorithms, OFUL and Thompson Sampling, can be made efficient using Frequent Directions, a deterministic online sketching technique. More precisely, we show that a sketch of size $m$ allows an $\mathcal{O}(md)$ update time for both algorithms, as opposed to $\Omega(d^2)$ required by their non-sketched versions in general (where $d$ is the dimension of c…

    Submitted 21 March, 2022; v1 submitted 28 September, 2018; originally announced September 2018.

  48. arXiv:1807.03288  [pdf, ps, other]

    cs.LG stat.ML

    Dynamic Pricing with Finitely Many Unknown Valuations

    Authors: Nicolò Cesa-Bianchi, Tommaso Cesari, Vianney Perchet

    Abstract: Motivated by posted price auctions where buyers are grouped in an unknown number of latent types characterized by their private values for the good on sale, we investigate revenue maximization in stochastic dynamic pricing when the distribution of buyers' private values is supported on an unknown set of points in $[0,1]$ of unknown cardinality $K$. This setting can be viewed as an instance of a stoc…

    Submitted 5 March, 2019; v1 submitted 9 July, 2018; originally announced July 2018.

  49. arXiv:1805.07331  [pdf, other]

    cs.LG q-bio.QM stat.ML

    Positive and Unlabeled Learning through Negative Selection and Imbalance-aware Classification

    Authors: Marco Frasca, Nicolò Cesa-Bianchi

    Abstract: Motivated by applications in protein function prediction, we consider a challenging supervised classification setting in which positive labels are scarce and there are no explicit negative labels. The learning algorithm must thus select which unlabeled examples to use as negative training points, possibly ending up with an unbalanced learning problem. We address these issues by proposing an algori…

    Submitted 25 January, 2019; v1 submitted 18 May, 2018; originally announced May 2018.

  50. arXiv:1705.10257  [pdf, ps, other]

    cs.LG stat.ML

    Boltzmann Exploration Done Right

    Authors: Nicolò Cesa-Bianchi, Claudio Gentile, Gábor Lugosi, Gergely Neu

    Abstract: Boltzmann exploration is a classic strategy for sequential decision-making under uncertainty, and is one of the most standard tools in Reinforcement Learning (RL). Despite its widespread use, there is virtually no theoretical understanding about the limitations or the actual benefits of this exploration scheme. Does it drive exploration in a meaningful way? Is it prone to misidentifying the optima…

    Submitted 7 November, 2017; v1 submitted 29 May, 2017; originally announced May 2017.