Search | arXiv e-print repository

On (Random-order) Online Contention Resolution Schemes for the Matching Polytope of (Bipartite) Graphs

Authors: Calum MacRury, Will Ma, Nathaniel Grammel

Abstract: Online Contention Resolution Schemes (OCRS's) represent a modern tool for selecting a subset of elements, subject to resource constraints, when the elements are presented to the algorithm sequentially. OCRS's have led to some of the best-known competitive ratio guarantees for online resource allocation problems, with the added benefit of treating different online decisions -- accept/reject, probin… ▽ More Online Contention Resolution Schemes (OCRS's) represent a modern tool for selecting a subset of elements, subject to resource constraints, when the elements are presented to the algorithm sequentially. OCRS's have led to some of the best-known competitive ratio guarantees for online resource allocation problems, with the added benefit of treating different online decisions -- accept/reject, probing, pricing -- in a unified manner. This paper analyzes OCRS's for resource constraints defined by matchings in graphs, a fundamental structure in combinatorial optimization. We consider two dimensions of variants: the elements being presented in adversarial or random order; and the graph being bipartite or general. We improve the state of the art for all combinations of variants, both in terms of algorithmic guarantees and impossibility results. Some of our algorithmic guarantees are best-known even compared to Contention Resolution Schemes that can choose the order of arrival or are offline. All in all, our results for OCRS directly improve the best-known competitive ratios for online accept/reject, probing, and pricing problems on graphs in a unified manner. △ Less

Submitted 1 April, 2024; v1 submitted 15 September, 2022; originally announced September 2022.

Comments: This version improves our previous bipartite RCRS lower bound from 0.4761 to 0.4789

ACM Class: F.2.2; G.2.2

Journal ref: SODA 2023

arXiv:2111.08793 [pdf, other]

doi 10.1016/j.dam.2021.12.001

The Stochastic Boolean Function Evaluation Problem for Symmetric Boolean Functions

Authors: Dimitrios Gkenosis, Nathaniel Grammel, Lisa Hellerstein, Devorah Kletenik

Abstract: We give two approximation algorithms solving the Stochastic Boolean Function Evaluation (SBFE) problem for symmetric Boolean functions. The first is an $O(\log n)$-approximation algorithm, based on the submodular goal-value approach of Deshpande, Hellerstein and Kletenik. Our second algorithm, which is simple, is based on the algorithm solving the SBFE problem for $k$-of-$n$ functions, due to Sall… ▽ More We give two approximation algorithms solving the Stochastic Boolean Function Evaluation (SBFE) problem for symmetric Boolean functions. The first is an $O(\log n)$-approximation algorithm, based on the submodular goal-value approach of Deshpande, Hellerstein and Kletenik. Our second algorithm, which is simple, is based on the algorithm solving the SBFE problem for $k$-of-$n$ functions, due to Salloum, Breuer, and Ben-Dov. It achieves a $(B-1)$ approximation factor, where $B$ is the number of blocks of 0's and 1's in the standard vector representation of the symmetric Boolean function. As part of the design of the first algorithm, we prove that the goal value of any symmetric Boolean function is less than $n(n+1)/2$. Finally, we give an example showing that for symmetric Boolean functions, minimum expected verification cost and minimum expected evaluation cost are not necessarily equal. This contrasts with a previous result, given by Das, Jafarpour, Orlitsky, Pan and Suresh, which showed that equality holds in the unit-cost case. △ Less

Submitted 4 January, 2022; v1 submitted 16 November, 2021; originally announced November 2021.

Comments: Preliminary versions of these results appeared on Arxiv in arXiv:1806.10660. That paper contains results for both arbitrary costs and unit costs. This paper considers only arbitrary costs. Updated January 2022 include journal information

Journal ref: Discrete Applied Mathematics 309 (2022), 269-277

arXiv:2106.06892 [pdf, ps, other]

Improved Guarantees for Offline Stochastic Matching via new Ordered Contention Resolution Schemes

Authors: Brian Brubach, Nathaniel Grammel, Will Ma, Aravind Srinivasan

Abstract: Matching is one of the most fundamental and broadly applicable problems across many domains. In these diverse real-world applications, there is often a degree of uncertainty in the input which has led to the study of stochastic matching models. Here, each edge in the graph has a known, independent probability of existing derived from some prediction. Algorithms must probe edges to determine existe… ▽ More Matching is one of the most fundamental and broadly applicable problems across many domains. In these diverse real-world applications, there is often a degree of uncertainty in the input which has led to the study of stochastic matching models. Here, each edge in the graph has a known, independent probability of existing derived from some prediction. Algorithms must probe edges to determine existence and match them irrevocably if they exist. Further, each vertex may have a patience constraint denoting how many of its neighboring edges can be probed. We present new ordered contention resolution schemes yielding improved approximation guarantees for some of the foundational problems studied in this area. For stochastic matching with patience constraints in general graphs, we provide a 0.382-approximate algorithm, significantly improving over the previous best 0.31-approximation of Baveja et al. (2018). When the vertices do not have patience constraints, we describe a 0.432-approximate random order probing algorithm with several corollaries such as an improved guarantee for the Prophet Secretary problem under Edge Arrivals. Finally, for the special case of bipartite graphs with unit patience constraints on one of the partitions, we show a 0.632-approximate algorithm that improves on the recent $1/3$-guarantee of Hikima et al. (2021). △ Less

Submitted 8 August, 2022; v1 submitted 12 June, 2021; originally announced June 2021.

Comments: full version of Neurips 2021 paper

arXiv:2009.14471 [pdf, other]

PettingZoo: Gym for Multi-Agent Reinforcement Learning

Authors: J. K. Terry, Benjamin Black, Nathaniel Grammel, Mario Jayakumar, Ananth Hari, Ryan Sullivan, Luis Santos, Rodrigo Perez, Caroline Horsch, Clemens Dieffendahl, Niall L. Williams, Yashas Lokesh, Praveen Ravi

Abstract: This paper introduces the PettingZoo library and the accompanying Agent Environment Cycle ("AEC") games model. PettingZoo is a library of diverse sets of multi-agent environments with a universal, elegant Python API. PettingZoo was developed with the goal of accelerating research in Multi-Agent Reinforcement Learning ("MARL"), by making work more interchangeable, accessible and reproducible akin… ▽ More This paper introduces the PettingZoo library and the accompanying Agent Environment Cycle ("AEC") games model. PettingZoo is a library of diverse sets of multi-agent environments with a universal, elegant Python API. PettingZoo was developed with the goal of accelerating research in Multi-Agent Reinforcement Learning ("MARL"), by making work more interchangeable, accessible and reproducible akin to what OpenAI's Gym library did for single-agent reinforcement learning. PettingZoo's API, while inheriting many features of Gym, is unique amongst MARL APIs in that it's based around the novel AEC games model. We argue, in part through case studies on major problems in popular MARL environments, that the popular game models are poor conceptual models of games commonly used in MARL and accordingly can promote confusing bugs that are hard to detect, and that the AEC games model addresses these problems. △ Less

Submitted 26 October, 2021; v1 submitted 30 September, 2020; originally announced September 2020.

arXiv:2009.13051

Agent Environment Cycle Games

Authors: J K Terry, Nathaniel Grammel, Benjamin Black, Ananth Hari, Caroline Horsch, Luis Santos

Abstract: Partially Observable Stochastic Games (POSGs) are the most general and common model of games used in Multi-Agent Reinforcement Learning (MARL). We argue that the POSG model is conceptually ill suited to software MARL environments, and offer case studies from the literature where this mismatch has led to severely unexpected behavior. In response to this, we introduce the Agent Environment Cycle G… ▽ More Partially Observable Stochastic Games (POSGs) are the most general and common model of games used in Multi-Agent Reinforcement Learning (MARL). We argue that the POSG model is conceptually ill suited to software MARL environments, and offer case studies from the literature where this mismatch has led to severely unexpected behavior. In response to this, we introduce the Agent Environment Cycle Games (AEC Games) model, which is more representative of software implementation. We then prove it's as an equivalent model to POSGs. The AEC games model is also uniquely useful in that it can elegantly represent both all forms of MARL environments, whereas for example POSGs cannot elegantly represent strictly turn based games like chess. △ Less

Submitted 1 May, 2021; v1 submitted 28 September, 2020; originally announced September 2020.

Comments: This work of this paper has been merged into the paper "PettingZoo: Gym for Multi-Agent Reinforcement Learning" arXiv:2009.14471

arXiv:2008.03325 [pdf, ps, other]

Stochastic Optimization and Learning for Two-Stage Supplier Problems

Authors: Brian Brubach, Nathaniel Grammel, David G. Harris, Aravind Srinivasan, Leonidas Tsepenekas, Anil Vullikanti

Abstract: The main focus of this paper is radius-based (supplier) clustering in the two-stage stochastic setting with recourse, where the inherent stochasticity of the model comes in the form of a budget constraint. In addition to the standard (homogeneous) setting where all clients must be within a distance $R$ of the nearest facility, we provide results for the more general problem where the radius demand… ▽ More The main focus of this paper is radius-based (supplier) clustering in the two-stage stochastic setting with recourse, where the inherent stochasticity of the model comes in the form of a budget constraint. In addition to the standard (homogeneous) setting where all clients must be within a distance $R$ of the nearest facility, we provide results for the more general problem where the radius demands may be inhomogeneous (i.e., different for each client). We also explore a number of variants where additional constraints are imposed on the first-stage decisions, specifically matroid and multi-knapsack constraints, and provide results for these settings. We derive results for the most general distributional setting, where there is only black-box access to the underlying distribution. To accomplish this, we first develop algorithms for the polynomial scenarios setting; we then employ a novel scenario-discarding variant of the standard Sample Average Approximation (SAA) method, which crucially exploits properties of the restricted-case algorithms. We note that the scenario-discarding modification to the SAA method is necessary in order to optimize over the radius. △ Less

Submitted 7 April, 2024; v1 submitted 7 August, 2020; originally announced August 2020.

arXiv:2006.06870

Multi-Agent Informational Learning Processes

Authors: J. K. Terry, Nathaniel Grammel

Abstract: We introduce a new mathematical model of multi-agent reinforcement learning, the Multi-Agent Informational Learning Processor "MAILP" model. The model is based on the notion that agents have policies for a certain amount of information, models how this information iteratively evolves and propagates through many agents. This model is very general, and the only meaningful assumption made is that l… ▽ More We introduce a new mathematical model of multi-agent reinforcement learning, the Multi-Agent Informational Learning Processor "MAILP" model. The model is based on the notion that agents have policies for a certain amount of information, models how this information iteratively evolves and propagates through many agents. This model is very general, and the only meaningful assumption made is that learning for individual agents progressively slows over time. △ Less

Submitted 25 February, 2021; v1 submitted 11 June, 2020; originally announced June 2020.

Comments: We are withdrawing this paper as section 2.1.1 implicitly assumes information gain at all points is homogenous. A researcher has provided us an example showing that this assumption causes our model to make unexpected and pathological predictions, and we are aware of now way to remove this assumption from our work

arXiv:2005.13625 [pdf, other]

Revisiting Parameter Sharing in Multi-Agent Deep Reinforcement Learning

Authors: J. K. Terry, Nathaniel Grammel, Sanghyun Son, Benjamin Black, Aakriti Agrawal

Abstract: Parameter sharing, where each agent independently learns a policy with fully shared parameters between all policies, is a popular baseline method for multi-agent deep reinforcement learning. Unfortunately, since all agents share the same policy network, they cannot learn different policies or tasks. This issue has been circumvented experimentally by adding an agent-specific indicator signal to obs… ▽ More Parameter sharing, where each agent independently learns a policy with fully shared parameters between all policies, is a popular baseline method for multi-agent deep reinforcement learning. Unfortunately, since all agents share the same policy network, they cannot learn different policies or tasks. This issue has been circumvented experimentally by adding an agent-specific indicator signal to observations, which we term "agent indication". Agent indication is limited, however, in that without modification it does not allow parameter sharing to be applied to environments where the action spaces and/or observation spaces are heterogeneous. This work formalizes the notion of agent indication and proves that it enables convergence to optimal policies for the first time. Next, we formally introduce methods to extend parameter sharing to learning in heterogeneous observation and action spaces, and prove that these methods allow for convergence to optimal policies. Finally, we experimentally confirm that the methods we introduce function empirically, and conduct a wide array of experiments studying the empirical efficacy of many different agent indication schemes for image based observation spaces. △ Less

Submitted 31 October, 2023; v1 submitted 27 May, 2020; originally announced May 2020.

arXiv:1907.03963 [pdf, ps, other]

Online Matching Frameworks under Stochastic Rewards, Product Ranking, and Unknown Patience

Authors: Brian Brubach, Nathaniel Grammel, Will Ma, Aravind Srinivasan

Abstract: We study generalizations of online bipartite matching in which each arriving vertex (customer) views a ranked list of offline vertices (products) and matches to (purchases) the first one they deem acceptable. The number of products that the customer has patience to view can be stochastic and dependent on the products seen. We develop a framework that views the interaction with each customer as an… ▽ More We study generalizations of online bipartite matching in which each arriving vertex (customer) views a ranked list of offline vertices (products) and matches to (purchases) the first one they deem acceptable. The number of products that the customer has patience to view can be stochastic and dependent on the products seen. We develop a framework that views the interaction with each customer as an abstract resource consumption process, and derive new results for these online matching problems under the adversarial, non-stationary, and IID arrival models, assuming we can (approximately) solve the product ranking problem for each single customer. To that end, we show new results for product ranking under two cascade-click models: an optimal algorithm when each item has its own hazard rate for making the customer depart, and a 1/2-approximate algorithm when the customer has a general item-independent patience distribution. We also present a constant-factor 0.027-approximate algorithm in a new model where items are not initially available and arrive over time. We complement these positive results by presenting three additional negative results relating to these problems. △ Less

Submitted 26 June, 2023; v1 submitted 8 July, 2019; originally announced July 2019.

arXiv:1806.10660 [pdf, other]

The Stochastic Score Classification Problem

Authors: Dimitrios Gkenosis, Nathaniel Grammel, Lisa Hellerstein, Devorah Kletenik

Abstract: Consider the following Stochastic Score Classification Problem. A doctor is assessing a patient's risk of develo** a certain disease, and can perform $n$ tests on the patient. Each test has a binary outcome, positive or negative. A positive test result is an indication of risk, and a patient's score is the total number of positive test results. The doctor needs to classify the patient into one o… ▽ More Consider the following Stochastic Score Classification Problem. A doctor is assessing a patient's risk of develo** a certain disease, and can perform $n$ tests on the patient. Each test has a binary outcome, positive or negative. A positive test result is an indication of risk, and a patient's score is the total number of positive test results. The doctor needs to classify the patient into one of $B$ risk classes, depending on the score (e.g., LOW, MEDIUM, and HIGH risk). Each of these classes corresponds to a contiguous range of scores. Test $i$ has probability $p_i$ of being positive, and it costs $c_i$ to perform the test. To reduce costs, instead of performing all tests, the doctor will perform them sequentially and stop testing when it is possible to determine the risk category for the patient. The problem is to determine the order in which the doctor should perform the tests, so as to minimize the expected testing cost. We provide approximation algorithms for adaptive and non-adaptive versions of this problem, and pose a number of open questions. △ Less

Submitted 27 June, 2018; originally announced June 2018.

arXiv:1603.03158 [pdf, other]

Scenario Submodular Cover

Authors: Nathaniel Grammel, Lisa Hellerstein, Devorah Kletenik, Patrick Lin

Abstract: Many problems in Machine Learning can be modeled as submodular optimization problems. Recent work has focused on stochastic or adaptive versions of these problems. We consider the Scenario Submodular Cover problem, which is a counterpart to the Stochastic Submodular Cover problem studied by Golovin and Krause. In Scenario Submodular Cover, the goal is to produce a cover with minimum expected cost,… ▽ More Many problems in Machine Learning can be modeled as submodular optimization problems. Recent work has focused on stochastic or adaptive versions of these problems. We consider the Scenario Submodular Cover problem, which is a counterpart to the Stochastic Submodular Cover problem studied by Golovin and Krause. In Scenario Submodular Cover, the goal is to produce a cover with minimum expected cost, where the expectation is with respect to an empirical joint distribution, given as input by a weighted sample of realizations. In contrast, in Stochastic Submodular Cover, the variables of the input distribution are assumed to be independent, and the distribution of each variable is given as input. Building on algorithms developed by Cicalese et al. and Golovin and Krause for related problems, we give two approximation algorithms for Scenario Submodular Cover over discrete distributions. The first achieves an approximation factor of O(log Qm), where m is the size of the sample and Q is the goal utility. The second, simpler algorithm achieves an approximation bound of O(log QW), where Q is the goal utility and W is the sum of the integer weights. (Both bounds assume an integer-valued utility function.) Our results yield approximation bounds for other problems involving non-independent distributions that are explicitly specified by their support. △ Less

Submitted 10 March, 2016; originally announced March 2016.

Comments: 32 pages, 1 figure

Showing 1–11 of 11 results for author: Grammel, N