-
Data-dependent and Oracle Bounds on Forgetting in Continual Learning
Authors:
Lior Friedman,
Ron Meir
Abstract:
In continual learning, knowledge must be preserved and re-used between tasks, maintaining good transfer to future tasks and minimizing forgetting of previously learned ones. While several practical algorithms have been devised for this setting, there have been few theoretical works aiming to quantify and bound the degree of Forgetting in general settings. We provide both data-dependent and oracle upper bounds that apply regardless of model and algorithm choice, as well as bounds for Gibbs posteriors. We derive an algorithm inspired by our bounds and demonstrate empirically that our approach yields improved forward and backward transfer.
Submitted 13 June, 2024;
originally announced June 2024.
-
Analysis of the Identifying Regulation with Adversarial Surrogates Algorithm
Authors:
Ron Teichner,
Ron Meir,
Michael Margaliot
Abstract:
Given a time-series of noisy measured outputs of a dynamical system z[k], k=1...N, the Identifying Regulation with Adversarial Surrogates (IRAS) algorithm aims to find a non-trivial first integral of the system, namely, a scalar function g() such that g(z[i]) = g(z[j]), for all i,j. IRAS has been suggested recently and was used successfully in several learning tasks in models from biology and physics. Here, we give the first rigorous analysis of this algorithm in a specific setting. We assume that the observations admit a linear first integral and that they are contaminated by Gaussian noise. We show that in this case the IRAS iterations are closely related to the self-consistent-field (SCF) iterations for solving a generalized Rayleigh quotient minimization problem. Using this approach, we derive several sufficient conditions guaranteeing local convergence of IRAS to the correct first integral.
Submitted 5 May, 2024;
originally announced May 2024.
-
Concept-Best-Matching: Evaluating Compositionality in Emergent Communication
Authors:
Boaz Carmeli,
Yonatan Belinkov,
Ron Meir
Abstract:
Artificial agents that learn to communicate in order to accomplish a given task acquire communication protocols that are typically opaque to a human. A large body of work has attempted to evaluate the emergent communication via various evaluation measures, with \emph{compositionality} featuring as a prominent desired trait. However, current evaluation procedures do not directly expose the compositionality of the emergent communication. We propose a procedure to assess the compositionality of emergent communication by finding the best match between emerged words and natural language concepts. The best-match algorithm provides both a global score and a translation map from emergent words to natural language concepts. To the best of our knowledge, this is the first time that such a direct and interpretable mapping between emergent words and human concepts has been provided.
Submitted 17 March, 2024;
originally announced March 2024.
-
Statistical curriculum learning: An elimination algorithm achieving an oracle risk
Authors:
Omer Cohen,
Ron Meir,
Nir Weinberger
Abstract:
We consider a statistical version of curriculum learning (CL) in a parametric prediction setting. The learner is required to estimate a target parameter vector, and can adaptively collect samples from either the target model, or other source models that are similar to the target model, but less noisy. We consider three types of learners, depending on the level of side-information they receive. The first two, referred to as strong/weak-oracle learners, receive high/low degrees of information about the models, and use these to learn. The third, a fully adaptive learner, estimates the target parameter vector without any prior information. In the single source case, we propose an elimination learning method, whose risk matches that of a strong-oracle learner. In the multiple source case, we advocate that the risk of the weak-oracle learner is a realistic benchmark for the risk of adaptive learners. We develop an adaptive multiple elimination-rounds CL algorithm, and characterize instance-dependent conditions for its risk to match that of the weak-oracle learner. We consider instance-dependent minimax lower bounds, and discuss the challenges associated with defining the class of instances for the bound. We derive two minimax lower bounds, and determine the conditions under which the performance of the weak-oracle learner is minimax optimal.
Submitted 20 February, 2024;
originally announced February 2024.
-
Characterization of the Distortion-Perception Tradeoff for Finite Channels with Arbitrary Metrics
Authors:
Dror Freirich,
Nir Weinberger,
Ron Meir
Abstract:
Whenever inspected by humans, reconstructed signals should not be distinguished from real ones. Typically, such a high perceptual quality comes at the price of high reconstruction error, and vice versa. We study this distortion-perception (DP) tradeoff over finite-alphabet channels, for the Wasserstein-$1$ distance induced by a general metric as the perception index, and an arbitrary distortion matrix. Under this setting, we show that computing the DP function and the optimal reconstructions is equivalent to solving a set of linear programming problems. We provide a structural characterization of the DP tradeoff, where the DP function is piecewise linear in the perception index. We further derive a closed-form expression for the case of binary sources.
Submitted 3 February, 2024;
originally announced February 2024.
-
Efficient Online Crowdsourcing with Complex Annotations
Authors:
Reshef Meir,
Viet-An Nguyen,
Xu Chen,
Jagdish Ramakrishnan,
Udi Weinsberg
Abstract:
Crowdsourcing platforms use various truth discovery algorithms to aggregate annotations from multiple labelers. In an online setting, however, the main challenge is to decide whether to ask for more annotations for each item to efficiently trade off cost (i.e., the number of annotations) for quality of the aggregated annotations. In this paper, we propose a novel approach for general complex annotation (such as bounding boxes and taxonomy paths), that works in an online crowdsourcing setting. We prove that the expected average similarity of a labeler is linear in their accuracy \emph{conditional on the reported label}. This enables us to infer reported label accuracy in a broad range of scenarios. We conduct extensive evaluations on real-world crowdsourcing data from Meta and show the effectiveness of our proposed online algorithms in improving the cost-quality trade-off.
Submitted 25 January, 2024;
originally announced January 2024.
-
Meta-Learning Adversarial Bandit Algorithms
Authors:
Mikhail Khodak,
Ilya Osadchiy,
Keegan Harris,
Maria-Florina Balcan,
Kfir Y. Levy,
Ron Meir,
Zhiwei Steven Wu
Abstract:
We study online meta-learning with bandit feedback, with the goal of improving performance across multiple tasks if they are similar according to some natural similarity measure. As the first to target the adversarial online-within-online partial-information setting, we design meta-algorithms that combine outer learners to simultaneously tune the initialization and other hyperparameters of an inner learner for two important cases: multi-armed bandits (MAB) and bandit linear optimization (BLO). For MAB, the meta-learners initialize and set hyperparameters of the Tsallis-entropy generalization of Exp3, with the task-averaged regret improving if the entropy of the optima-in-hindsight is small. For BLO, we learn to initialize and tune online mirror descent (OMD) with self-concordant barrier regularizers, showing that task-averaged regret varies directly with an action space-dependent measure they induce. Our guarantees rely on proving that unregularized follow-the-leader combined with two levels of low-dimensional hyperparameter tuning is enough to learn a sequence of affine functions of non-Lipschitz and sometimes non-convex Bregman divergences bounding the regret of OMD.
Submitted 1 November, 2023; v1 submitted 5 July, 2023;
originally announced July 2023.
-
Perceptual Kalman Filters: Online State Estimation under a Perfect Perceptual-Quality Constraint
Authors:
Dror Freirich,
Tomer Michaeli,
Ron Meir
Abstract:
Many practical settings call for the reconstruction of temporal signals from corrupted or missing data. Classic examples include decoding, tracking, signal enhancement and denoising. Since the reconstructed signals are ultimately viewed by humans, it is desirable to achieve reconstructions that are pleasing to human perception. Mathematically, perfect perceptual-quality is achieved when the distribution of restored signals is the same as that of natural signals, a requirement which has been heavily researched in static estimation settings (i.e. when a whole signal is processed at once). Here, we study the problem of optimal causal filtering under a perfect perceptual-quality constraint, which is a task of fundamentally different nature. Specifically, we analyze a Gaussian Markov signal observed through a linear noisy transformation. In the absence of perceptual constraints, the Kalman filter is known to be optimal in the MSE sense for this setting. Here, we show that adding the perfect perceptual quality constraint (i.e. the requirement of temporal consistency), introduces a fundamental dilemma whereby the filter may have to "knowingly" ignore new information revealed by the observations in order to conform to its past decisions. This often comes at the cost of a significant increase in the MSE (beyond that encountered in static settings). Our analysis goes beyond the classic innovation process of the Kalman filter, and introduces the novel concept of an unutilized information process. Using this tool, we present a recursive formula for perceptual filters, and demonstrate the qualitative effects of perfect perceptual-quality estimation on a video reconstruction problem.
Submitted 4 June, 2023;
originally announced June 2023.
-
Strategic Proxy Voting on the Line
Authors:
Gili Bielous,
Reshef Meir
Abstract:
This paper offers a framework for the study of strategic behavior in proxy voting, where non-active voters delegate their votes to active voters. We further study how proxy voting affects the strategic behavior of non-active voters and proxies (active voters) under complete and partial information. We focus on the median voting rule for single-peaked preferences.
Our results show strategyproofness with respect to non-active voters. Furthermore, while strategyproofness does not extend to proxies, we show that the outcome is bounded and, under mild restrictions, strategic behavior leads to socially optimal outcomes.
We further show that our results extend to partial information settings, and in particular for regret-averse agents.
Submitted 18 May, 2023;
originally announced May 2023.
-
Strategy-proof Budgeting via a VCG-like Mechanism
Authors:
Jonathan Wagner,
Reshef Meir
Abstract:
We present a strategy-proof public goods budgeting mechanism where agents determine both the total volume of expenses and the specific allocation. It is constructed as a modification of VCG to a less typical environment, namely one where we assume neither quasi-linear utilities nor direct revelation. We further show that under plausible assumptions it satisfies strategy-proofness in strictly dominant strategies, and consequently implements the social optimum as a Coalition Proof Nash Equilibrium. A primary (albeit not an exclusive) motivation of our model is Participatory Budgeting (PB), where members of a community collectively decide the spending policy of public tax dollars. While incentive alignment in our mechanism, as in classic VCG, is achieved via individual payments charged to agents, in a PB context that seems unreasonable. Our second main result thus shows that, under further specifications relevant in that context, these payments will vanish in large populations. In the last section we expand the mechanism's definition to a class of mechanisms in which the designer can prioritize certain outcomes she sees as desirable. In particular, we give the example of favoring equitable (egalitarian) allocations.
Submitted 13 March, 2023;
originally announced March 2023.
-
Mitigating Skewed Bidding for Conference Paper Assignment
Authors:
Inbal Rozencweig,
Reshef Meir,
Nick Mattei,
Ofra Amir
Abstract:
The explosion of conference paper submissions in AI and related fields has underscored the need to improve many aspects of the peer review process, especially the matching of papers and reviewers. Recent work argues that the key to improving this matching is to modify aspects of the \emph{bidding phase} itself, to ensure that the set of bids over papers is balanced, and in particular to avoid \emph{orphan papers}, i.e., those papers that receive no bids. In an attempt to understand and mitigate this problem, we have developed a flexible bidding platform to test adaptations to the bidding process. Using this platform, we performed a field experiment during the bidding phase of a medium-size international workshop that compared two bidding methods. We further examined, via controlled experiments on Amazon Mechanical Turk, various factors that affect bidding, in particular the order in which papers are presented \cite{cabanac2013capitalizing,fiez2020super} and information on paper demand \cite{meir2021market}. Our results suggest that several simple adaptations, which can be added to any existing platform, may significantly reduce the skew in bids, thereby improving the allocation for both reviewers and conference organizers.
Submitted 1 March, 2023;
originally announced March 2023.
-
Convergence of Multi-Issue Iterative Voting under Uncertainty
Authors:
Joshua Kavner,
Reshef Meir,
Francesca Rossi,
Lirong Xia
Abstract:
We study the effect of strategic behavior in iterative voting for multiple issues under uncertainty. We introduce a model synthesizing simultaneous multi-issue voting with Meir, Lev, and Rosenschein (2014)'s local dominance theory and determine its convergence properties. After demonstrating that local dominance improvement dynamics may fail to converge, we present two sufficient model refinements that guarantee convergence from any initial vote profile for binary issues: constraining agents to have O-legal preferences and endowing agents with less uncertainty about issues they are modifying than others. Our empirical studies demonstrate that although cycles are common when agents have no uncertainty, introducing uncertainty makes convergence almost guaranteed in practice.
Submitted 20 January, 2023;
originally announced January 2023.
-
Emergent Quantized Communication
Authors:
Boaz Carmeli,
Ron Meir,
Yonatan Belinkov
Abstract:
The field of emergent communication aims to understand the characteristics of communication as it emerges from artificial agents solving tasks that require information exchange. Communication with discrete messages is considered a desired characteristic, for both scientific and applied reasons. However, training a multi-agent system with discrete communication is not straightforward, requiring either reinforcement learning algorithms or relaxing the discreteness requirement via a continuous approximation such as the Gumbel-softmax. Both these solutions result in poor performance compared to fully continuous communication. In this work, we propose an alternative approach to achieve discrete communication -- quantization of communicated messages. Using message quantization allows us to train the model end-to-end, achieving superior performance in multiple setups. Moreover, quantization is a natural framework that runs the gamut from continuous to discrete communication. Thus, it sets the ground for a broader view of multi-agent communication in the deep learning era.
Submitted 19 January, 2023; v1 submitted 4 November, 2022;
originally announced November 2022.
-
Integral Probability Metrics PAC-Bayes Bounds
Authors:
Ron Amit,
Baruch Epstein,
Shay Moran,
Ron Meir
Abstract:
We present a PAC-Bayes-style generalization bound which enables the replacement of the KL-divergence with a variety of Integral Probability Metrics (IPM). We provide instances of this bound with the IPM being the total variation metric and the Wasserstein distance. A notable feature of the obtained bounds is that they naturally interpolate between classical uniform convergence bounds in the worst case (when the prior and posterior are far away from each other), and improved bounds in favorable cases (when the posterior and prior are close). This illustrates the possibility of reinforcing classical generalization bounds with algorithm- and data-dependent components, thus making them more suitable to analyze algorithms that use a large hypothesis space.
Submitted 25 December, 2022; v1 submitted 1 July, 2022;
originally announced July 2022.
-
Empirical Bayes approach to Truth Discovery problems
Authors:
Tsviel Ben Shabat,
Reshef Meir,
David Azriel
Abstract:
When aggregating information from conflicting sources, one's goal is to find the truth. Most real-value \emph{truth discovery} (TD) algorithms try to achieve this goal by estimating the competence of each source and then aggregating the conflicting information by weighing each source's answer proportionally to her competence. However, each of those algorithms requires more than a single source for such estimation and usually does not consider different estimation methods other than a weighted mean. Therefore, in this work we formulate, prove, and empirically test the conditions for an Empirical Bayes Estimator (EBE) to dominate the weighted mean aggregation. Our main result demonstrates that EBE, under mild conditions, can be used as a second step of any TD algorithm in order to reduce the expected error.
Submitted 9 June, 2022;
originally announced June 2022.
-
Online Meta-Learning in Adversarial Multi-Armed Bandits
Authors:
Ilya Osadchiy,
Kfir Y. Levy,
Ron Meir
Abstract:
We study meta-learning for adversarial multi-armed bandits. We consider the online-within-online setup, in which a player (learner) encounters a sequence of multi-armed bandit episodes. The player's performance is measured as regret against the best arm in each episode, according to the losses generated by an adversary. The difficulty of the problem depends on the empirical distribution of the per-episode best arm chosen by the adversary. We present an algorithm that can leverage the non-uniformity in this empirical distribution, and derive problem-dependent regret bounds. This solution comprises an inner learner that plays each episode separately, and an outer learner that updates the hyper-parameters of the inner algorithm between the episodes. In the case where the best arm distribution is far from uniform, it improves upon the best bound that can be achieved by any online algorithm executed on each episode individually without meta-learning.
Submitted 12 July, 2022; v1 submitted 31 May, 2022;
originally announced May 2022.
-
Welfare vs. Representation in Participatory Budgeting
Authors:
Roy Fairstein,
Reshef Meir,
Dan Vilenchik,
Kobi Gal
Abstract:
Participatory budgeting (PB) is a democratic process for allocating funds to projects based on the votes of members of the community. Different rules have been used to aggregate participants' votes. Past research by Lackner and Skowron (2020) has studied the trade-off between notions of social welfare and fairness in the multi-winner setting (a special case of participatory budgeting with identical project costs). But there is little understanding of this trade-off in the more general PB setting. This paper provides a theoretical and empirical study of the worst-case guarantees of several common rules to better understand the trade-off between social welfare and representation. We show that many of the guarantees from the multi-winner setting do not generalize to the PB setting, and that the introduction of costs leads to substantially worse guarantees, thereby exacerbating the welfare-representation trade-off. We extend our theoretical analysis to study how the requirement of proportionality over voting rules affects this trade-off. We further study how the requirement of proportionality over voting rules affects the guarantees on social welfare and representation. We study the latter point also empirically, both on real and synthetic datasets. We show that variants of the recently suggested voting rule Rule-X (which satisfies proportionality) do very well in practice both with respect to social welfare and representation.
Submitted 25 May, 2022; v1 submitted 19 January, 2022;
originally announced January 2022.
-
Metalearning Linear Bandits by Prior Update
Authors:
Amit Peleg,
Naama Pearl,
Ron Meir
Abstract:
Fully Bayesian approaches to sequential decision-making assume that problem parameters are generated from a known prior. In practice, such information is often lacking. This problem is exacerbated in setups with partial information, where a misspecified prior may lead to poor exploration and performance. In this work we prove, in the context of stochastic linear bandits and Gaussian priors, that as long as the prior is sufficiently close to the true prior, the performance of the applied algorithm is close to that of the algorithm that uses the true prior. Furthermore, we address the task of learning the prior through metalearning, where a learner updates her estimate of the prior across multiple task instances in order to improve performance on future tasks. We provide an algorithm and regret bounds, demonstrate its effectiveness in comparison to an algorithm that knows the correct prior, and support our theoretical results empirically. Our theoretical results hold for a broad class of algorithms, including Thompson Sampling and Information Directed Sampling.
Submitted 2 March, 2022; v1 submitted 12 July, 2021;
originally announced July 2021.
-
A Theory of the Distortion-Perception Tradeoff in Wasserstein Space
Authors:
Dror Freirich,
Tomer Michaeli,
Ron Meir
Abstract:
The lower the distortion of an estimator, the more the distribution of its outputs generally deviates from the distribution of the signals it attempts to estimate. This phenomenon, known as the perception-distortion tradeoff, has captured significant attention in image restoration, where it implies that fidelity to ground truth images comes at the expense of perceptual quality (deviation from statistics of natural images). However, despite the increasing popularity of performing comparisons on the perception-distortion plane, there remains an important open question: what is the minimal distortion that can be achieved under a given perception constraint? In this paper, we derive a closed form expression for this distortion-perception (DP) function for the mean squared-error (MSE) distortion and the Wasserstein-2 perception index. We prove that the DP function is always quadratic, regardless of the underlying distribution. This stems from the fact that estimators on the DP curve form a geodesic in Wasserstein space. In the Gaussian setting, we further provide a closed form expression for such estimators. For general distributions, we show how these estimators can be constructed from the estimators at the two extremes of the tradeoff: The global MSE minimizer, and a minimizer of the MSE under a perfect perceptual quality constraint. The latter can be obtained as a stochastic transformation of the former.
Submitted 6 July, 2021;
originally announced July 2021.
-
Proportional Participatory Budgeting with Substitute Projects
Authors:
Roy Fairstein,
Reshef Meir,
Kobi Gal
Abstract:
Participatory budgeting is a democratic process for allocating funds to projects based on the votes of members of the community. However, most input methods of voters' preferences prevent the voters from expressing complex relationships among projects, leading to outcomes that do not reflect their preferences well enough. In this paper, we propose an input method that begins to address this challenge, by allowing participants to express substitutes over projects. Then, we extend a known aggregation mechanism from the literature (Rule X) to handle substitute projects. We prove that our extended rule preserves proportionality under natural conditions, and show empirically that it obtains substantially more welfare than the original mechanism on instances with substitutes.
Submitted 9 June, 2021;
originally announced June 2021.
-
The Core of Approval Participatory Budgeting with Uniform Costs (or with up to Four Projects) is Non-Empty
Authors:
Reshef Meir
Abstract:
In the Approval Participatory Budgeting problem, an agent prefers a set of projects $W'$ over $W$ if she approves strictly more projects in $W'$. A set of projects $W$ is in the core if there is no other set of projects $W'$ and set of agents $K$ that both prefer $W'$ over $W$ and can fund $W'$. It is an open problem whether the core can be empty, even when project costs are uniform. The latter case is known as the multiwinner voting core.
We show that in any instance with uniform costs or with at most four projects (and any number of agents), the core is nonempty.
Submitted 6 December, 2022; v1 submitted 11 April, 2021;
originally announced April 2021.
-
Ensemble Bootstrapping for Q-Learning
Authors:
Oren Peer,
Chen Tessler,
Nadav Merlis,
Ron Meir
Abstract:
Q-learning (QL), a common reinforcement learning algorithm, suffers from over-estimation bias due to the maximization term in the optimal Bellman operator. This bias may lead to sub-optimal behavior. Double-Q-learning tackles this issue by utilizing two estimators, yet results in an under-estimation bias. Similar to over-estimation in Q-learning, in certain scenarios, the under-estimation bias may degrade performance. In this work, we introduce a new bias-reduced algorithm called Ensemble Bootstrapped Q-Learning (EBQL), a natural extension of Double-Q-learning to ensembles. We analyze our method both theoretically and empirically. Theoretically, we prove that EBQL-like updates yield lower MSE when estimating the maximal mean of a set of independent random variables. Empirically, we show that there exist domains where both over- and under-estimation result in sub-optimal performance. Finally, we demonstrate the superior performance of a deep RL variant of EBQL over other deep QL algorithms for a suite of ATARI games.
Submitted 20 April, 2021; v1 submitted 28 February, 2021;
originally announced March 2021.
-
Strategyproof Facility Location Mechanisms on Discrete Trees
Authors:
Alina Filimonov,
Reshef Meir
Abstract:
We address the problem of strategyproof (SP) facility location mechanisms on discrete trees. Our main result is a full characterization of onto and SP mechanisms. In particular, we prove that when a single agent significantly affects the outcome, the trajectory of the facility is almost contained in the trajectory of the agent, and both move in the same direction along the common edges. We show tight relations of our characterization to previous results on discrete lines and on continuous trees. We then derive further implications of the main result for infinite discrete lines.
Submitted 4 February, 2021;
originally announced February 2021.
-
Discount Factor as a Regularizer in Reinforcement Learning
Authors:
Ron Amit,
Ron Meir,
Kamil Ciosek
Abstract:
Specifying a Reinforcement Learning (RL) task involves choosing a suitable planning horizon, which is typically modeled by a discount factor. It is known that applying RL algorithms with a lower discount factor can act as a regularizer, improving performance in the limited data regime. Yet the exact nature of this regularizer has not been investigated. In this work, we fill in this gap. For several Temporal-Difference (TD) learning methods, we show an explicit equivalence between using a reduced discount factor and adding an explicit regularization term to the algorithm's loss. Motivated by the equivalence, we empirically study this technique compared to standard $L_2$ regularization by extensive experiments in discrete and continuous domains, using tabular and functional representations. Our experiments suggest the regularization effectiveness is strongly related to properties of the available data, such as size, distribution, and mixing rate.
Submitted 4 July, 2020;
originally announced July 2020.
-
Representative Committees of Peers
Authors:
Reshef Meir,
Fedor Sandomirskiy,
Moshe Tennenholtz
Abstract:
A population of voters must elect representatives among themselves to decide on a sequence of possibly unforeseen binary issues. Voters care only about the final decision, not the elected representatives. The disutility of a voter is proportional to the fraction of issues, where his preferences disagree with the decision.
While an issue-by-issue vote by all voters would maximize social welfare, we are interested in how well the preferences of the population can be approximated by a small committee.
We show that a k-sortition (a random committee of k voters with the majority vote within the committee) leads to an outcome within the factor 1+O(1/k) of the optimal social cost for any number of voters n, any number of issues $m$, and any preference profile.
For a small number of issues m, the social cost can be made even closer to optimal by delegation procedures that weigh committee members according to their number of followers. However, for large m, we demonstrate that the k-sortition is the worst-case optimal rule within a broad family of committee-based rules that take into account metric information about the preference profile of the whole population.
Submitted 14 June, 2020;
originally announced June 2020.
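The k-sortition rule described in this abstract (a random committee of k voters deciding each binary issue by majority) can be sketched in a few lines. This is only an illustrative sketch, not the authors' code: the function names, the 0/1 encoding of preferences, and the tie-breaking rule for even k are assumptions made here.

```python
import random

def k_sortition(profile, k, rng=random):
    """Draw k voters uniformly at random and decide each issue by committee majority.

    profile: list of voters, each a list of 0/1 preferences over the same m issues.
    Returns (committee_indices, decisions).
    Assumption: for even k, a tied issue is decided as 0.
    """
    committee = rng.sample(range(len(profile)), k)
    m = len(profile[0])
    decisions = []
    for j in range(m):
        yes_votes = sum(profile[i][j] for i in committee)
        decisions.append(1 if 2 * yes_votes > k else 0)
    return committee, decisions

def social_cost(profile, decisions):
    """Sum over voters of the fraction of issues on which the voter disagrees with the decision."""
    m = len(decisions)
    return sum(
        sum(voter[j] != decisions[j] for j in range(m)) / m
        for voter in profile
    )
```

With k equal to the number of voters, the committee is the whole population and the rule reduces to the issue-by-issue majority vote that the abstract uses as the welfare-maximizing benchmark.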
-
Cumulative Games: Who is the current player?
Authors:
Urban Larsson,
Reshef Meir,
Yair Zick
Abstract:
Combinatorial Game Theory (CGT) is a branch of game theory that has developed almost independently from Economic Game Theory (EGT), and is concerned with deep mathematical properties of 2-player 0-sum games that are defined over various combinatorial structures. The aim of this work is to lay foundations to bridging the conceptual and technical gaps between CGT and EGT, here interpreted as so-called Extensive Form Games, so they can be treated within a unified framework. More specifically, we introduce a class of $n$-player, general-sum games, called Cumulative Games, that can be analyzed by both CGT and EGT tools. We show how two of the most fundamental definitions of CGT---the outcome function, and the disjunctive sum operator---naturally extend to the class of Cumulative Games. The outcome function allows for an efficient equilibrium computation under certain restrictions, and the disjunctive sum operator lets us define a partial order over games, according to the advantage that a certain player has. Finally, we show that any Extensive Form Game can be written as a Cumulative Game.
Submitted 13 May, 2020;
originally announced May 2020.
-
Option Discovery in the Absence of Rewards with Manifold Analysis
Authors:
Amitay Bar,
Ronen Talmon,
Ron Meir
Abstract:
Options have been shown to be an effective tool in reinforcement learning, facilitating improved exploration and learning. In this paper, we present an approach based on spectral graph theory and derive an algorithm that systematically discovers options without access to a specific reward or task assignment. As opposed to the common practice used in previous methods, our algorithm makes full use of the spectrum of the graph Laplacian. Incorporating modes associated with higher graph frequencies unravels domain subtleties, which are shown to be useful for option discovery. Using geometric and manifold-based analysis, we present a theoretical justification for the algorithm. In addition, we showcase its performance in several domains, demonstrating clear improvements compared to competing methods.
Submitted 19 August, 2020; v1 submitted 12 March, 2020;
originally announced March 2020.
-
Distance-based Equilibria in Normal-Form Games
Authors:
Erman Acar,
Reshef Meir
Abstract:
We propose a simple uncertainty modification for the agent model in normal-form games; at any given strategy profile, the agent can access only a set of "possible profiles" that are within a certain distance from the actual action profile. We investigate the various instantiations in which the agent chooses her strategy using well-known rationales e.g., considering the worst case, or trying to minimize the regret, to cope with such uncertainty. Any such modification in the behavioral model naturally induces a corresponding notion of equilibrium; a distance-based equilibrium. We characterize the relationships between the various equilibria, and also their connections to well-known existing solution concepts such as Trembling-hand perfection. Furthermore, we deliver existence results, and show that for some class of games, such solution concepts can actually lead to better outcomes.
Submitted 8 February, 2020;
originally announced February 2020.
-
Safe Voting: Resilience to Abstention and Sybils
Authors:
Reshef Meir,
Gal Shahaf,
Ehud Shapiro,
Nimrod Talmon
Abstract:
Voting rules may implement the will of the society when all eligible voters vote, and only them. However, they may fail to do so when sybil (fake or duplicate) votes are present and when only some honest (non sybil) voters actively participate. As, unfortunately, sometimes this is the case, our aim here is to address social choice in the presence of sybils and voter abstention. To do so we build upon the framework of Reality-aware Social Choice: we assume the status-quo as an ever-present distinguished alternative, and study Status-Quo Enforcing voting rules, which add virtual votes in support of the status-quo. We characterize the tradeoff between safety and liveness (the ability of active honest voters to maintain/change the status-quo, respectively) in several domains, and show that the Status-Quo Enforcing voting rules are often optimal. We comment on the applicability of our methods and analyses to the governance of digital communities.
Submitted 7 April, 2024; v1 submitted 15 January, 2020;
originally announced January 2020.
-
Bidding in Spades
Authors:
Gal Cohensius,
Reshef Meir,
Nadav Oved,
Roni Stern
Abstract:
We present a Spades bidding algorithm that is superior to recreational human players and to publicly available bots. As in Bridge, the game of Spades is composed of two independent phases, bidding and playing. This paper focuses on the bidding algorithm, since this phase holds a precise challenge: based on the input, choose the bid that maximizes the agent's winning probability. Our Bidding-in-Spades (BIS) algorithm heuristically determines the bidding strategy by comparing the expected utility of each possible bid. A major challenge is how to estimate these expected utilities. To this end, we propose a set of domain-specific heuristics, and then correct them via machine learning using data from real-world players. The BIS algorithm we present can be attached to any playing algorithm. It beats rule-based bidding bots when all use the same playing component. When combined with a rule-based playing algorithm, it is superior to the average recreational human.
Submitted 10 February, 2020; v1 submitted 24 December, 2019;
originally announced December 2019.
-
Modeling People's Voting Behavior with Poll Information
Authors:
Roy Fairstein,
Adam Lauz,
Kobi Gal,
Reshef Meir
Abstract:
Despite the prevalence of voting systems in the real world, there is no consensus among researchers on how people vote strategically, even in simple voting settings. This paper addresses this gap by comparing different approaches that have been used to model strategic voting, including expected utility maximization, heuristic decision making, and bounded rationality models. The models are applied to data collected from hundreds of people in controlled voting experiments, where people vote after observing non-binding poll information. We introduce a new voting model, the Attainability-Utility (AU) heuristic, which weighs the popularity of a candidate according to the poll, with the utility of the candidate to the voter. We argue that the AU model is cognitively plausible, and show that it is able to predict people's voting behavior significantly better than other models from the literature. It was almost on par with (and sometimes better than) a machine learning algorithm that uses substantially more information. Our results provide new insights into the strategic considerations of voters, that undermine the prevalent assumptions of much theoretical work in social choice.
Submitted 23 September, 2019;
originally announced September 2019.
-
Penalty Bidding Mechanisms for Allocating Resources and Overcoming Present Bias
Authors:
Hongyao Ma,
Reshef Meir,
David C. Parkes,
Elena Wu-Yan
Abstract:
From skipped exercise classes to last-minute cancellation of dentist appointments, underutilization of reserved resources abounds. Likely reasons include uncertainty about the future, further exacerbated by present bias. In this paper, we unite resource allocation and commitment devices through the design of contingent payment mechanisms, and propose the two-bid penalty-bidding mechanism. This extends an earlier mechanism proposed by Ma et al. (2019), assigning the resources based on willingness to accept a no-show penalty, while also allowing each participant to increase her own penalty in order to counter present bias. We establish a simple dominant strategy equilibrium, regardless of an agent's level of present bias or degree of "sophistication". Via simulations, we show that the proposed mechanism substantially improves utilization and achieves higher welfare and better equity in comparison with mechanisms used in practice and mechanisms that optimize welfare in the absence of present bias.
Submitted 8 May, 2020; v1 submitted 24 June, 2019;
originally announced June 2019.
-
PAC Guarantees for Cooperative Multi-Agent Reinforcement Learning with Restricted Communication
Authors:
Or Raveh,
Ron Meir
Abstract:
We develop model-free PAC performance guarantees for multiple concurrent MDPs, extending recent works where a single learner interacts with multiple non-interacting agents in a noise-free environment. Our framework allows noisy and resource-limited communication between agents, and develops novel PAC guarantees in this extended setting. By allowing communication between the agents themselves, we suggest improved PAC-exploration algorithms that can overcome the communication noise and lead to improved sample complexity bounds. We provide a theoretically motivated algorithm that optimally combines information from the resource-limited agents, thereby analyzing the interaction between noise and communication constraints that are ubiquitous in real-world systems. We present empirical results for a simple task that support our theoretical formulations and improve upon naive information fusion methods.
Submitted 10 October, 2019; v1 submitted 23 May, 2019;
originally announced May 2019.
-
Frustratingly Easy Truth Discovery
Authors:
Reshef Meir,
Ofra Amir,
Omer Ben-Porat,
Tsviel Ben-Shabat,
Gal Cohensius,
Lirong Xia
Abstract:
Truth discovery is a general name for a broad range of statistical methods aimed at extracting the correct answers to questions, based on multiple answers coming from noisy sources, for example, workers in a crowdsourcing platform. In this paper, we consider an extremely simple heuristic for estimating workers' competence using average proximity to other workers. We prove that this estimates the actual competence level well and enables separating high and low quality workers in a wide spectrum of domains and statistical models. Under Gaussian noise, this simple estimate is the unique solution to the MLE with a constant regularization factor.
Finally, weighing workers according to their average proximity in a crowdsourcing setting results in substantial improvement over unweighted aggregation and other truth discovery algorithms in practice.
Submitted 2 December, 2022; v1 submitted 2 May, 2019;
originally announced May 2019.
-
Strategyproof Facility Location for Three Agents on a Circle
Authors:
Reshef Meir
Abstract:
We consider the facility location problem in a metric space, focusing on the case of three agents. We show that selecting the reported location of each agent with probability proportional to the distance between the other two agents results in a mechanism that is strategyproof in expectation, and dominates the random dictator mechanism in terms of utilitarian social welfare. We further improve the upper bound for three agents on a circle to 7/6 (whereas random dictator obtains 4/3); and provide the first lower bounds for randomized strategyproof facility location in any metric space, using linear programming.
Submitted 7 July, 2019; v1 submitted 21 February, 2019;
originally announced February 2019.
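The three-agent mechanism stated in this abstract (pick each agent's reported location with probability proportional to the distance between the other two agents) is simple enough to sketch directly. The code below is an illustration under assumptions made here, not the paper's implementation: the function names, the unit-circumference circle metric, and the uniform fallback when all three reports coincide are all hypothetical choices.

```python
import random

def circle_dist(x, y, circumference=1.0):
    """Shortest arc length between two points on a circle (points given as arc positions)."""
    d = abs(x - y) % circumference
    return min(d, circumference - d)

def selection_probabilities(a, b, c, dist=circle_dist):
    """Probability of selecting each of the three reports.

    Each report is weighted by the distance between the OTHER two reports.
    Assumption: if all three reports coincide, fall back to uniform probabilities.
    """
    weights = [dist(b, c), dist(a, c), dist(a, b)]
    total = sum(weights)
    if total == 0:
        return [1 / 3, 1 / 3, 1 / 3]
    return [w / total for w in weights]

def choose_facility(reports, dist=circle_dist, rng=random):
    """Sample the facility location among the three reported locations."""
    probs = selection_probabilities(*reports, dist=dist)
    return rng.choices(list(reports), weights=probs, k=1)[0]
```

For reports at 0.0, 0.25 and 0.5 on a unit circle, the middle agent is the one whose two neighbours are farthest apart, so it receives the largest selection probability, whereas a random dictator would pick each report with probability 1/3.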
-
Generalization Bounds For Unsupervised and Semi-Supervised Learning With Autoencoders
Authors:
Baruch Epstein,
Ron Meir
Abstract:
Autoencoders are widely used for unsupervised learning and as a regularization scheme in semi-supervised learning. However, theoretical understanding of their generalization properties and of the manner in which they can assist supervised learning has been lacking. We utilize recent advances in the theory of deep learning generalization, together with a novel reconstruction loss, to provide generalization bounds for autoencoders. To the best of our knowledge, this is the first such bound. We further show that, under appropriate assumptions, an autoencoder with good generalization properties can improve any semi-supervised learning scheme. We support our theoretical results with empirical demonstrations.
Submitted 4 February, 2019;
originally announced February 2019.
-
Heuristic Voting as Ordinal Dominance Strategies
Authors:
Omer Lev,
Reshef Meir,
Svetlana Obraztsova,
Maria Polukarov
Abstract:
Decision making under uncertainty is a key component of many AI settings, and in particular of voting scenarios where strategic agents are trying to reach a joint decision. The common approach to handle uncertainty is by maximizing expected utility, which requires a cardinal utility function as well as detailed probabilistic information. However, often such probabilities are not easy to estimate or apply.
To this end, we present a framework that allows "shades of gray" of likelihood without probabilities. Specifically, we create a hierarchy of sets of world states based on a prospective poll, with inner sets containing more likely outcomes. This hierarchy of likelihoods allows us to define what we term ordinally-dominated strategies. We use this approach to justify various known voting heuristics as bounded-rational strategies.
Submitted 13 November, 2018;
originally announced November 2018.
-
Distributional Multivariate Policy Evaluation and Exploration with the Bellman GAN
Authors:
Dror Freirich,
Ron Meir,
Aviv Tamar
Abstract:
The recently proposed distributional approach to reinforcement learning (DiRL) is centered on learning the distribution of the reward-to-go, often referred to as the value distribution. In this work, we show that the distributional Bellman equation, which drives DiRL methods, is equivalent to a generative adversarial network (GAN) model. In this formulation, DiRL can be seen as learning a deep generative model of the value distribution, driven by the discrepancy between the distribution of the current value, and the distribution of the sum of current reward and next value. We use this insight to propose a GAN-based approach to DiRL, which leverages the strengths of GANs in learning distributions of high-dimensional data. In particular, we show that our GAN approach can be used for DiRL with multivariate rewards, an important setting which cannot be tackled with prior methods. The multivariate setting also allows us to unify learning the distribution of values and state transitions, and we exploit this idea to devise a novel exploration method that is driven by the discrepancy in estimating both values and states.
Submitted 6 August, 2018;
originally announced August 2018.
-
Efficient Crowdsourcing via Proxy Voting
Authors:
Gal Cohensius,
Omer Ben Porat,
Reshef Meir,
Ofra Amir
Abstract:
Crowdsourcing platforms offer a way to label data by aggregating answers of multiple unqualified workers. We introduce a simple and budget efficient crowdsourcing method named Proxy Crowdsourcing (PCS). PCS collects answers from two sets of workers: leaders (a.k.a. proxies) and followers. Each leader completely answers the survey while each follower answers only a small subset of it. We then weigh every leader according to the number of followers to which his answers are closest, and aggregate the answers of the leaders using any standard aggregation method (e.g., Plurality for categorical labels or Mean for continuous labels). We compare empirically the performance of PCS to unweighted aggregation, keeping the total number of questions (the budget) fixed. We show that PCS improves the accuracy of aggregated answers across several datasets, both with categorical and continuous labels. Overall, our suggested method improves accuracy while being simple and easy to implement.
Submitted 16 June, 2018;
originally announced June 2018.
-
Cumulative subtraction games
Authors:
Gal Cohensius,
Urban Larsson,
Reshef Meir,
David Wahlstedt
Abstract:
We study zero-sum games, a variant of the classical combinatorial Subtraction games (studied for example in the monumental work "Winning Ways", by Berlekamp, Conway and Guy), called Cumulative Subtraction (CS). Two players alternate in moving, and get points for taking pebbles out of a joint pile. We prove that the outcome in optimal play (game value) of a CS with a finite number of possible actions is eventually periodic, with period $2s$, where $s$ is the size of the largest available action. This settles a conjecture by Stewart in his Ph.D. thesis (2011). Specifically, we find a quadratic bound, in the size of $s$, on when the outcome function must have become periodic. In case of two possible actions, we give an explicit description of optimal play. We generalize the periodicity result to games with a so-called reward function, where at each stage of the game, the change of 'score' does not necessarily equal the number of pebbles you collect.
Submitted 12 February, 2020; v1 submitted 23 May, 2018;
originally announced May 2018.
-
Predicting Strategic Voting Behavior with Poll Information
Authors:
Roy Fairstein,
Adam Lauz,
Kobi Gal,
Reshef Meir
Abstract:
The question of how people vote strategically under uncertainty has attracted much attention in several disciplines. Theoretical decision models have been proposed which vary in their assumptions on the sophistication of the voters and on the information made available to them about others' preferences and their voting behavior. This work focuses on modeling strategic voting behavior under poll information. It proposes a new heuristic for voting behavior that weighs the success of each candidate according to the poll score with the utility of the candidate given the voters' preferences. The model weights can be tuned individually for each voter. We compared this model with other relevant voting models from the literature on data obtained from a recently released large scale study. We show that the new heuristic outperforms all other tested models. The prediction errors of the model can be partly explained by inconsistent voters who vote for (weakly) dominated candidates.
Submitted 19 May, 2018;
originally announced May 2018.
-
Social Choice with Non Quasi-linear Utilities
Authors:
Hongyao Ma,
Reshef Meir,
David C. Parkes
Abstract:
Without monetary payments, the Gibbard-Satterthwaite theorem proves that under mild requirements all truthful social choice mechanisms must be dictatorships. When payments are allowed, the Vickrey-Clarke-Groves (VCG) mechanism implements the value-maximizing choice, and has many other good properties: it is strategy-proof, onto, deterministic, individually rational, and does not make positive transfers to the agents. By Roberts' theorem, with three or more alternatives, the weighted VCG mechanisms are essentially unique for domains with quasi-linear utilities. The goal of this paper is to characterize domains of non-quasi-linear utilities where "reasonable" mechanisms (with VCG-like properties) exist. Our main result is a tight characterization of the maximal non quasi-linear utility domain, which we call the largest parallel domain. We extend Roberts' theorem to parallel domains, and use the generalized theorem to prove two impossibility results. First, any reasonable mechanism must be dictatorial when the utility domain is quasi-linear together with any single non-parallel type. Second, for richer utility domains that still differ very slightly from quasi-linearity, every strategy-proof, onto and deterministic mechanism must be a dictatorship.
Submitted 25 April, 2018; v1 submitted 6 April, 2018;
originally announced April 2018.
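For reference, the standard quasi-linear VCG mechanism that the abstract takes as its starting point can be sketched as follows. This is an illustrative sketch using the Clarke pivot payment rule, not the paper's generalized mechanism for parallel domains; the function name `vcg` and the input layout `values[i][a]` are assumptions made here.

```python
def vcg(values):
    """values[i][a] is agent i's reported value for alternative a."""
    n_alternatives = len(values[0])

    def welfare(a, skip=None):
        # Total reported value of alternative a, optionally leaving one agent out.
        return sum(v[a] for i, v in enumerate(values) if i != skip)

    # Value-maximizing choice (ties go to the lowest-indexed alternative).
    chosen = max(range(n_alternatives), key=welfare)

    payments = []
    for i in range(len(values)):
        # Clarke pivot: agent i pays the welfare loss its presence imposes on others.
        best_without_i = max(welfare(a, skip=i) for a in range(n_alternatives))
        payments.append(best_without_i - welfare(chosen, skip=i))
    return chosen, payments


print(vcg([[5, 0], [0, 3], [0, 1]]))
```

In this example alternative 0 is chosen and only the first agent, who is pivotal, pays (4 units). Payments are never negative, which matches the "no positive transfers" property listed in the abstract.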
-
Directed Graph Minors and Serial-Parallel Width
Authors:
Argyrios Deligkas,
Reshef Meir
Abstract:
Graph minors are a primary tool in understanding the structure of undirected graphs, with many conceptual and algorithmic implications. We propose new variants of \emph{directed graph minors} and \emph{directed graph embeddings}, by modifying familiar definitions. For the class of 2-terminal directed acyclic graphs (TDAGs) our two definitions coincide, and the class is closed under both operations. The usefulness of our directed minor operations is demonstrated by characterizing all TDAGs with serial-parallel width at most $k$, a class of networks known to guarantee bounded negative externality in nonatomic routing games. Our characterization implies that a TDAG has serial-parallel width of $1$ if and only if it is a directed series-parallel graph. We also study the computational complexity of finding a directed minor and computing the serial-parallel width.
Submitted 29 May, 2019; v1 submitted 6 November, 2017;
originally announced November 2017.
-
Meta-Learning by Adjusting Priors Based on Extended PAC-Bayes Theory
Authors:
Ron Amit,
Ron Meir
Abstract:
In meta-learning an agent extracts knowledge from observed tasks, aiming to facilitate learning of novel future tasks. Under the assumption that future tasks are 'related' to previous tasks, the accumulated knowledge should be learned in a way which captures the common structure across learned tasks, while allowing the learner sufficient flexibility to adapt to novel aspects of new tasks. We present a framework for meta-learning that is based on generalization error bounds, allowing us to extend various PAC-Bayes bounds to meta-learning. Learning takes place through the construction of a distribution over hypotheses based on the observed tasks, and its utilization for learning a new task. Thus, prior knowledge is incorporated through setting an experience-dependent prior for novel tasks. We develop a gradient-based algorithm which minimizes an objective function derived from the bounds and demonstrate its effectiveness numerically with deep neural networks. In addition to establishing the improved performance available through meta-learning, we demonstrate the intuitive way by which prior information is manifested at different levels of the network.
Submitted 20 May, 2019; v1 submitted 3 November, 2017;
originally announced November 2017.
-
Joint auto-encoders: a flexible multi-task learning framework
Authors:
Baruch Epstein,
Ron Meir,
Tomer Michaeli
Abstract:
The incorporation of prior knowledge into learning is essential in achieving good performance based on small noisy samples. Such knowledge is often incorporated through the availability of related data arising from domains and tasks similar to the one of current interest. Ideally one would like to allow both the data for the current task and for previous related tasks to self-organize the learning system in such a way that commonalities and differences between the tasks are learned in a data-driven fashion. We develop a framework for learning multiple tasks simultaneously, based on sharing features that are common to all tasks, achieved through the use of a modular deep feedforward neural network consisting of shared branches, dealing with the common features of all tasks, and private branches, learning the specific unique aspects of each task. Once an appropriate weight sharing architecture has been established, learning takes place through standard algorithms for feedforward networks, e.g., stochastic gradient descent and its variations. The method deals with domain adaptation and multi-task learning in a unified fashion, and can easily deal with data arising from different types of sources. Numerical experiments demonstrate the effectiveness of learning in domain adaptation and transfer learning setups, and provide evidence for the flexible and task-oriented representations arising in the network.
Submitted 30 May, 2017;
originally announced May 2017.
-
Contract Design for Energy Demand Response
Authors:
Reshef Meir,
Hongyao Ma,
Valentin Robu
Abstract:
Power companies such as Southern California Edison (SCE) use Demand Response (DR) contracts to incentivize consumers to reduce their power consumption during periods when the demand forecast exceeds supply. Current mechanisms in use offer contracts to consumers independently of one another, do not take into consideration consumers' heterogeneity in consumption profile or reliability, and fail to achieve high participation.
We introduce DR-VCG, a new DR mechanism that offers a flexible set of contracts (which may include the standard SCE contracts) and uses VCG pricing. We prove that DR-VCG elicits truthful bids, incentivizes honest preparation efforts, and enables efficient computation of allocation and prices. With simple fixed-penalty contracts, the optimization goal of the mechanism is an upper bound on the probability that the reduction target is missed. Extensive simulations show that, compared to the current mechanism deployed by SCE, the DR-VCG mechanism achieves higher participation, increased reliability, and significantly reduced total expenses.
Submitted 20 May, 2017;
originally announced May 2017.
-
Learning an attention model in an artificial visual system
Authors:
Alon Hazan,
Yuval Harel,
Ron Meir
Abstract:
Human visual perception of the world is of a large, fixed image that is highly detailed and sharp. However, receptor density in the retina is not uniform: a small central region called the fovea is very dense and exhibits high resolution, whereas a peripheral region around it has much lower spatial resolution. Thus, contrary to our perception, we are only able to observe a very small region around the line of sight with high resolution. The perception of a complete and stable view is aided by an attention mechanism that directs the eyes to the numerous points of interest within the scene. The eyes move between these targets in quick, unconscious movements, known as "saccades". Once a target is centered at the fovea, the eyes fixate for a fraction of a second while the visual system extracts the necessary information. An artificial visual system was built based on a fully recurrent neural network set within a reinforcement learning protocol, and learned to attend to regions of interest while solving a classification task. The model is consistent with several experimentally observed phenomena, and suggests novel predictions.
Submitted 24 January, 2017;
originally announced January 2017.
-
Proxy Voting for Better Outcomes
Authors:
Gal Cohensius,
Shie Mannor,
Reshef Meir,
Eli Meirom,
Ariel Orda
Abstract:
We consider a social choice problem where only a small number of people out of a large population are sufficiently available or motivated to vote. A common solution to increase participation is to allow voters to use a proxy, that is, transfer their voting rights to another voter. Considering social choice problems on metric spaces, we compare voting with and without the use of proxies to see which mechanism better approximates the optimal outcome, and characterize the regimes in which proxy voting is beneficial. When voters' opinions are located on an interval, both the median mechanism and the mean mechanism are substantially improved by proxy voting. When voters vote on many binary issues, proxy voting is better when the sample of active voters is too small to provide a good outcome. Our theoretical results extend to situations where available voters choose strategically whether to participate. We support our theoretical findings with empirical results showing substantial benefits of proxy voting on simulated and real preference data.
Submitted 24 November, 2016;
originally announced November 2016.
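The interval case with the median mechanism can be illustrated with a short Python sketch. It rests on assumptions not stated in the abstract: every inactive voter delegates to the nearest active voter, and the outcome is the (lower) weighted median of the active voters' positions. The function name `proxy_median` is hypothetical.

```python
def proxy_median(active, inactive):
    """Weighted median of active positions after nearest-proxy delegation."""
    weights = [1] * len(active)  # each active voter carries their own vote
    for x in inactive:
        # Assumed delegation rule: the closest active voter becomes the proxy
        # (ties go to the active voter listed first).
        nearest = min(range(len(active)), key=lambda i: abs(active[i] - x))
        weights[nearest] += 1

    # Walk the active voters from left to right until half the weight is covered.
    order = sorted(range(len(active)), key=lambda i: active[i])
    half = sum(weights) / 2
    running = 0
    for i in order:
        running += weights[i]
        if running >= half:
            return active[i]


active = [0.1, 0.9]
inactive = [0.6, 0.7, 0.8]
print(proxy_median(active, []))        # median of the active voters alone
print(proxy_median(active, inactive))  # median after delegation
```

Here the median of the whole population is 0.7. Without proxies the sketch returns 0.1, and with proxies it returns 0.9, which is closer to the population median.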
-
Random Tie-breaking with Stochastic Dominance
Authors:
Reshef Meir
Abstract:
Consider Plurality with random tie-breaking. This paper uses standard axiomatic extensions of preferences over elements to preferences over sets (Kelly, Gardenfors, Responsiveness) to characterize all better-replies of a voter under stochastic dominance.
Submitted 6 September, 2016;
originally announced September 2016.
-
Contingent Payment Mechanisms for Resource Utilization
Authors:
Hongyao Ma,
Reshef Meir,
David C. Parkes,
James Zou
Abstract:
We introduce the problem of assigning resources to improve their utilization. The motivation comes from settings where agents have uncertainty about their own values for using a resource, and where it is in the interest of a group that resources be used and not wasted. Done in the right way, improved utilization maximizes social welfare, balancing the utility of a high-value but unreliable agent with the group's preference that resources be used. We introduce the family of contingent payment mechanisms (CP), which may charge an agent contingent on use (a penalty). A CP mechanism is parameterized by a maximum penalty, and has a dominant-strategy equilibrium. Under a set of axiomatic properties, we establish welfare-optimality for the special case CP(W), with CP instantiated for a maximum penalty equal to societal value W for utilization. First, CP(W) is not dominated for expected welfare by any other mechanism. Second, amongst mechanisms that always allocate the resource and have a simple indirect structure, CP(W) strictly dominates every other mechanism. The special case with no upper bound on penalty, the contingent second-price mechanism, maximizes utilization. We extend the mechanisms to assign multiple, heterogeneous resources, and present a simulation study of the welfare properties of these mechanisms.
Submitted 1 November, 2018; v1 submitted 21 July, 2016;
originally announced July 2016.