-
Efficient Reconstruction of Stochastic Pedigrees
Authors:
Younhun Kim,
Elchanan Mossel,
Govind Ramnarayan,
Paxton Turner
Abstract:
We introduce a new algorithm called {\sc Rec-Gen} for reconstructing the genealogy or \textit{pedigree} of an extant population purely from its genetic data. We justify our approach by giving a mathematical proof of the effectiveness of {\sc Rec-Gen} when applied to pedigrees from an idealized generative model that replicates some of the features of real-world pedigrees. Our algorithm is iterative and provides an accurate reconstruction of a large fraction of the pedigree while having relatively low \emph{sample complexity}, measured in terms of the length of the genetic sequences of the population. We propose our approach as a prototype for further investigation of the pedigree reconstruction problem toward the goal of applications to real-world examples. As such, our results have some conceptual bearing on the increasingly important issue of genomic privacy.
Submitted 7 May, 2020;
originally announced May 2020.
-
A Geometric Model of Opinion Polarization
Authors:
Jan Hązła,
Yan Jin,
Elchanan Mossel,
Govind Ramnarayan
Abstract:
We introduce a simple, geometric model of opinion polarization. It is a model of political persuasion, as well as marketing and advertising, utilizing social values. It focuses on the interplay between different topics and persuasion efforts. We demonstrate that societal opinion polarization often arises as an unintended byproduct of influencers attempting to promote a product or idea. We discuss a number of mechanisms for the emergence of polarization involving one or more influencers, sending messages strategically, heuristically, or randomly. We also examine some computational aspects of choosing the most effective means of influencing agents, and the effects of those strategic considerations on polarization.
Submitted 24 August, 2021; v1 submitted 11 October, 2019;
originally announced October 2019.
-
Efficient Multiparty Interactive Coding for Insertions, Deletions and Substitutions
Authors:
Ran Gelles,
Yael T. Kalai,
Govind Ramnarayan
Abstract:
In the field of interactive coding, two or more parties wish to carry out a distributed computation over a communication network that may be noisy. The ultimate goal is to develop efficient coding schemes that can tolerate a high level of noise while increasing the communication by only a constant factor (i.e., constant rate).
In this work we consider synchronous communication networks over an arbitrary topology, in the powerful adversarial insertion-deletion noise model. Namely, the noisy channel may adversarially alter the content of any transmitted symbol, as well as completely remove a transmitted symbol or inject a new symbol into the channel. We provide efficient, constant rate schemes that successfully conduct any computation with high probability as long as the adversary corrupts at most $\varepsilon /m$ fraction of the total communication, where $m$ is the number of links in the network and $\varepsilon$ is a small constant. This scheme assumes the parties share a random string to which the adversarial noise is oblivious. We can remove this assumption at the price of being resilient to $\varepsilon / (m\log m)$ adversarial error.
While previous work considered the insertion-deletion noise model in the two-party setting, to the best of our knowledge, our scheme is the first multiparty scheme that is resilient to insertions and deletions. Furthermore, our scheme is the first computationally efficient scheme in the multiparty setting that is resilient to adversarial noise.
Submitted 3 August, 2022; v1 submitted 28 January, 2019;
originally announced January 2019.
-
From Soft Classifiers to Hard Decisions: How fair can we be?
Authors:
Ran Canetti,
Aloni Cohen,
Nishanth Dikkala,
Govind Ramnarayan,
Sarah Scheffler,
Adam Smith
Abstract:
A popular methodology for building binary decision-making classifiers in the presence of imperfect information is to first construct a non-binary "scoring" classifier that is calibrated over all protected groups, and then to post-process this score to obtain a binary decision. We study the feasibility of achieving various fairness properties by post-processing calibrated scores, and then show that deferring post-processors allow for more fairness conditions to hold on the final decision. Specifically, we show:
1. There does not exist a general way to post-process a calibrated classifier to equalize protected groups' positive or negative predictive value (PPV or NPV). For certain "nice" calibrated classifiers, either PPV or NPV can be equalized when the post-processor uses different thresholds across protected groups, though there exist distributions of calibrated scores for which the two measures cannot be both equalized. When the post-processing consists of a single global threshold across all groups, natural fairness properties, such as equalizing PPV in a nontrivial way, do not hold even for "nice" classifiers.
2. When the post-processing is allowed to "defer" on some decisions (that is, to avoid making a decision by handing off some examples to a separate process), then for the non-deferred decisions, the resulting classifier can be made to equalize PPV, NPV, false positive rate (FPR) and false negative rate (FNR) across the protected groups. This suggests a way to partially evade the impossibility results of Chouldechova and Kleinberg et al., which preclude equalizing all of these measures simultaneously. We also present different deferring strategies and show how they affect the fairness properties of the overall system.
We evaluate our post-processing techniques using the COMPAS data set from 2016.
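As an illustration of the four measures named in this abstract, the following is a minimal sketch, not the authors' method. It thresholds scores per group, optionally defers on scores close to the cutoff, and reports PPV, NPV, FPR and FNR over the non-deferred decisions. The function name `group_rates`, the `defer_band` rule, and the toy records are all assumptions made for this example.

```python
from collections import defaultdict


def group_rates(records, thresholds, defer_band=0.0):
    """Compute per-group PPV, NPV, FPR, FNR on non-deferred decisions.

    records: iterable of (group, score, label), label in {0, 1}.
    thresholds: dict mapping group -> score cutoff (hypothetical post-processor).
    defer_band: scores strictly within this distance of the cutoff are deferred
                (an illustrative deferral rule, not one taken from the paper).
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0, "deferred": 0})
    for group, score, label in records:
        c = counts[group]
        cutoff = thresholds[group]
        if abs(score - cutoff) < defer_band:
            c["deferred"] += 1  # hand this example off to a separate process
            continue
        if score >= cutoff:  # positive decision
            c["tp" if label == 1 else "fp"] += 1
        else:  # negative decision
            c["fn" if label == 1 else "tn"] += 1

    def ratio(num, den):
        # Undefined rates (empty denominator) are reported as None.
        return num / den if den else None

    return {
        g: {
            "PPV": ratio(c["tp"], c["tp"] + c["fp"]),
            "NPV": ratio(c["tn"], c["tn"] + c["fn"]),
            "FPR": ratio(c["fp"], c["fp"] + c["tn"]),
            "FNR": ratio(c["fn"], c["fn"] + c["tp"]),
            "deferred": c["deferred"],
        }
        for g, c in counts.items()
    }


# Toy data: (group, calibrated-style score, true label).
records = [
    ("a", 0.9, 1), ("a", 0.8, 0), ("a", 0.2, 0), ("a", 0.5, 1),
    ("b", 0.7, 1), ("b", 0.3, 0),
]
no_defer = group_rates(records, {"a": 0.5, "b": 0.5})
with_defer = group_rates(records, {"a": 0.5, "b": 0.5}, defer_band=0.1)
```

Comparing `no_defer` with `with_defer` shows how deferring borderline examples changes the rates computed on the remaining decisions, which is the quantity the second result above concerns.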
Submitted 21 January, 2019; v1 submitted 3 October, 2018;
originally announced October 2018.
-
Being Corrupt Requires Being Clever, But Detecting Corruption Doesn't
Authors:
Yan Jin,
Elchanan Mossel,
Govind Ramnarayan
Abstract:
We consider a variation of the problem of corruption detection on networks posed by Alon, Mossel, and Pemantle '15. In this model, each vertex of a graph can be either truthful or corrupt. Each vertex reports about the types (truthful or corrupt) of all its neighbors to a central agency, where truthful nodes report the true types they see and corrupt nodes report adversarially. The central agency aggregates these reports and attempts to find a single truthful node. Inspired by real auditing networks, we pose our problem for arbitrary graphs and consider corruption through a computational lens. We identify a key combinatorial parameter of the graph $m(G)$, which is the minimal number of corrupted agents needed to prevent the central agency from identifying a single truthful node. We give an efficient (in fact, linear time) algorithm for the central agency to identify a truthful node that is successful whenever the number of corrupt nodes is less than $m(G)/2$. On the other hand, we prove that for any constant $\alpha > 1$, it is NP-hard to find a subset of nodes $S$ in $G$ such that corrupting $S$ prevents the central agency from finding one truthful node and $|S| \leq \alpha m(G)$, assuming the Small Set Expansion Hypothesis (Raghavendra and Steurer, STOC '10). We conclude that being corrupt requires being clever, while detecting corruption does not.
Our main technical insight is a relation between the minimum number of corrupt nodes required to hide all truthful nodes and a certain notion of vertex separability for the underlying graph. Additionally, this insight lets us design an efficient algorithm for a corrupt party to decide which graphs require the fewest corrupted nodes, up to a multiplicative factor of $O(\log n)$.
Submitted 12 December, 2019; v1 submitted 26 September, 2018;
originally announced September 2018.
-
Equalizing Financial Impact in Supervised Learning
Authors:
Govind Ramnarayan
Abstract:
Notions of "fair classification" that have arisen in computer science generally revolve around equalizing certain statistics across protected groups. This approach has been criticized as ignoring societal issues, including how errors can hurt certain groups disproportionately. We pose a modification of one of the fairness criteria from Hardt, Price, and Srebro [NIPS, 2016] that makes a small step towards addressing this issue in the case of financial decisions like giving loans. We call this new notion "equalized financial impact."
Submitted 24 June, 2018;
originally announced June 2018.
-
A No-Go Theorem for Derandomized Parallel Repetition: Beyond Feige-Kilian
Authors:
Dana Moshkovitz,
Govind Ramnarayan,
Henry Yuen
Abstract:
In this work we show a barrier towards proving a randomness-efficient parallel repetition, a promising avenue for achieving many tight inapproximability results. Feige and Kilian (STOC'95) proved an impossibility result for randomness-efficient parallel repetition for two prover games with small degree, i.e., when each prover has only few possibilities for the question of the other prover. In recent years, there have been indications that randomness-efficient parallel repetition (also called derandomized parallel repetition) might be possible for games with large degree, circumventing the impossibility result of Feige and Kilian. In particular, Dinur and Meir (CCC'11) construct games with large degree whose repetition can be derandomized using a theorem of Impagliazzo, Kabanets and Wigderson (SICOMP'12). However, obtaining derandomized parallel repetition theorems that would yield optimal inapproximability results has remained elusive.
This paper presents an explanation for the current impasse in progress, by proving a limitation on derandomized parallel repetition. We formalize two properties which we call "fortification-friendliness" and "yields robust embeddings." We show that any proof of derandomized parallel repetition achieving almost-linear blow-up cannot both (a) be fortification-friendly and (b) yield robust embeddings. Unlike Feige and Kilian, we do not require the small degree assumption.
Given that virtually all existing proofs of parallel repetition, including the derandomized parallel repetition result of Dinur and Meir, share these two properties, our no-go theorem highlights a major barrier to achieving almost-linear derandomized parallel repetition.
Submitted 24 July, 2016;
originally announced July 2016.