-
Causal inference using deep neural networks
Authors:
Ye Yuan,
Xueying Ding,
Ziv Bar-Joseph
Abstract:
Causal inference from observational data is a core problem in many scientific fields. Here we present a general supervised deep learning framework that infers causal interactions by transforming the input vectors to an image-like representation for every pair of inputs. Given a training dataset, we first construct a normalized empirical probability density distribution (NEPDF) matrix. We then train a convolutional neural network (CNN) on NEPDFs for causality predictions. We tested the method on several simulated and real-world datasets and compared it to prior methods for causal inference. As we show, the method is general, can efficiently handle very large datasets, and improves upon prior methods.
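One way to picture the image-like representation described above is as a normalized 2D histogram of paired observations. The following is a minimal sketch under that assumption, not the authors' implementation: the function name `nepdf`, the equal-width binning, and the default bin count are all hypothetical choices made for illustration, and the abstract does not specify any preprocessing that the actual method may apply.

```python
def nepdf(xs, ys, bins=8):
    """Normalized 2D histogram of paired samples (xs[i], ys[i]).

    Assumption: equal-width bins spanning each variable's observed range.
    The result is a bins x bins matrix whose entries sum to 1.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")

    def bin_index(value, lo, hi):
        # Map a value to a bin; the maximum value falls into the last bin.
        if hi == lo:
            return 0
        idx = int((value - lo) / (hi - lo) * bins)
        return min(idx, bins - 1)

    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)

    counts = [[0] * bins for _ in range(bins)]
    for x, y in zip(xs, ys):
        counts[bin_index(x, x_lo, x_hi)][bin_index(y, y_lo, y_hi)] += 1

    # Normalize counts so the matrix is an empirical joint distribution.
    n = len(xs)
    return [[c / n for c in row] for row in counts]


# Usage: a perfectly correlated pair concentrates mass on the diagonal.
matrix = nepdf([0, 1, 2, 3], [0, 1, 2, 3], bins=2)
```

A matrix like this could then be fed to a CNN as a single-channel image, which is the role the abstract assigns to the NEPDF; the network architecture itself is not sketched here.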
Submitted 24 November, 2020;
originally announced November 2020.
-
Genome-Wide Epigenetic Modifications as a Shared Memory Consensus Problem
Authors:
Sabrina Rashid,
Gadi Taubenfeld,
Ziv Bar-Joseph
Abstract:
A distributed computing system is a collection of processors that communicate either by reading and writing from a shared memory or by sending messages over some communication network. Most prior biologically inspired distributed computing algorithms rely on message passing as the communication model. Here we show that in the process of genome-wide epigenetic modifications cells utilize their DNA as a shared memory system. We formulate a particular consensus problem, called the epigenetic consensus problem, that cells attempt to solve using this shared memory model, and then present algorithms, derive expected run time and discuss, analyze and simulate improved methods for solving this problem. Analysis of real biological data indicates that the computational methods indeed reflect aspects of the biological process for genome-wide epigenetic modifications.
Submitted 13 May, 2020;
originally announced May 2020.
-
Beeping a Maximal Independent Set
Authors:
Yehuda Afek,
Noga Alon,
Ziv Bar-Joseph,
Alejandro Cornejo,
Bernhard Haeupler,
Fabian Kuhn
Abstract:
We consider the problem of computing a maximal independent set (MIS) in an extremely harsh broadcast model that relies only on carrier sensing. The model consists of an anonymous broadcast network in which nodes have no knowledge about the topology of the network or even an upper bound on its size. Furthermore, it is assumed that an adversary chooses at which time slot each node wakes up. At each time slot a node can either beep, that is, emit a signal, or be silent. At a particular time slot, beeping nodes receive no feedback, while silent nodes can only differentiate between none of their neighbors beeping and at least one of their neighbors beeping.
We start by proving a lower bound that shows that in this model, it is not possible to locally converge to an MIS in sub-polynomial time. We then study four different relaxations of the model which allow us to circumvent the lower bound and find an MIS in polylogarithmic time. First, we show that if a polynomial upper bound on the network size is known, it is possible to find an MIS in O(log^3 n) time. Second, if we assume sleeping nodes are awoken by neighboring beeps, then we can also find an MIS in O(log^3 n) time. Third, if in addition to this wakeup assumption we allow sender-side collision detection, that is, beeping nodes can distinguish whether at least one neighboring node is beeping concurrently or not, we can find an MIS in O(log^2 n) time. Finally, if instead we endow nodes with synchronous clocks, it is also possible to find an MIS in O(log^2 n) time.
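To make the beeping model concrete, here is a minimal simulation sketch of one simple way an MIS can be selected with beeps. It is not the algorithm analyzed in the paper and makes no claim about its running time: it assumes all nodes start simultaneously on synchronized two-slot rounds, assumes sender-side collision detection, and uses a fixed beep probability `p`. The names `beeping_mis` and `is_mis` are hypothetical.

```python
import random


def beeping_mis(adj, p=0.5, seed=0, max_rounds=10_000):
    """Simulate a simplified beeping MIS selection on graph `adj`.

    adj maps each node to a list of its neighbors (undirected graph).
    Assumptions: synchronous start, two slots per round, and
    sender-side collision detection.
    """
    rng = random.Random(seed)
    active = set(adj)
    mis = set()
    for _ in range(max_rounds):
        if not active:
            return mis
        # Slot 1: each active node beeps with probability p.
        beepers = {v for v in sorted(active) if rng.random() < p}
        # A beeper that detects no concurrently beeping neighbor joins the MIS.
        winners = {v for v in beepers
                   if not any(u in beepers for u in adj[v])}
        mis |= winners
        # Slot 2: new MIS nodes beep; active nodes that hear a beep retire.
        retired = {v for v in active
                   if any(u in winners for u in adj[v])}
        active -= winners | retired
    raise RuntimeError("did not terminate within max_rounds")


def is_mis(adj, nodes):
    """Check that `nodes` is independent and maximal in `adj`."""
    independent = all(u not in nodes for v in nodes for u in adj[v])
    maximal = all(v in nodes or any(u in nodes for u in adj[v])
                  for v in adj)
    return independent and maximal


# Usage: a 6-node cycle.
cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
selected = beeping_mis(cycle, seed=1)
```

Two adjacent nodes can never both join in the same round because each detects the other's beep, and a node only retires once a neighbor has joined, so the returned set is independent and maximal whenever the loop terminates.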
Submitted 1 June, 2012;
originally announced June 2012.
-
MIS on the fly
Authors:
Yehuda Afek,
Noga Alon,
Ziv Bar-Joseph
Abstract:
Humans are very good at optimizing solutions for specific problems. Biological processes, on the other hand, have evolved to handle multiple constrained distributed environments and so they are robust and adaptable. Inspired by observations made in a biological system we have recently presented a simple new randomized distributed MIS algorithm \cite{ZScience}. Here we extend these results by removing a number of strong assumptions that we made, making the algorithms more practical. Specifically, we present an $O(\log^2 n)$-round synchronous randomized MIS algorithm which uses only 1-bit unary messages (a beeping signal with collision detection), allows for asynchronous wake-up, does not assume any knowledge of the network topology, and assumes only a loose bound on the network size. We also present an extension with no collision detection in which the round complexity increases to $O(\log^3 n)$. Finally, we show that our algorithm is optimal under some restriction, by presenting a tight lower bound of $\Omega(\log^2 n)$ on the number of rounds required to construct an MIS for a restricted model.
Submitted 10 June, 2011;
originally announced June 2011.