-
Causal Network Motifs: Identifying Heterogeneous Spillover Effects in A/B Tests
Authors:
Yuan Yuan,
Kristen M. Altenburger,
Farshad Kooti
Abstract:
Randomized experiments, or "A/B" tests, remain the gold standard for evaluating the causal effect of a policy intervention or product change. However, experimental settings, such as social networks, where users are interacting and influencing one another, may violate conventional assumptions of no interference for credible causal inference. Existing solutions to the network setting include accounting for the fraction or count of treated neighbors in a user's network, yet most current methods do not account for the local network structure beyond simply counting the number of neighbors. Our study provides an approach that accounts for both the local structure in a user's social network via motifs as well as the treatment assignment conditions of neighbors. We propose a two-part approach. We first introduce and employ "causal network motifs", which are network motifs that characterize the assignment conditions in local ego networks; and then we propose a tree-based algorithm for identifying different network interference conditions and estimating their average potential outcomes. Our approach can account for social network theories, such as structural diversity and echo chambers, and also can help specify network interference conditions that are suitable to each experiment. We test our method on a synthetic network setting and on a real-world experiment on a large-scale network, which highlight how accounting for local structures can better account for different interference patterns in networks.
Submitted 15 February, 2021; v1 submitted 19 October, 2020;
originally announced October 2020.
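The exposure features mentioned in the abstract above (the fraction of treated neighbors, plus motif counts that also record treatment assignment) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `ego_exposure_features`, the toy ego network, and the restriction to open and closed triads are assumptions made for illustration only.

```python
from itertools import combinations


def ego_exposure_features(ego, neighbors, treated):
    """Return (fraction of treated neighbors, triad counts) for one ego.

    Hypothetical helper: triads are keyed by (kind, k), where kind is
    "closed" if the two neighbors are connected to each other and "open"
    otherwise, and k is how many of the two neighbors are treated.
    """
    nbrs = sorted(neighbors[ego])
    frac_treated = sum(n in treated for n in nbrs) / len(nbrs) if nbrs else 0.0

    counts = {(kind, k): 0 for kind in ("closed", "open") for k in range(3)}
    for a, b in combinations(nbrs, 2):
        kind = "closed" if b in neighbors[a] else "open"
        k = (a in treated) + (b in treated)
        counts[(kind, k)] += 1
    return frac_treated, counts


# Toy undirected ego network (assumed data): a and b know each other, c is isolated.
neighbors = {
    "ego": {"a", "b", "c"},
    "a": {"ego", "b"},
    "b": {"ego", "a"},
    "c": {"ego"},
}
treated = {"a", "c"}

frac_treated, triad_counts = ego_exposure_features("ego", neighbors, treated)
print(frac_treated)
print({key: n for key, n in triad_counts.items() if n})
```

Two egos with the same fraction of treated neighbors can differ in these counts, which is the kind of local structure the abstract says simple neighbor counting misses.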
-
Ensemble Validation: Selectivity has a Price, but Variety is Free
Authors:
Eric Bax,
Farshad Kooti
Abstract:
Suppose some classifiers are selected from a set of hypothesis classifiers to form an equally-weighted ensemble that selects a member classifier at random for each input example. Then the ensemble has an error bound consisting of the average error bound for the member classifiers, a term for selectivity that varies from zero (if all hypothesis classifiers are selected) to a standard uniform error bound (if only a single classifier is selected), and small constants. There is no penalty for using a richer hypothesis set if the same fraction of the hypothesis classifiers are selected for the ensemble.
Submitted 28 March, 2019; v1 submitted 4 October, 2016;
originally announced October 2016.
-
Friendship Paradox Redux: Your Friends Are More Interesting Than You
Authors:
Nathan O. Hodas,
Farshad Kooti,
Kristina Lerman
Abstract:
Feld's friendship paradox states that "your friends have more friends than you, on average." This paradox arises because extremely popular people, despite being rare, are overrepresented when averaging over friends. Using a sample of the Twitter firehose, we confirm that the friendship paradox holds for >98% of Twitter users. Because of the directed nature of the follower graph on Twitter, we are further able to confirm more detailed forms of the friendship paradox: everyone you follow or who follows you has more friends and followers than you. This is likely caused by a correlation we demonstrate between Twitter activity, number of friends, and number of followers. In addition, we discover two new paradoxes: the virality paradox that states "your friends receive more viral content than you, on average," and the activity paradox, which states "your friends are more active than you, on average." The latter paradox is important in regulating online communication. It may result in users having difficulty maintaining optimal incoming information rates, because following additional users causes the volume of incoming tweets to increase super-linearly. While users may compensate for increased information flow by increasing their own activity, users become information overloaded when they receive more information than they are able or willing to process. We compare the average size of cascades that are sent and received by overloaded and underloaded users, and we show that overloaded users post and receive larger cascades and are poor detectors of small cascades.
Submitted 11 April, 2013;
originally announced April 2013.
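The mechanism behind the friendship paradox described above (rare, popular nodes being overrepresented when averaging over friends) can be checked on a small example. The sketch below is only an illustration: it uses a toy undirected graph rather than the directed Twitter follower graph, and the function name `friendship_paradox_stats` and the edge list are assumptions.

```python
from statistics import mean


def friendship_paradox_stats(edges):
    """Return (mean degree, mean of friends' mean degree, fraction of nodes
    whose friends have more friends on average than they do)."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    degree = {node: len(nbrs) for node, nbrs in neighbors.items()}
    # For each node, average the degrees of its friends.
    friend_mean = {
        node: mean(degree[f] for f in nbrs) for node, nbrs in neighbors.items()
    }
    frac = sum(friend_mean[n] > degree[n] for n in degree) / len(degree)
    return mean(degree.values()), mean(friend_mean.values()), frac


# Toy graph (assumed data): node 0 is a hub linked to nodes 1..5, plus one edge 1-2.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 2)]
mean_degree, mean_friend_degree, paradox_fraction = friendship_paradox_stats(edges)
print(mean_degree, mean_friend_degree, paradox_fraction)
```

In this toy graph only the hub has more friends than its friends do on average, because the hub appears in every other node's friend list and so dominates their averages.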