-
General Regression Methods for Respondent-Driven Sampling Data
Authors:
Mamadou Yauck,
Erica E. M. Moodie,
Herak Apelian,
Alain Fourmigue,
Daniel Grace,
Trevor Hart,
Gilles Lambert,
Joseph Cox
Abstract:
Respondent-Driven Sampling (RDS) is a variant of link-tracing sampling techniques that aim to recruit hard-to-reach populations by leveraging individuals' social relationships. As such, an RDS sample has a graphical component which represents a partially observed network of unknown structure. Moreover, it is common to observe homophily, or the tendency to form connections with individuals who share similar traits. Currently, there is a lack of principled guidance on multivariate modeling strategies for RDS to address homophilic covariates and the dependence between observations within the network. In this work, we propose a methodology for general regression techniques using RDS data. This is used to study the socio-demographic predictors of HIV treatment optimism (about the value of antiretroviral therapy) among gay, bisexual and other men who have sex with men, recruited into an RDS study in Montreal, Canada.
Submitted 1 December, 2020;
originally announced December 2020.
-
Neighbourhood Bootstrap for Respondent-Driven Sampling
Authors:
Mamadou Yauck,
Erica E. M. Moodie,
Herak Apelian,
Alain Fourmigue,
Daniel Grace,
Trevor A. Hart,
Gilles Lambert,
Joseph Cox
Abstract:
Respondent-Driven Sampling (RDS) is a form of link-tracing sampling, a sampling technique used for `hard-to-reach' populations that aims to leverage individuals' social relationships to reach potential participants. While the methodological focus has been restricted to the estimation of population proportions, there is a growing interest in the estimation of uncertainty for RDS as recent findings suggest that most variance estimators underestimate variability. Recently, Baraff et al. (2016) proposed the \textit{tree bootstrap} method based on resampling the RDS recruitment tree, and empirically showed that this method outperforms current bootstrap methods. However, some findings suggest that the tree bootstrap (severely) overestimates uncertainty. In this paper, we propose the \textit{neighbourhood} bootstrap method for quantifying uncertainty in RDS. We prove the consistency of our method under some conditions and investigate its finite sample performance, through a simulation study, under realistic RDS sampling assumptions.
Submitted 4 February, 2021; v1 submitted 30 September, 2020;
originally announced October 2020.
-
Sampling from Networks: Respondent-Driven Sampling
Authors:
Mamadou Yauck,
Erica E. M. Moodie,
Herak Apelian,
Marc-Messier Peet,
Gilles Lambert,
Daniel Grace,
Nathan J. Lachowsky,
Trevor Hart,
Joseph Cox
Abstract:
Respondent-Driven Sampling (RDS) is a variant of link-tracing, a sampling technique for surveying hard-to-reach communities that takes advantage of community members' social networks to reach potential participants. As a network-based sampling method, RDS is faced with the fundamental problem of sampling from population networks where features such as homophily (the tendency for individuals with similar traits to share social ties) and differential activity (the ratio of the average number of connections by attribute) are sensitive to the choice of a sampling method. Though not clearly described in the RDS literature, many simple methods exist to generate simulated RDS data, with specific levels of network features, where the focus is on estimating simple estimands. However, the accuracy of these methods in their abilities to consistently recover those targeted network features remains unclear. This is also motivated by recent findings that some population network parameters (e.g. homophily) cannot be consistently estimated from the RDS data alone \citep{Crawford17}.
In this paper, we conduct a simulation study to assess the accuracy of existing RDS simulation methods, in terms of their abilities to generate RDS samples with the desired levels of two network parameters: homophily and differential activity. The results show that (1) homophily cannot be consistently estimated from simulated RDS samples and (2) differential activity estimates are more precise when groups, defined by traits, are equally active and equally represented in the population. We use this approach to mimic features of the Engage Study, an RDS sample of gay, bisexual and other men who have sex with men in Montreal.
Submitted 14 August, 2020; v1 submitted 13 February, 2020;
originally announced February 2020.