Potential for allocative harm in an environmental justice data tool
Authors:
Benjamin Q. Huynh,
Elizabeth T. Chin,
Allison Koenecke,
Derek Ouyang,
Daniel E. Ho,
Mathew V. Kiang,
David H. Rehkopf
Abstract:
Neighborhood-level screening algorithms are increasingly being deployed to inform policy decisions. We evaluate one such algorithm, CalEnviroScreen - designed to promote environmental justice and used to guide hundreds of millions of dollars in public funding annually - assessing its potential for allocative harm. We observe the model to be sensitive to subjective model decisions, with 16% of tracts potentially changing designation, as well as financially consequential, estimating the effect of its positive designations as a 104% (62-145%) increase in funding, equivalent to $2.08 billion ($1.56-2.41 billion) over four years. We also observe allocative tradeoffs and susceptibility to manipulation, raising ethical concerns. We recommend incorporating sensitivity analyses to mitigate allocative harm and accountability mechanisms to prevent misuse.
Submitted 12 April, 2023; v1 submitted 12 April, 2023;
originally announced April 2023.
Machine learning in the social and health sciences
Authors:
Anja K. Leist,
Matthias Klee,
Jung Hyun Kim,
David H. Rehkopf,
Stéphane P. A. Bordas,
Graciela Muniz-Terrera,
Sara Wade
Abstract:
The uptake of machine learning (ML) approaches in the social and health sciences has been rather slow, and research using ML for social and health research questions remains fragmented. This may be due to the separate development of research in the computational/data versus social and health sciences as well as a lack of accessible overviews and adequate training in ML techniques for non-data-science researchers. This paper provides a meta-mapping of research questions in the social and health sciences to appropriate ML approaches, by incorporating the necessary requirements for statistical analysis in these disciplines. We map the established classification into description, prediction, and causal inference to common research goals, such as estimating prevalence of adverse health or social outcomes, predicting the risk of an event, and identifying risk factors or causes of adverse outcomes. This meta-mapping aims at overcoming disciplinary barriers and starting a fluid dialogue between researchers from the social and health sciences and methodologically trained researchers. Such mapping may also help to fully exploit the benefits of ML while considering domain-specific aspects relevant to the social and health sciences, and hopefully contribute to the acceleration of the uptake of ML applications to advance both basic and applied social and health sciences research.
Submitted 20 June, 2021;
originally announced June 2021.