-
A Discrimination Report Card
Authors:
Patrick Kline,
Evan K. Rose,
Christopher R. Walters
Abstract:
We develop an Empirical Bayes grading scheme that balances the informativeness of the assigned grades against the expected frequency of ranking errors. Applying the method to a massive correspondence experiment, we grade the racial biases of 97 U.S. employers. A four-grade ranking limits the chances that a randomly selected pair of firms is mis-ranked to 5% while explaining nearly half of the variation in firms' racial contact gaps. The grades are presented alongside measures of uncertainty about each firm's contact gap in an accessible rubric that is easily adapted to other settings where ranks and levels are of simultaneous interest.
Submitted 22 June, 2023;
originally announced June 2023.
-
On Recoding Ordered Treatments as Binary Indicators
Authors:
Evan K. Rose,
Yotam Shem-Tov
Abstract:
Researchers using instrumental variables to investigate ordered treatments often recode treatment into an indicator for any exposure. We investigate this estimand under the assumption that the instruments shift compliers from no treatment to some but not from some treatment to more. We show that when there are extensive margin compliers only (EMCO) this estimand captures a weighted average of treatment effects that can be partially unbundled into each complier group's potential outcome means. We also establish an equivalence between EMCO and a two-factor selection model and apply our results to study treatment heterogeneity in the Oregon Health Insurance Experiment.
Submitted 1 March, 2024; v1 submitted 23 November, 2021;
originally announced November 2021.
-
Batch-sequential design and heteroskedastic surrogate modeling for delta smelt conservation
Authors:
Boya Zhang,
Robert B. Gramacy,
Leah Johnson,
Kenneth A. Rose,
Eric Smith
Abstract:
Delta smelt is an endangered fish species in the San Francisco estuary that has shown an overall population decline over the past 30 years. Researchers have developed a stochastic, agent-based simulator to virtualize the system, with the goal of understanding the relative contribution of natural and anthropogenic factors suggested as playing a role in their decline. However, the input configuration space is high-dimensional, running the simulator is time-consuming, and its noisy outputs change nonlinearly in both mean and variance. Getting enough runs to effectively learn input-output dynamics requires both a nimble modeling strategy and parallel supercomputer evaluation. Recent advances in heteroskedastic Gaussian process (HetGP) surrogate modeling help, but little is known about how to appropriately plan experiments for highly distributed simulator evaluation. We propose a batch sequential design scheme, generalizing one-at-a-time variance-based active learning for HetGP surrogates, as a means of keeping multi-core cluster nodes fully engaged with expensive runs. Our acquisition strategy is carefully engineered to favor selection of replicates which boost statistical and computational efficiencies when training surrogates to isolate signal in high noise regions. Design and modeling performance is illustrated on a range of toy examples before embarking on a large-scale smelt simulation campaign and downstream high-fidelity input sensitivity analysis.
Submitted 17 March, 2021; v1 submitted 13 October, 2020;
originally announced October 2020.
-
Environmental Economics and Uncertainty: Review and a Machine Learning Outlook
Authors:
Ruda Zhang,
Patrick Wingo,
Rodrigo Duran,
Kelly Rose,
Jennifer Bauer,
Roger Ghanem
Abstract:
Economic assessment in environmental science concerns the measurement or valuation of environmental impacts, adaptation, and vulnerability. Integrated assessment modeling is a unifying framework of environmental economics, which attempts to combine key elements of physical, ecological, and socioeconomic systems. Uncertainty characterization in integrated assessment varies by component models: uncertainties associated with mechanistic physical models are often assessed with an ensemble of simulations or Monte Carlo sampling, while uncertainties associated with impact models are evaluated by conjecture or econometric analysis. Manifold sampling is a machine learning technique that constructs a joint probability model of all relevant variables which may be concentrated on a low-dimensional geometric structure. Compared with traditional density estimation methods, manifold sampling is more efficient, especially when the data is generated by a few latent variables. The manifold-constrained joint probability model helps answer policy-making questions ranging from prediction to response and prevention. Manifold sampling is applied to assess risk of offshore drilling in the Gulf of Mexico.
Submitted 24 April, 2020;
originally announced April 2020.