-
Does AI help humans make better decisions? A methodological framework for experimental evaluation
Authors:
Eli Ben-Michael,
D. James Greiner,
Melody Huang,
Kosuke Imai,
Zhichao Jiang,
Sooahn Shin
Abstract:
The use of Artificial Intelligence (AI) based on data-driven algorithms has become ubiquitous in today's society. Yet, in many cases and especially when stakes are high, humans still make final decisions. The critical question, therefore, is whether AI helps humans make better decisions as compared to a human alone or AI alone. We introduce a new methodological framework that can be used to answer this question experimentally with no additional assumptions. We measure a decision maker's ability to make correct decisions using standard classification metrics based on the baseline potential outcome. We consider a single-blinded experimental design, in which the provision of AI-generated recommendations is randomized across cases with a human making final decisions. Under this experimental design, we show how to compare the performance of three alternative decision-making systems--human-alone, human-with-AI, and AI-alone. We apply the proposed methodology to the data from our own randomized controlled trial of a pretrial risk assessment instrument. We find that AI recommendations do not improve the classification accuracy of a judge's decision to impose cash bail. Our analysis also shows that AI-alone decisions generally perform worse than human decisions with or without AI assistance. Finally, AI recommendations tend to impose cash bail on non-white arrestees more often than necessary when compared to white arrestees.
Submitted 17 March, 2024;
originally announced March 2024.
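As a rough illustration of the experimental design described in this abstract, the sketch below compares a decision rate between cases where an AI recommendation was provided and cases where it was not, using a difference-in-means estimator with a conservative (Neyman) standard error. The data, the variable names, and the function `diff_in_means` are hypothetical, and this is not the authors' estimator, which targets classification metrics defined on the baseline potential outcome.

```python
import math


def diff_in_means(outcomes, treated):
    """Difference in mean outcome (treated minus control) and its
    conservative Neyman standard error under complete randomization."""
    y1 = [y for y, t in zip(outcomes, treated) if t]
    y0 = [y for y, t in zip(outcomes, treated) if not t]
    m1 = sum(y1) / len(y1)
    m0 = sum(y0) / len(y0)
    # Sample variances within each arm (n - 1 in the denominator).
    v1 = sum((y - m1) ** 2 for y in y1) / (len(y1) - 1)
    v0 = sum((y - m0) ** 2 for y in y0) / (len(y0) - 1)
    se = math.sqrt(v1 / len(y1) + v0 / len(y0))
    return m1 - m0, se


# Hypothetical toy data: whether the AI recommendation was shown (randomized)
# and whether the judge imposed cash bail (1) or not (0).
ai_provided = [True, True, True, True, False, False, False, False]
cash_bail = [1, 0, 0, 1, 1, 1, 0, 1]

estimate, se = diff_in_means(cash_bail, ai_provided)
print(f"estimated effect of AI provision on cash-bail rate: {estimate:.3f} (SE {se:.3f})")
```

Because provision is randomized across cases, the difference in means is an unbiased estimate of the average effect of showing the recommendation on the decision itself; judging whether the decisions became more accurate requires the additional machinery the abstract describes.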
-
Automatic detection of long-duration transients in Fermi-GBM data
Authors:
F. Kunzweiler,
B. Biltzinger,
J. Greiner,
J. M. Burgess
Abstract:
In the era of time-domain, multi-messenger astronomy, the detection of transient events on the high-energy electromagnetic sky has become more important than ever. Previous attempts to systematically search for onboard-untriggered events in the data of Fermi-GBM have been limited to short-duration signals with variability time scales smaller than ~1 min due to the dominance of background variations on longer timescales. In this study, we aim to detect slowly rising or long-duration transient events with high sensitivity and full coverage of the GBM spectrum. We make use of our previously developed physical background model and propose a novel trigger algorithm with a fully automatic data analysis pipeline. The results from extensive simulations demonstrate that the developed trigger algorithm is sensitive down to sub-Crab intensities and has a near-optimal detection performance. During a two-month test run on real Fermi-GBM data, the pipeline detected more than 300 untriggered transient signals. For one of these transient detections we verify that it originated from a known astrophysical source, namely the Vela X-1 pulsar, showing pulsed emission for more than seven hours. More generally, this method enables a systematic search for weak and/or long-duration transients.
Submitted 26 May, 2022;
originally announced May 2022.
-
Safe Policy Learning through Extrapolation: Application to Pre-trial Risk Assessment
Authors:
Eli Ben-Michael,
D. James Greiner,
Kosuke Imai,
Zhichao Jiang
Abstract:
Algorithmic recommendations and decisions have become ubiquitous in today's society. Many of these and other data-driven policies, especially in the realm of public policy, are based on known, deterministic rules to ensure their transparency and interpretability. For example, algorithmic pre-trial risk assessments, which serve as our motivating application, provide relatively simple, deterministic classification scores and recommendations to help judges make release decisions. How can we use the data based on existing deterministic policies to learn new and better policies? Unfortunately, prior methods for policy learning are not applicable because they require existing policies to be stochastic rather than deterministic. We develop a robust optimization approach that partially identifies the expected utility of a policy, and then finds an optimal policy by minimizing the worst-case regret. The resulting policy is conservative but has a statistical safety guarantee, allowing the policy-maker to limit the probability of producing a worse outcome than the existing policy. We extend this approach to common and important settings where humans make decisions with the aid of algorithmic recommendations. Lastly, we apply the proposed methodology to a unique field experiment on pre-trial risk assessment instruments. We derive new classification and recommendation rules that retain the transparency and interpretability of the existing instrument while potentially leading to better overall outcomes at a lower cost.
Submitted 15 February, 2022; v1 submitted 21 September, 2021;
originally announced September 2021.
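The idea of choosing a policy by minimizing worst-case regret under partial identification can be sketched in a few lines. The example below assumes, purely for illustration, that each candidate policy's expected utility is known only up to an interval and that the intervals can be treated independently; the policy names, the bounds, and the function `minimax_regret_choice` are hypothetical, and this is a simplification rather than the authors' robust optimization procedure.

```python
def minimax_regret_choice(bounds):
    """Pick the policy with the smallest worst-case regret.

    bounds maps each policy name to (lower, upper) bounds on its expected
    utility. The worst-case regret of a policy is the largest possible
    shortfall against any other policy: the best upper bound among the
    alternatives minus the policy's own lower bound, floored at zero.
    """
    worst = {}
    for p, (lo_p, _) in bounds.items():
        best_other = max(
            (hi for q, (_, hi) in bounds.items() if q != p),
            default=lo_p,
        )
        worst[p] = max(0.0, best_other - lo_p)
    choice = min(worst, key=worst.get)
    return choice, worst


# Hypothetical bounds: the existing deterministic policy is observed, so its
# utility is point-identified; the new policies are only partially identified.
bounds = {
    "existing": (0.50, 0.50),
    "cautious": (0.52, 0.60),
    "aggressive": (0.20, 0.80),
}

choice, worst = minimax_regret_choice(bounds)
print(choice, {p: round(r, 2) for p, r in worst.items()})
```

In this toy setting the rule prefers the policy whose utility is tightly bounded over one with a higher possible upside but a much lower floor, which mirrors the conservative behavior the abstract describes.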
-
Experimental Evaluation of Algorithm-Assisted Human Decision-Making: Application to Pretrial Public Safety Assessment
Authors:
Kosuke Imai,
Zhichao Jiang,
James Greiner,
Ryan Halen,
Sooahn Shin
Abstract:
Despite an increasing reliance on fully-automated algorithmic decision-making in our day-to-day lives, human beings still make highly consequential decisions. As frequently seen in business, healthcare, and public policy, recommendations produced by algorithms are provided to human decision-makers to guide their decisions. While there exists a fast-growing literature evaluating the bias and fairness of such algorithmic recommendations, an overlooked question is whether they help humans make better decisions. We develop a statistical methodology for experimentally evaluating the causal impacts of algorithmic recommendations on human decisions. We also show how to examine whether algorithmic recommendations improve the fairness of human decisions and derive the optimal decision rules under various settings. We apply the proposed methodology to preliminary data from the first-ever randomized controlled trial that evaluates the pretrial Public Safety Assessment (PSA) in the criminal justice system. A goal of the PSA is to help judges decide which arrested individuals should be released. On the basis of the preliminary data available, we find that providing the PSA to the judge has little overall impact on the judge's decisions and subsequent arrestee behavior. However, our analysis yields some potentially suggestive evidence that the PSA may help avoid unnecessarily harsh decisions for female arrestees regardless of their risk levels while it encourages the judge to make stricter decisions for male arrestees who are deemed to be risky. In terms of fairness, the PSA appears to increase the gender bias against males while having little effect on any existing racial differences in judges' decisions. Finally, we find that the PSA's recommendations might be unnecessarily severe unless the cost of a new crime is sufficiently high.
Submitted 11 December, 2021; v1 submitted 4 December, 2020;
originally announced December 2020.
-
Bound entangled states fit for robust experimental verification
Authors:
Gael Sentís,
Johannes N. Greiner,
Jiangwei Shang,
Jens Siewert,
Matthias Kleinmann
Abstract:
Preparing and certifying bound entangled states in the laboratory is an intrinsically hard task, both because they typically form narrow regions in the state space and because a certificate requires a tomographic reconstruction of the density matrix. Indeed, the previous experiments that have reported the preparation of a bound entangled state relied on such tomographic reconstruction techniques. However, the reliability of these results crucially depends on the extra assumption of an unbiased reconstruction. We propose an alternative method for certifying the bound entangled character of a quantum state that leads to a rigorous claim within a desired statistical significance, while bypassing a full reconstruction of the state. The method comprises a search for bound entangled states that are robust for experimental verification, and a hypothesis test tailored for the detection of bound entanglement that is naturally equipped with a measure of statistical significance. We apply our method to families of states of $3\times 3$ and $4\times 4$ systems, and find that the experimental certification of bound entangled states is well within reach.
Submitted 14 December, 2018; v1 submitted 20 April, 2018;
originally announced April 2018.
-
Exit polling and racial bloc voting: Combining individual-level and R$\times$C ecological data
Authors:
D. James Greiner,
Kevin M. Quinn
Abstract:
Despite its shortcomings, cross-level or ecological inference remains a necessary part of some areas of quantitative inference, including in United States voting rights litigation. Ecological inference suffers from a lack of identification that, most agree, is best addressed by incorporating individual-level data into the model. In this paper we test the limits of such an incorporation by attempting it in the context of drawing inferences about racial voting patterns using a combination of an exit poll and precinct-level ecological data; accurate information about racial voting patterns is needed to assess triggers in voting rights laws that can determine the composition of United States legislative bodies. Specifically, we extend and study a hybrid model that addresses two-way tables of arbitrary dimension. We apply the hybrid model to an exit poll we administered in the City of Boston in 2008. Using the resulting data as well as simulation, we compare the performance of a pure ecological estimator, pure survey estimators using various sampling schemes and our hybrid. We conclude that the hybrid estimator offers substantial benefits by enabling substantive inferences about voting patterns not practicably available without its use.
Submitted 5 January, 2011;
originally announced January 2011.
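The lack of identification mentioned in this abstract can be seen in the simplest 2x2 case through the classical deterministic (Duncan-Davis) bounds: aggregate precinct data pin down a group's support for a candidate only to an interval. The sketch below is an illustration of that general point with hypothetical numbers; the function name `duncan_davis_bounds` is an assumption, and this is not the hybrid R x C model studied in the paper.

```python
def duncan_davis_bounds(group_share, vote_share):
    """Deterministic bounds on the fraction of a group voting for a candidate.

    group_share: fraction of the precinct's voters belonging to the group.
    vote_share:  fraction of the precinct's voters choosing the candidate.
    Both are aggregate quantities; individual-level choices are unobserved.
    """
    if not 0.0 < group_share <= 1.0:
        raise ValueError("group_share must be in (0, 1]")
    if not 0.0 <= vote_share <= 1.0:
        raise ValueError("vote_share must be in [0, 1]")
    # Lower bound: everyone outside the group votes for the candidate.
    lower = max(0.0, (vote_share - (1.0 - group_share)) / group_share)
    # Upper bound: all of the candidate's votes come from the group.
    upper = min(1.0, vote_share / group_share)
    return lower, upper


# Hypothetical precinct: the group is 60% of voters, the candidate wins 50%.
lower, upper = duncan_davis_bounds(0.6, 0.5)
print(f"group support lies between {lower:.3f} and {upper:.3f}")
```

An interval this wide is uninformative on its own, which is why adding individual-level data such as an exit poll can sharpen inference.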