Showing 1–17 of 17 results for author: Berk, R

  1. arXiv:2111.09211  [pdf, other]

    stat.AP

    Improving Fairness in Criminal Justice Algorithmic Risk Assessments Using Optimal Transport and Conformal Prediction Sets

    Authors: Richard A. Berk, Arun Kumar Kuchibhotla, Eric Tchetgen Tchetgen

    Abstract: In the United States and elsewhere, risk assessment algorithms are being used to help inform criminal justice decision-makers. A common intent is to forecast an offender's "future dangerousness." Such algorithms have been correctly criticized for potential unfairness, and there is an active cottage industry trying to make repairs. In this paper, we use counterfactual reasoning to consider the pr…

    Submitted 9 August, 2022; v1 submitted 17 November, 2021; originally announced November 2021.

    Comments: 51 pages, 7 figures

  2. arXiv:2105.10624  [pdf, other]

    stat.ME stat.AP

    Post-Model-Selection Statistical Inference with Interrupted Time Series Designs: An Evaluation of an Assault Weapons Ban in California

    Authors: Richard A. Berk

    Abstract: There have been many claims in the media and a bit of respectable research about the causes of variation in firearm sales. The challenges for causal inference can be quite daunting. This paper reports an analysis of daily handgun sales in California from 1996 through 2018 using an interrupted time series design and analysis. The design was introduced to social scientists in 1963 by Campbell and St…

    Submitted 21 May, 2021; originally announced May 2021.

    Comments: 36 pages, 10 figures

  3. arXiv:2104.09358  [pdf, other]

    stat.ME stat.AP

    Nested Conformal Prediction Sets for Classification with Applications to Probation Data

    Authors: Arun K. Kuchibhotla, Richard A. Berk

    Abstract: Risk assessments to help inform criminal justice decisions have been used in the United States since the 1920s. Over the past several years, statistical learning risk algorithms have been introduced amid much controversy about fairness, transparency and accuracy. In this paper, we focus on accuracy for a large department of probation and parole that is considering a major revision of its current,…

    Submitted 13 April, 2021; originally announced April 2021.

  4. arXiv:2008.11664   

    stat.AP stat.ML

    Improving Fairness in Criminal Justice Algorithmic Risk Assessments Using Conformal Prediction Sets

    Authors: Richard A. Berk, Arun Kumar Kuchibhotla

    Abstract: Risk assessment algorithms have been correctly criticized for potential unfairness, and there is an active cottage industry trying to make repairs. In this paper, we adopt a framework from conformal prediction sets to remove unfairness from risk algorithms themselves and the covariates used for forecasting. From a sample of 300,000 offenders at their arraignments, we construct a confusion table an…

    Submitted 21 May, 2021; v1 submitted 26 August, 2020; originally announced August 2020.

    Comments: We found an interpretive error in the method. We are now trying to develop a better approach

  5. arXiv:1910.11410  [pdf, other]

    stat.AP

    Almost Politically Acceptable Criminal Justice Risk Assessment

    Authors: Richard A. Berk, Ayya A. Elzarka

    Abstract: In criminal justice risk forecasting, one can prove that it is impossible to optimize accuracy and fairness at the same time. One can also prove that it is impossible to optimize at once all of the usual group definitions of fairness. In the policy arena, one is left with tradeoffs about which many stakeholders will adamantly disagree. In this paper, we offer a different approach. We do not seek perf…

    Submitted 24 October, 2019; originally announced October 2019.

    Comments: 29 pages, 5 figures

  6. arXiv:1903.00604  [pdf, other]

    stat.AP

    An Algorithmic Approach to Forecasting Rare Violent Events: An Illustration Based in IPV Perpetration

    Authors: Richard A. Berk, Susan B. Sorenson

    Abstract: Mass violence, almost no matter how defined, is (thankfully) rare. Rare events are very difficult to study in a systematic manner. Standard statistical procedures can fail badly and usefully accurate forecasts of rare events often are little more than an aspiration. We offer an unconventional approach for the statistical analysis of rare events illustrated by an extensive case study. We report res…

    Submitted 1 March, 2019; originally announced March 2019.

    Comments: 25 pages, 3 tables, 4 figures

  7. arXiv:1807.04164  [pdf, other]

    stat.ME

    Using Recursive Partitioning to Find and Estimate Heterogenous Treatment Effects In Randomized Clinical Trials

    Authors: Richard Berk, Matthew Olson, Andreas Buja, Aurelie Ouss

    Abstract: Heterogeneous treatment effects can be very important in the analysis of randomized clinical trials. Heightened risks or enhanced benefits may exist for particular subsets of study subjects. When the heterogeneous treatment effects are specified as the research is being designed, there are proper and readily available analysis techniques. When the heterogeneous treatment effects are inductively ob…

    Submitted 11 July, 2018; originally announced July 2018.

    Comments: 21 pages, 1 figure, under review

  8. arXiv:1806.09014  [pdf, other]

    stat.ME

    Assumption Lean Regression

    Authors: Richard Berk, Andreas Buja, Lawrence Brown, Edward George, Arun Kumar Kuchibhotla, Weijie J. Su, Linda Zhao

    Abstract: It is well known that models used in conventional regression analysis are commonly misspecified. A standard response is little more than a shrug. Data analysts invoke Box's maxim that all models are wrong and then proceed as if the results are useful nevertheless. In this paper, we provide an alternative. Regression models are treated explicitly as approximations of a true response surface that ca…

    Submitted 26 June, 2018; v1 submitted 23 June, 2018; originally announced June 2018.

    Comments: Submitted for review, 21 pages, 2 figures

  9. arXiv:1706.02409  [pdf, other]

    cs.LG stat.ML

    A Convex Framework for Fair Regression

    Authors: Richard Berk, Hoda Heidari, Shahin Jabbari, Matthew Joseph, Michael Kearns, Jamie Morgenstern, Seth Neel, Aaron Roth

    Abstract: We introduce a flexible family of fairness regularizers for (linear and logistic) regression problems. These regularizers all enjoy convexity, permitting fast optimization, and they span the range from notions of group fairness to strong individual fairness. By varying the weight on the fairness regularizer, we can compute the efficient frontier of the accuracy-fairness trade-off on any given datas…

    Submitted 7 June, 2017; originally announced June 2017.

  10. arXiv:1703.09207  [pdf, ps, other]

    stat.ML

    Fairness in Criminal Justice Risk Assessments: The State of the Art

    Authors: Richard Berk, Hoda Heidari, Shahin Jabbari, Michael Kearns, Aaron Roth

    Abstract: Objectives: Discussions of fairness in criminal justice risk assessments typically lack conceptual precision. Rhetoric too often substitutes for careful analysis. In this paper, we seek to clarify the tradeoffs between different kinds of fairness and between fairness and accuracy. Methods: We draw on the existing literatures in criminology, computer science and statistics to provide an integrate…

    Submitted 27 May, 2017; v1 submitted 27 March, 2017; originally announced March 2017.

    Comments: Under a Revise and Resubmit

  11. arXiv:1511.00273  [pdf, other]

    stat.ME

    Calibrated Percentile Double Bootstrap For Robust Linear Regression Inference

    Authors: Daniel McCarthy, Kai Zhang, Lawrence Brown, Richard Berk, Andreas Buja, Edward George, Linda Zhao

    Abstract: We consider inference for the parameters of a linear model when the covariates are random and the relationship between response and covariates is possibly non-linear. Conventional inference methods such as z-intervals perform poorly in these cases. We propose a double bootstrap-based calibrated percentile method, perc-cal, as a general-purpose CI method which performs very well relative to alterna…

    Submitted 16 January, 2017; v1 submitted 1 November, 2015; originally announced November 2015.

    MSC Class: 62F40

  12. arXiv:1409.1798  [pdf, other]

    stat.AP

    Using Regression Kernels to Forecast A Failure to Appear in Court

    Authors: Richard Berk, Justin Bleich, Adam Kapelner, Jaime Henderson, Geoffrey Barnes, Ellen Kurtz

    Abstract: Forecasts of prospective criminal behavior have long been an important feature of many criminal justice decisions. There is now substantial evidence that machine learning procedures will classify and forecast at least as well as, and typically better than, logistic regression, which has to date dominated conventional practice. However, machine learning procedures are adaptive. They "learn" inductivel…

    Submitted 5 September, 2014; originally announced September 2014.

    Comments: 43 pages, 5 figures, 2 tables, 2 appendices

  13. Evaluating the Effectiveness of Personalized Medicine with Software

    Authors: Adam Kapelner, Justin Bleich, Alina Levine, Zachary D. Cohen, Robert J. DeRubeis, Richard Berk

    Abstract: We present methodological advances in understanding the effectiveness of personalized medicine models and supply easy-to-use open-source software. Personalized medicine involves the systematic use of individual patient characteristics to determine which treatment option is most likely to result in a better outcome for the patient on average. Why is personalized medicine not done more in practice?…

    Submitted 21 November, 2020; v1 submitted 30 April, 2014; originally announced April 2014.

    Comments: 36 pages, 3 figures, 1 table

  14. arXiv:1404.1578  [pdf, other]

    stat.ME

    Models as Approximations I: Consequences Illustrated with Linear Regression

    Authors: Andreas Buja, Richard Berk, Lawrence Brown, Edward George, Emil Pitkin, Mikhail Traskin, Linda Zhao, Kai Zhang

    Abstract: In the early 1980s Halbert White inaugurated a "model-robust" form of statistical inference based on the "sandwich estimator" of standard error. This estimator is known to be "heteroskedasticity-consistent", but it is less well-known to be "nonlinearity-consistent" as well. Nonlinearity, however, raises fundamental issues because in its presence regressors are not ancillary, hence can't be trea…

    Submitted 6 July, 2019; v1 submitted 6 April, 2014; originally announced April 2014.

    Comments: Submitted

  15. arXiv:1311.0291  [pdf, other]

    stat.ME

    Improved Precision in Estimating Average Treatment Effects

    Authors: Emil Pitkin, Richard Berk, Lawrence Brown, Andreas Buja, Ed George, Kai Zhang, Linda Zhao

    Abstract: The Average Treatment Effect (ATE) is a global measure of the effectiveness of an experimental treatment intervention. Classical methods of its estimation either ignore relevant covariates or do not fully exploit them. Moreover, past work has considered covariates as fixed. We present a method for improving the precision of the ATE estimate: the treatment and control responses are estimated via a…

    Submitted 1 November, 2013; originally announced November 2013.

    Comments: 22 pages, 1 figure

  16. Small area estimation of the homeless in Los Angeles: An application of cost-sensitive stochastic gradient boosting

    Authors: Brian Kriegler, Richard Berk

    Abstract: In many metropolitan areas efforts are made to count the homeless to ensure proper provision of social services. Some areas are very large, which makes spatial sampling a viable alternative to an enumeration of the entire terrain. Counts are observed in sampled regions but must be imputed in unvisited areas. Along with the imputation process, the costs of underestimating and overestimating may be…

    Submitted 12 November, 2010; originally announced November 2010.

    Comments: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/10-AOAS328

    Report number: IMS-AOAS-AOAS328

    Journal ref: Annals of Applied Statistics 2010, Vol. 4, No. 3, 1234-1255

  17. Counting the homeless in Los Angeles County

    Authors: Richard Berk, Brian Kriegler, Donald Ylvisaker

    Abstract: Over the past two decades, a variety of methods have been used to count the homeless in large metropolitan areas. In this paper, we report on an effort to count the homeless in Los Angeles County, one that employed the sampling of census tracts. A number of complications are discussed, including the need to impute homeless counts to areas of the County not sampled. We conclude that, despite t…

    Submitted 19 May, 2008; originally announced May 2008.

    Comments: Published in the IMS Collections (http://www.imstat.org/publications/imscollections.htm) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/193940307000000428

    Report number: IMS-COLL2-IMSCOLL209

    MSC Class: 62P25 (Primary)

    Journal ref: IMS Collections 2008, Vol. 2, 127-141