Showing 1–50 of 86 results for author: Bates, S

  1. arXiv:2403.19605  [pdf, other]

    stat.ME cs.LG

    Data-Adaptive Tradeoffs among Multiple Risks in Distribution-Free Prediction

    Authors: Drew T. Nguyen, Reese Pathak, Anastasios N. Angelopoulos, Stephen Bates, Michael I. Jordan

    Abstract: Decision-making pipelines are generally characterized by tradeoffs among various risk functions. It is often desirable to manage such tradeoffs in a data-adaptive manner. As we demonstrate, if this is done naively, state-of-the-art uncertainty quantification methods can lead to significant violations of putative risk guarantees. To address this issue, we develop methods that permit valid control…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: 27 pages, 10 figures

  2. arXiv:2402.01139  [pdf, other]

    stat.ML cs.LG stat.ME

    Online conformal prediction with decaying step sizes

    Authors: Anastasios N. Angelopoulos, Rina Foygel Barber, Stephen Bates

    Abstract: We introduce a method for online conformal prediction with decaying step sizes. Like previous methods, ours possesses a retrospective guarantee of coverage for arbitrary sequences. However, unlike previous methods, we can simultaneously estimate a population quantile when it exists. Our theory and experiments indicate substantially improved practical properties: in particular, when the distributio…

    Submitted 28 May, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  3. arXiv:2309.07435  [pdf, other]

    stat.ME

    Uncertainty Intervals for Prediction Errors in Time Series Forecasting

    Authors: Hui Xu, Song Mei, Stephen Bates, Jonathan Taylor, Robert Tibshirani

    Abstract: Inference for prediction errors is critical in time series forecasting pipelines. However, providing statistically meaningful uncertainty intervals for prediction errors remains relatively under-explored. Practitioners often resort to forward cross-validation (FCV) for obtaining point estimators and constructing confidence intervals based on the Central Limit Theorem (CLT). The naive version assum…

    Submitted 14 September, 2023; originally announced September 2023.

    Comments: 35 pages, 17 figures

  4. arXiv:2309.01837  [pdf, other]

    cs.LG stat.ML

    Delegating Data Collection in Decentralized Machine Learning

    Authors: Nivasini Ananthakrishnan, Stephen Bates, Michael I. Jordan, Nika Haghtalab

    Abstract: Motivated by the emergence of decentralized machine learning (ML) ecosystems, we study the delegation of data collection. Taking the field of contract theory as our starting point, we design optimal and near-optimal contracts that deal with two fundamental information asymmetries that arise in decentralized ML: uncertainty in the assessment of model quality and uncertainty regarding the optimal pe…

    Submitted 2 May, 2024; v1 submitted 4 September, 2023; originally announced September 2023.

  5. arXiv:2307.03748  [pdf, other]

    stat.ME cs.GT cs.LG stat.ML

    Incentive-Theoretic Bayesian Inference for Collaborative Science

    Authors: Stephen Bates, Michael I. Jordan, Michael Sklar, Jake A. Soloff

    Abstract: Contemporary scientific research is a distributed, collaborative endeavor, carried out by teams of researchers, regulatory institutions, funding agencies, commercial partners, and scientific bodies, all interacting with each other and facing different incentives. To maintain scientific rigor, statistical methods should acknowledge this state of affairs. To this end, we study hypothesis testing whe…

    Submitted 8 February, 2024; v1 submitted 7 July, 2023; originally announced July 2023.

  6. arXiv:2306.09335  [pdf, other]

    stat.ML cs.CV cs.LG stat.ME

    Class-Conditional Conformal Prediction with Many Classes

    Authors: Tiffany Ding, Anastasios N. Angelopoulos, Stephen Bates, Michael I. Jordan, Ryan J. Tibshirani

    Abstract: Standard conformal prediction methods provide a marginal coverage guarantee, which means that for a random test point, the conformal prediction set contains the true label with a user-specified probability. In many classification problems, we would like to obtain a stronger guarantee--that for test points of a specific class, the prediction set contains the true label with the same user-chosen pro…

    Submitted 27 October, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

  7. arXiv:2305.14595  [pdf, other]

    cs.LG cs.CY cs.GT

    Operationalizing Counterfactual Metrics: Incentives, Ranking, and Information Asymmetry

    Authors: Serena Wang, Stephen Bates, P. M. Aronow, Michael I. Jordan

    Abstract: From the social sciences to machine learning, it has been well documented that metrics to be optimized are not always aligned with social welfare. In healthcare, Dranove et al. (2003) showed that publishing surgery mortality metrics actually harmed the welfare of sicker patients by increasing provider selection behavior. We analyze the incentive misalignments that arise from such average treated o…

    Submitted 29 November, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

  8. arXiv:2303.09045  [pdf]

    cs.LG cs.CR

    Web and Mobile Platforms for Managing Elections based on IoT And Machine Learning Algorithms

    Authors: G. M. I. K. Galagoda, W. M. C. A. Karunarathne, R. S. Bates, K. M. H. V. P. Gangathilaka, Kanishka Yapa, Erandika Gamage

    Abstract: The global pandemic situation has severely affected all countries. As a result, almost all countries had to adjust to online technologies to continue their processes. In addition, Sri Lanka spends ten billion on elections every year. We have examined a proper way of minimizing the cost of hosting these events online. To solve the existing problems and increase the time potency and cost reduction…

    Submitted 15 March, 2023; originally announced March 2023.

    Journal ref: International Journal of Engineering Applied Sciences and Technology, 2022, Vol 7, No 7, 29-35

  9. arXiv:2301.09633  [pdf, other]

    stat.ML cs.AI cs.LG q-bio.QM stat.ME

    Prediction-Powered Inference

    Authors: Anastasios N. Angelopoulos, Stephen Bates, Clara Fannjiang, Michael I. Jordan, Tijana Zrnic

    Abstract: Prediction-powered inference is a framework for performing valid statistical inference when an experimental dataset is supplemented with predictions from a machine-learning system. The framework yields simple algorithms for computing provably valid confidence intervals for quantities such as means, quantiles, and linear and logistic regression coefficients, without making any assumptions on the ma…

    Submitted 9 November, 2023; v1 submitted 23 January, 2023; originally announced January 2023.

    Comments: Code is available at https://github.com/aangelopoulos/ppi_py

  10. arXiv:2211.05732  [pdf, other]

    cs.GT cs.AI cs.LG econ.TH

    The Sample Complexity of Online Contract Design

    Authors: Banghua Zhu, Stephen Bates, Zhuoran Yang, Yixin Wang, Jiantao Jiao, Michael I. Jordan

    Abstract: We study the hidden-action principal-agent problem in an online setting. In each round, the principal posts a contract that specifies the payment to the agent based on each outcome. The agent then makes a strategic choice of action that maximizes her own utility, but the action is not directly observable by the principal. The principal observes the outcome and receives utility from the agent's cho…

    Submitted 19 May, 2023; v1 submitted 10 November, 2022; originally announced November 2022.

  11. arXiv:2209.14295  [pdf, other]

    cs.LG cs.AI math.ST stat.ME stat.ML

    Conformal Prediction is Robust to Dispersive Label Noise

    Authors: Shai Feldman, Bat-Sheva Einbinder, Stephen Bates, Anastasios N. Angelopoulos, Asaf Gendler, Yaniv Romano

    Abstract: We study the robustness of conformal prediction, a powerful tool for uncertainty quantification, to label noise. Our analysis tackles both regression and classification problems, characterizing when and how it is possible to construct uncertainty sets that correctly cover the unobserved noiseless ground truth labels. We further extend our theory and formulate the requirements for correctly control…

    Submitted 19 September, 2023; v1 submitted 28 September, 2022; originally announced September 2022.

  12. arXiv:2208.02814  [pdf, other]

    stat.ME cs.AI cs.LG math.ST stat.ML

    Conformal Risk Control

    Authors: Anastasios N. Angelopoulos, Stephen Bates, Adam Fisch, Lihua Lei, Tal Schuster

    Abstract: We extend conformal prediction to control the expected value of any monotone loss function. The algorithm generalizes split conformal prediction together with its coverage guarantee. Like conformal prediction, the conformal risk control procedure is tight up to an $\mathcal{O}(1/n)$ factor. We also introduce extensions of the idea to distribution shift, quantile risk control, multiple and adversar…

    Submitted 29 April, 2023; v1 submitted 4 August, 2022; originally announced August 2022.

    Comments: Code available at https://github.com/aangelopoulos/conformal-risk

  13. arXiv:2207.10074  [pdf, other]

    cs.CV cs.AI cs.LG stat.ML

    Semantic uncertainty intervals for disentangled latent spaces

    Authors: Swami Sankaranarayanan, Anastasios N. Angelopoulos, Stephen Bates, Yaniv Romano, Phillip Isola

    Abstract: Meaningful uncertainty quantification in computer vision requires reasoning about semantic information -- say, the hair color of the person in a photo or the location of a car on the street. To this end, recent breakthroughs in generative modeling allow us to represent semantic information in disentangled latent spaces, but providing uncertainties on the semantic latent variables has remained chal…

    Submitted 30 November, 2022; v1 submitted 20 July, 2022; originally announced July 2022.

    Comments: Accepted to NeurIPS 2022. Project page: https://swamiviv.github.io/semantic_uncertainty_intervals/

  14. arXiv:2207.01609  [pdf, other]

    cs.IR cs.LG stat.ML

    Recommendation Systems with Distribution-Free Reliability Guarantees

    Authors: Anastasios N. Angelopoulos, Karl Krauth, Stephen Bates, Yixin Wang, Michael I. Jordan

    Abstract: When building recommendation systems, we seek to output a helpful set of items to the user. Under the hood, a ranking model predicts which of two candidate items is better, and we must distill these pairwise comparisons into the user-facing output. However, a learned ranking model is never perfect, so taking its predictions at face value gives no guarantee that the user-facing output is reliable.…

    Submitted 4 July, 2022; originally announced July 2022.

  15. arXiv:2206.02757  [pdf, other]

    cs.LG cs.AI stat.ML

    Robust Calibration with Multi-domain Temperature Scaling

    Authors: Yaodong Yu, Stephen Bates, Yi Ma, Michael I. Jordan

    Abstract: Uncertainty quantification is essential for the reliable deployment of machine learning models to high-stakes application domains. Uncertainty quantification is all the more challenging when training distribution and test distribution are different, even when the distribution shifts are mild. Despite the ubiquity of distribution shifts in real-world applications, existing uncertainty quantification app…

    Submitted 6 June, 2022; originally announced June 2022.

  16. arXiv:2205.09095  [pdf, other]

    cs.LG stat.ML

    Achieving Risk Control in Online Learning Settings

    Authors: Shai Feldman, Liran Ringel, Stephen Bates, Yaniv Romano

    Abstract: To provide rigorous uncertainty quantification for online learning models, we develop a framework for constructing uncertainty sets that provably control risk -- such as coverage of confidence intervals, false negative rate, or F1 score -- in the online setting. This extends conformal prediction to apply to a larger class of online learning problems. Our method guarantees risk control at any user-…

    Submitted 27 January, 2023; v1 submitted 18 May, 2022; originally announced May 2022.

  17. arXiv:2205.06812  [pdf, other]

    cs.GT cs.LG cs.MA math.ST stat.ME

    Principal-Agent Hypothesis Testing

    Authors: Stephen Bates, Michael I. Jordan, Michael Sklar, Jake A. Soloff

    Abstract: Consider the relationship between a regulator (the principal) and an experimenter (the agent) such as a pharmaceutical company. The pharmaceutical company wishes to sell a drug for profit, whereas the regulator wishes to allow only efficacious drugs to be marketed. The efficacy of the drug is not known to the regulator, so the pharmaceutical company must run a costly trial to prove efficacy to the…

    Submitted 15 April, 2024; v1 submitted 13 May, 2022; originally announced May 2022.

  18. arXiv:2202.05265  [pdf, other]

    cs.LG cs.CV eess.IV q-bio.QM stat.ML

    Image-to-Image Regression with Distribution-Free Uncertainty Quantification and Applications in Imaging

    Authors: Anastasios N Angelopoulos, Amit P Kohli, Stephen Bates, Michael I Jordan, Jitendra Malik, Thayer Alshaabi, Srigokul Upadhyayula, Yaniv Romano

    Abstract: Image-to-image regression is an important learning task, used frequently in biological imaging. Current algorithms, however, do not generally offer statistical guarantees that protect against a model's mistakes and hallucinations. To address this, we develop uncertainty quantification techniques with rigorous statistical guarantees for image-to-image regression problems. In particular, we show how…

    Submitted 10 February, 2022; originally announced February 2022.

    Comments: Code available at https://github.com/aangelopoulos/im2im-uq

  19. arXiv:2202.03613  [pdf, other]

    cs.LG q-bio.QM stat.ME

    Conformal prediction for the design problem

    Authors: Clara Fannjiang, Stephen Bates, Anastasios N. Angelopoulos, Jennifer Listgarten, Michael I. Jordan

    Abstract: Many applications of machine learning methods involve an iterative protocol in which data are collected, a model is trained, and then outputs of that model are used to choose what data to consider next. For example, one data-driven approach for designing proteins is to train a regression model to predict the fitness of protein sequences, then use it to propose new sequences believed to exhibit gre…

    Submitted 31 May, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

    Comments: for associated code, see https://github.com/clarafy/conformal-for-design

    Journal ref: Proc. Natl. Acad. Sci. 119 (43) e2204569119 (2022)

  20. arXiv:2201.13451  [pdf, other]

    stat.ME stat.CO

    Nonlinear Regression with Residuals: Causal Estimation with Time-varying Treatments and Covariates

    Authors: Stephen Bates, Edward Kennedy, Robert Tibshirani, Valerie Ventura, Larry Wasserman

    Abstract: Standard regression adjustment gives inconsistent estimates of causal effects when there are time-varying treatment effects and time-varying covariates. Loosely speaking, the issue is that some covariates are post-treatment variables because they may be affected by prior treatment status, and regressing out post-treatment variables causes bias. More precisely, the bias is due to certain non-confou…

    Submitted 10 March, 2024; v1 submitted 31 January, 2022; originally announced January 2022.

  21. arXiv:2201.11210  [pdf, other]

    stat.ME

    Confidence Intervals for the Generalisation Error of Random Forests

    Authors: Samyak Rajanala, Stephen Bates, Trevor Hastie, Robert Tibshirani

    Abstract: Out-of-bag error is commonly used as an estimate of generalisation error in ensemble-based learning models such as random forests. We present confidence intervals for this quantity using the delta-method-after-bootstrap and the jackknife-after-bootstrap techniques. These methods do not require growing any additional trees. We show that these new confidence intervals have improved coverage properti…

    Submitted 26 January, 2022; originally announced January 2022.

    Comments: 25 pages, 8 tables, 8 figures

  22. arXiv:2201.10547  [pdf, other]

    cs.LG cs.AI cs.MA

    Optimal Data Selection: An Online Distributed View

    Authors: Mariel Werner, Anastasios Angelopoulos, Stephen Bates, Michael I. Jordan

    Abstract: The blessing of ubiquitous data also comes with a curse: the communication, storage, and labeling of massive, mostly redundant datasets. We seek to solve this problem at its core, collecting only valuable data and throwing out the rest via submodular maximization. Specifically, we develop algorithms for the online and distributed version of the problem, where data selection occurs in an uncoordina…

    Submitted 14 December, 2023; v1 submitted 25 January, 2022; originally announced January 2022.

  23. arXiv:2110.01052  [pdf, other]

    cs.LG cs.AI cs.CV stat.ME stat.ML

    Learn then Test: Calibrating Predictive Algorithms to Achieve Risk Control

    Authors: Anastasios N. Angelopoulos, Stephen Bates, Emmanuel J. Candès, Michael I. Jordan, Lihua Lei

    Abstract: We introduce a framework for calibrating machine learning models so that their predictions satisfy explicit, finite-sample statistical guarantees. Our calibration algorithms work with any underlying model and (unknown) data-generating distribution and do not require model refitting. The framework addresses, among other examples, false discovery rate control in multi-label classification, intersect…

    Submitted 29 September, 2022; v1 submitted 3 October, 2021; originally announced October 2021.

    Comments: Code available at https://github.com/aangelopoulos/ltt

  24. arXiv:2110.00816  [pdf, other]

    cs.LG

    Calibrated Multiple-Output Quantile Regression with Representation Learning

    Authors: Shai Feldman, Stephen Bates, Yaniv Romano

    Abstract: We develop a method to generate predictive regions that cover a multivariate response variable with a user-specified probability. Our work is composed of two components. First, we use a deep generative model to learn a representation of the response that has a unimodal distribution. Existing multiple-output quantile regression approaches are effective in such cases, so we apply them on the learned…

    Submitted 23 December, 2022; v1 submitted 2 October, 2021; originally announced October 2021.

  25. arXiv:2109.13412  [pdf, other]

    cs.LG cs.CV

    Discriminative Attribution from Counterfactuals

    Authors: Nils Eckstein, Alexander S. Bates, Gregory S. X. E. Jefferis, Jan Funke

    Abstract: We present a method for neural network interpretability by combining feature attribution with counterfactual explanations to generate attribution maps that highlight the most discriminative features between pairs of classes. We show that this method can be used to quantitatively evaluate the performance of feature attribution methods in an objective manner, thus preventing potential observer bias.…

    Submitted 27 September, 2021; originally announced September 2021.

  26. arXiv:2107.07511  [pdf, other]

    cs.LG cs.AI math.ST stat.ME stat.ML

    A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification

    Authors: Anastasios N. Angelopoulos, Stephen Bates

    Abstract: Black-box machine learning models are now routinely used in high-risk settings, like medical diagnostics, which demand uncertainty quantification to avoid consequential model failures. Conformal prediction is a user-friendly paradigm for creating statistically rigorous uncertainty sets/intervals for the predictions of such models. Critically, the sets are valid in a distribution-free sense: they p…

    Submitted 7 December, 2022; v1 submitted 15 July, 2021; originally announced July 2021.

    Comments: Blog and tutorial video at http://angelopoulos.ai/blog/posts/gentle-intro/ ; Code is available at https://github.com/aangelopoulos/conformal-prediction

  27. arXiv:2106.12012  [pdf, other]

    cs.LG cs.DC stat.ML

    Test-time Collective Prediction

    Authors: Celestine Mendler-Dünner, Wenshuo Guo, Stephen Bates, Michael I. Jordan

    Abstract: An increasingly common setting in machine learning involves multiple parties, each with their own data, who want to jointly make predictions on future test points. Agents wish to benefit from the collective expertise of the full set of agents to make better predictions than they would individually, but may not be willing to release their data or model parameters. In this work, we explore a decentr…

    Submitted 22 June, 2021; originally announced June 2021.

  28. arXiv:2106.00394  [pdf, other]

    cs.LG

    Improving Conditional Coverage via Orthogonal Quantile Regression

    Authors: Shai Feldman, Stephen Bates, Yaniv Romano

    Abstract: We develop a method to generate prediction intervals that have a user-specified coverage level across all regions of feature-space, a property called conditional coverage. A typical approach to this task is to estimate the conditional quantiles with quantile regression -- it is well-known that this leads to correct coverage in the large-sample limit, although it may not be accurate in finite sampl…

    Submitted 2 October, 2021; v1 submitted 1 June, 2021; originally announced June 2021.

    Comments: 20 pages, 5 figures

  29. arXiv:2104.08279  [pdf, other]

    stat.ME math.ST stat.ML

    Testing for Outliers with Conformal p-values

    Authors: Stephen Bates, Emmanuel Candès, Lihua Lei, Yaniv Romano, Matteo Sesia

    Abstract: This paper studies the construction of p-values for nonparametric outlier detection, taking a multiple-testing perspective. The goal is to test whether new independent samples belong to the same distribution as a reference data set or are outliers. We propose a solution based on conformal inference, a broadly applicable framework which yields p-values that are marginally valid but mutually depende…

    Submitted 24 May, 2022; v1 submitted 16 April, 2021; originally announced April 2021.

    Comments: Revision May 24, 2022: added "asymptotic" and "Monte Carlo" conditional calibration methods; added power analyses; updated numerical experiments to include new methods

    Journal ref: Ann. Statist. 51(1): 149-178 (February 2023)

  30. arXiv:2104.00673  [pdf, other]

    stat.ME math.ST stat.CO stat.ML

    Cross-validation: what does it estimate and how well does it do it?

    Authors: Stephen Bates, Trevor Hastie, Robert Tibshirani

    Abstract: Cross-validation is a widely-used technique to estimate prediction error, but its behavior is complex and not fully understood. Ideally, one would like to think that cross-validation estimates the prediction error for the model at hand, fit to the training data. We prove that this is not the case for the linear model fit by ordinary least squares; rather it estimates the average prediction error o…

    Submitted 18 July, 2022; v1 submitted 1 April, 2021; originally announced April 2021.

  31. arXiv:2102.06202  [pdf, other]

    cs.LG cs.AI cs.CR stat.ME stat.ML

    Private Prediction Sets

    Authors: Anastasios N. Angelopoulos, Stephen Bates, Tijana Zrnic, Michael I. Jordan

    Abstract: In real-world settings involving consequential decision-making, the deployment of machine learning systems generally requires both reliable uncertainty quantification and protection of individuals' privacy. We present a framework that treats these two desiderata jointly. Our framework is based on conformal prediction, a methodology that augments predictive models to return prediction sets that pro…

    Submitted 3 March, 2024; v1 submitted 11 February, 2021; originally announced February 2021.

    Comments: Code available at https://github.com/aangelopoulos/private_prediction_sets

    Journal ref: Harvard Data Science Review, 4(2). 2022

  32. arXiv:2101.02703  [pdf, other]

    cs.LG cs.AI cs.CV stat.ME stat.ML

    Distribution-Free, Risk-Controlling Prediction Sets

    Authors: Stephen Bates, Anastasios Angelopoulos, Lihua Lei, Jitendra Malik, Michael I. Jordan

    Abstract: While improving prediction accuracy has been the focus of machine learning in recent years, this alone does not suffice for reliable decision-making. Deploying learning systems in consequential settings also requires calibrating and communicating the uncertainty of predictions. To convey instance-wise uncertainty for prediction tasks, we show how to generate set-valued predictions from a black-box…

    Submitted 4 August, 2021; v1 submitted 7 January, 2021; originally announced January 2021.

    Comments: Project website available at http://www.angelopoulos.ai/blog/posts/rcps/ and codebase available at https://github.com/aangelopoulos/rcps

  33. arXiv:2009.14193  [pdf, other]

    cs.CV math.ST stat.ML

    Uncertainty Sets for Image Classifiers using Conformal Prediction

    Authors: Anastasios Angelopoulos, Stephen Bates, Jitendra Malik, Michael I. Jordan

    Abstract: Convolutional image classifiers can achieve high predictive accuracy, but quantifying their uncertainty remains an unresolved challenge, hindering their deployment in consequential settings. Existing uncertainty quantification techniques, such as Platt scaling, attempt to calibrate the network's probability estimates, but they do not have formal guarantees. We present an algorithm that modifies an…

    Submitted 3 September, 2022; v1 submitted 29 September, 2020; originally announced September 2020.

    Comments: ICLR 2021 Spotlight, https://openreview.net/forum?id=eNdiU_DbM9 . Project website at https://people.eecs.berkeley.edu/~angelopoulos/blog/posts/conformal-classification/ . Codebase at https://github.com/aangelopoulos/conformal_classification

  34. arXiv:2006.04292  [pdf, other]

    stat.ML cs.LG stat.ME

    Achieving Equalized Odds by Resampling Sensitive Attributes

    Authors: Yaniv Romano, Stephen Bates, Emmanuel J. Candès

    Abstract: We present a flexible framework for learning predictive models that approximately satisfy the equalized odds notion of fairness. This is achieved by introducing a general discrepancy functional that rigorously quantifies violations of this criterion. This differentiable functional is used as a penalty driving the model parameters towards equalized odds. To rigorously evaluate fitted models, we dev…

    Submitted 7 June, 2020; originally announced June 2020.

    Comments: 14 pages, 4 figures

  35. Causal Inference in Genetic Trio Studies

    Authors: Stephen Bates, Matteo Sesia, Chiara Sabatti, Emmanuel Candes

    Abstract: We introduce a method to rigorously draw causal inferences---inferences immune to all possible confounding---from genetic data that include parents and offspring. Causal conclusions are possible with these data because the natural randomness in meiosis can be viewed as a high-dimensional randomized experiment. We make this observation actionable by developing a novel conditional independence test…

    Submitted 22 February, 2020; originally announced February 2020.

    Journal ref: Proc. Natl. Acad. Sci. U.S.A. 117 (2020) 24117-24126

  36. The High Time Resolution Universe Pulsar Survey -- XVI. Discovery and timing of 40 pulsars from the southern Galactic plane

    Authors: A. D. Cameron, D. J. Champion, M. Bailes, V. Balakrishnan, E. D. Barr, C. G. Bassa, S. Bates, S. Bhandari, N. D. R. Bhat, M. Burgay, S. Burke-Spolaor, C. M. L. Flynn, A. Jameson, S. Johnston, M. J. Keith, M. Kramer, L. Levin, A. G. Lyne, C. Ng, E. Petroff, A. Possenti, D. A. Smith, B. W. Stappers, W. van Straten, C. Tiburzi, et al. (1 additional author not shown)

    Abstract: We present the results of processing an additional 44% of the High Time Resolution Universe South Low Latitude (HTRU-S LowLat) pulsar survey, the most sensitive blind pulsar survey of the southern Galactic plane to date. Our partially-coherent segmented acceleration search pipeline is designed to enable the discovery of pulsars in short, highly-accelerated orbits, while our 72-min integration leng…

    Submitted 6 January, 2020; originally announced January 2020.

    Comments: 28 pages, 9 figures, 13 tables

  37. Uncooled Microbolometer Arrays for Ground Based Astronomy

    Authors: Maisie F. Rashman, Iain A. Steele, Stuart D. Bates, Dave Copley, Steven N. Longmore

    Abstract: We describe the design and commissioning of a simple prototype, low-cost 10$μ$m imaging instrument. The system is built using commercially available components including an uncooled microbolometer array as a detector. The incorporation of adjustable germanium reimaging optics rescales the image to the appropriate plate scale for the 2-m diameter Liverpool Telescope. From observations of bright sola…

    Submitted 17 December, 2019; originally announced December 2019.

    Comments: Accepted for publication by MNRAS

  38. Metropolized Knockoff Sampling

    Authors: Stephen Bates, Emmanuel Candès, Lucas Janson, Wenshuo Wang

    Abstract: Model-X knockoffs is a wrapper that transforms essentially any feature importance measure into a variable selection algorithm, which discovers true effects while rigorously controlling the expected fraction of false positives. A frequently discussed challenge to apply this method is to construct knockoff variables, which are synthetic variables obeying a crucial exchangeability property with the e…

    Submitted 1 March, 2019; originally announced March 2019.

    Journal ref: Journal of the American Statistical Association, 116:535, 1413-1427, 2021

  39. The High Time Resolution Universe Pulsar Survey -- XV: completion of the intermediate latitude survey with the discovery and timing of 25 further pulsars

    Authors: M. Burgay, B. Stappers, M. Bailes, E. D. Barr, S. Bates, N. D. R. Bhat, S. Burke-Spolaor, A. D. Cameron, D. J. Champion, R. P. Eatough, C. M. L. Flynn, A. Jameson, S. Johnston, M. J. Keith, E. F. Keane, M. Kramer, L. Levin, C. Ng, E. Petroff, A. Possenti, W. van Straten, C. Tiburzi, L. Bondonneau, A. G. Lyne

    Abstract: We report on the latest six pulsars discovered through our standard pipeline in the intermediate-latitude region (|b| < 15 deg) of the Parkes High Time Resolution Universe Survey (HTRU). We also present timing solutions for the new discoveries and for 19 further pulsars for which only discovery parameters were previously published. Highlights of the presented sample include the isolated millisecon…

    Submitted 14 February, 2019; originally announced February 2019.

    Comments: Accepted for publication in MNRAS; 12 pages, 9 figures, 7 tables

  40. arXiv:1811.04929  [pdf, other]

    astro-ph.IM astro-ph.HE

    The High Time Resolution Universe survey XIV: Discovery of 23 pulsars through GPU-accelerated reprocessing

    Authors: V. Morello, E. D. Barr, S. Cooper, M. Bailes, S. Bates, N. D. R. Bhat, M. Burgay, S. Burke-Spolaor, A. D. Cameron, D. J. Champion, R. P. Eatough, C. M. L. Flynn, A. Jameson, S. Johnston, M. J. Keith, E. F. Keane, M. Kramer, L. Levin, C. Ng, E. Petroff, A. Possenti, B. W. Stappers, W. van Straten, C. Tiburzi

    Abstract: We have performed a new search for radio pulsars in archival data of the intermediate and high Galactic latitude parts of the Southern High Time Resolution Universe pulsar survey. This is the first time the entire dataset has been searched for binary pulsars, an achievement enabled by GPU-accelerated dedispersion and periodicity search codes nearly 50 times faster than the previously used pipeline…

    Submitted 12 November, 2018; originally announced November 2018.

    Comments: Accepted for publication in MNRAS, 14 pages, 5 figures, 10 tables

  41. A fast radio burst with a low dispersion measure

    Authors: E. Petroff, L. C. Oostrum, B. W. Stappers, M. Bailes, E. D. Barr, S. Bates, S. Bhandari, N. D. R. Bhat, M. Burgay, S. Burke-Spolaor, A. D. Cameron, D. J. Champion, R. P. Eatough, C. M. L. Flynn, A. Jameson, S. Johnston, E. F. Keane, M. J. Keith, L. Levin, V. Morello, C. Ng, A. Possenti, V. Ravi, W. van Straten, D. Thornton, et al. (1 additional author not shown)

    Abstract: Fast radio bursts (FRBs) are millisecond pulses of radio emission of seemingly extragalactic origin. More than 50 FRBs have now been detected, with only one seen to repeat. Here we present a new FRB discovery, FRB 110214, which was detected in the high latitude portion of the High Time Resolution Universe South survey at the Parkes telescope. FRB 110214 has one of the lowest dispersion measures of…

    Submitted 25 October, 2018; originally announced October 2018.

    Comments: 8 pages, 3 figures, accepted for publication in MNRAS

  42. Stable Frank-Kasper phases of self-assembled, soft matter spheres

    Authors: Abhiram Reddy, Michael B. Buckley, Akash Arora, Frank S. Bates, Kevin D. Dorfman, Gregory M. Grason

    Abstract: Single molecular species can self-assemble into Frank Kasper (FK) phases, finite approximants of dodecagonal quasicrystals, defying intuitive notions that thermodynamic ground states are maximally symmetric. FK phases are speculated to emerge as the minimal-distortional packings of space-filling spherical domains, but a precise quantitation of this distortion and how it affects assembly thermodyna…

    Submitted 5 June, 2018; originally announced June 2018.

    Comments: 40 pages, 22 figures

  43. Log-ratio Lasso: Scalable, Sparse Estimation for Log-ratio Models

    Authors: Stephen Bates, Robert Tibshirani

    Abstract: Positive-valued signal data is common in many biological and medical applications, where the data are often generated from imaging techniques such as mass spectrometry. In such a setting, the relative intensities of the raw features are often the scientifically meaningful quantities, so it is of interest to identify relevant features that take the form of log-ratios of the raw inputs. When includi…

    Submitted 4 September, 2017; originally announced September 2017.

    Journal ref: Biometrics 75 (2019) 613-624

  44. arXiv:1704.04760  [pdf]

    cs.AR cs.LG cs.NE

    In-Datacenter Performance Analysis of a Tensor Processing Unit

    Authors: Norman P. Jouppi, Cliff Young, Nishant Patil, David Patterson, Gaurav Agrawal, Raminder Bajwa, Sarah Bates, Suresh Bhatia, Nan Boden, Al Borchers, Rick Boyle, Pierre-luc Cantin, Clifford Chao, Chris Clark, Jeremy Coriell, Mike Daley, Matt Dau, Jeffrey Dean, Ben Gelb, Tara Vazir Ghaemmaghami, Rajendra Gottipati, William Gulland, Robert Hagmann, C. Richard Ho, Doug Hogberg, et al. (50 additional authors not shown)

    Abstract: Many architects believe that major improvements in cost-energy-performance must now come from domain-specific hardware. This paper evaluates a custom ASIC---called a Tensor Processing Unit (TPU)---deployed in datacenters since 2015 that accelerates the inference phase of neural networks (NN). The heart of the TPU is a 65,536 8-bit MAC matrix multiply unit that offers a peak throughput of 92 TeraOp…

    Submitted 16 April, 2017; originally announced April 2017.

    Comments: 17 pages, 11 figures, 8 tables. To appear at the 44th International Symposium on Computer Architecture (ISCA), Toronto, Canada, June 24-28, 2017

  45. arXiv:1605.08238  [pdf, ps, other]

    astro-ph.IM astro-ph.EP

    LOTUS: A low cost, ultraviolet spectrograph

    Authors: I. A. Steele, J. M. Marchant, H. E. Jermak, R. M. Barnsley, S. D. Bates, N. R. Clay, A. Fitzsimmons, E. Jehin, G. Jones, C. J. Mottram, R. J. Smith, C. Snodgrass, M. de Val-Borro

    Abstract: We describe the design, construction and commissioning of LOTUS; a simple, low-cost long-slit spectrograph for the Liverpool Telescope. The design is optimized for near-UV and visible wavelengths and uses all transmitting optics. It exploits the instrument focal plane field curvature to partially correct axial chromatic aberration. A stepped slit provides narrow (2.5x95 arcsec) and wide (5x25 arcs…

    Submitted 26 May, 2016; originally announced May 2016.

    Comments: Accepted for publication in MNRAS. 10 pages. 14 figures

  46. arXiv:1603.01151  [pdf, ps, other]

    astro-ph.HE astro-ph.IM astro-ph.SR

    New Discoveries from the Arecibo 327 MHz Drift Pulsar Survey Radio Transient Search

    Authors: J. S. Deneva, K. Stovall, M. A. McLaughlin, M. Bagchi, S. D. Bates, P. C. C. Freire, J. G. Martinez, F. Jenet, N. Garver-Daniels

    Abstract: We present Clusterrank, a new algorithm for identifying dispersed astrophysical pulses. Such pulses are commonly detected from Galactic pulsars and rotating radio transients (RRATs), which are neutron stars with sporadic radio emission. More recently, isolated, highly dispersed pulses dubbed fast radio bursts (FRBs) have been identified as the potential signature of an extragalactic cataclysmic ra…

    Submitted 30 March, 2016; v1 submitted 2 March, 2016; originally announced March 2016.

    Comments: 41 pages, 16 figures, 4 tables, accepted by ApJ; added minor corrections to final ApJ proof

  47. Five new Fast Radio Bursts from the HTRU high latitude survey: first evidence for two-component bursts

    Authors: D. J. Champion, E. Petroff, M. Kramer, M. J. Keith, M. Bailes, E. D. Barr, S. D. Bates, N. D. R. Bhat, M. Burgay, S. Burke-Spolaor, C. M. L. Flynn, A. Jameson, S. Johnston, C. Ng, L. Levin, A. Possenti, B. W. Stappers, W. van Straten, C. Tiburzi, A. G. Lyne

    Abstract: The detection of five new fast radio bursts (FRBs) found in the High Time Resolution Universe high latitude survey is presented. The rate implied is 6$^{+4}_{-3}\times~10^3$ (95%) FRBs sky$^{-1}$ day$^{-1}$ above a fluence of between 0.13 and 5.9 Jy ms for FRBs between 0.128 and 262 ms in duration. One of these FRBs has a clear two-component profile, each component is similar to the known populati…

    Submitted 24 November, 2015; originally announced November 2015.

    Comments: 5 pages, 1 figure, 1 table, submitted to MNRAS

  48. arXiv:1509.08805  [pdf, ps, other]

    astro-ph.HE astro-ph.SR gr-qc

    Pulsar J0453+1559: A Double Neutron Star System with a Large Mass Asymmetry

    Authors: J. G. Martinez, K. Stovall, P. C. C. Freire, J. S. Deneva, F. A. Jenet, M. A. McLaughlin, M. Bagchi, S. D. Bates, A. Ridolfi

    Abstract: To understand the nature of supernovae and neutron star (NS) formation, as well as binary stellar evolution and their interactions, it is important to probe the distribution of NS masses. Until now, all double NS (DNS) systems have been measured to have a mass ratio close to unity (q $\geq$ 0.91). Here we report the measurement of the individual masses of the 4.07-day binary pulsar J0453+1559 from…

    Submitted 29 September, 2015; originally announced September 2015.

  49. arXiv:1507.00906  [pdf, ps, other]

    astro-ph.IM

    IO:I: A Near-Infrared Camera for the Liverpool Telescope

    Authors: Robert Barnsley, Helen Jermak, Iain Steele, Robert Smith, Stuart Bates, Chris Mottram

    Abstract: IO:I is a new instrument that has recently been commissioned for the Liverpool Telescope, extending current imaging capabilities beyond the optical and into the near infrared. Cost has been minimised by use of a previously decommissioned instrument's cryostat as the base for a prototype and retrofitting it with Teledyne's 1.7$μm$ cutoff Hawaii-2RG HgCdTe detector, SIDECAR ASIC controller and JADE2…

    Submitted 21 December, 2015; v1 submitted 3 July, 2015; originally announced July 2015.

    Comments: v1: 35 pages, 18 figures. Submitted to the Journal of Astronomical Telescopes, Instruments, and Systems (JATIS). v2: post peer review

  50. arXiv:1505.00834  [pdf, ps, other]

    astro-ph.HE astro-ph.SR

    A search for rotating radio transients and fast radio bursts in the Parkes high-latitude pulsar survey

    Authors: A. Rane, D. R. Lorimer, S. D. Bates, N. McMann, M. A. McLaughlin, K. Rajwade

    Abstract: Discoveries of rotating radio transients and fast radio bursts (FRBs) in pulsar surveys suggest that more of such transient sources await discovery in archival data sets. Here we report on a single-pulse search for dispersed radio bursts over a wide range of Galactic latitudes (|b| < $60^{\circ}$) in data previously searched for periodic sources by Burgay et al. We re-detected 20 of the 42 pulsars…

    Submitted 15 October, 2015; v1 submitted 4 May, 2015; originally announced May 2015.

    Comments: Accepted, 10 pages, 6 figures