Showing 1–24 of 24 results for author: Benjamini, Y

  1. arXiv:2404.00319

    stat.ME stat.AP

    Direction Preferring Confidence Intervals

    Authors: Tzviel Frostig, Yoav Benjamini, Ruth Heller

    Abstract: Confidence intervals (CIs) are instrumental in statistical analysis, providing a range estimate of the parameters. In modern statistics, selective inference is common, where only certain parameters are highlighted. However, this selective approach can bias the inference, leading some to advocate for the use of CIs over p-values. To increase the flexibility of confidence intervals, we introduce dir…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: 11 figures, 45 pages

    MSC Class: 62P10

  2. arXiv:2307.15361

    stat.ML cs.AI cs.LG

    Confident Feature Ranking

    Authors: Bitya Neuhof, Yuval Benjamini

    Abstract: Machine learning models are widely applied in various fields. Stakeholders often use post-hoc feature importance methods to better understand the input features' contribution to the models' predictions. The interpretation of the importance values provided by these methods is frequently based on the relative order of the features (their ranking) rather than the importance values themselves. Since t…

    Submitted 18 April, 2024; v1 submitted 28 July, 2023; originally announced July 2023.

  3. arXiv:2303.13330

    stat.ME stat.ML

    Logistic Regression Equivalence: A Framework for Comparing Logistic Regression Models Across Populations

    Authors: Guy Ashiri-Prossner, Yuval Benjamini

    Abstract: In this paper we discuss how to evaluate the differences between fitted logistic regression models across sub-populations. Our motivating example is in studying computerized diagnosis for learning disabilities, where sub-populations based on gender may or may not require separate models. In this context, significance tests for hypotheses of no difference between populations may provide perverse in…

    Submitted 23 March, 2023; originally announced March 2023.

  4. arXiv:2111.07444

    stat.AP stat.ME

    Detecting Differences Between Correlation-Matrix Populations due to Single-variable Perturbations, with Application to Resting State fMRI

    Authors: Itamar Faran, Michael Peer, Shahar Arzy, Yuval Benjamini

    Abstract: Correlation matrices provide a useful way to characterize variable dependencies in many real-world problems. Often, a perturbation in few variables can lead to small differences in multiple correlation coefficients related to these variables. In this paper we propose a low-dimensional representation of these differences as a product of single-variable perturbations that can efficiently characteriz…

    Submitted 14 November, 2021; originally announced November 2021.

  5. arXiv:2010.15011

    cs.LG cs.AI stat.ML

    Predicting Classification Accuracy When Adding New Unobserved Classes

    Authors: Yuli Slavutsky, Yuval Benjamini

    Abstract: Multiclass classifiers are often designed and evaluated only on a sample from the classes on which they will eventually be applied. Hence, their final accuracy remains unknown. In this work we study how a classifier's performance over the initial class sample can be used to extrapolate its expected accuracy on a larger, unobserved set of classes. For this, we define a measure of separation between…

    Submitted 9 March, 2021; v1 submitted 28 October, 2020; originally announced October 2020.

    Journal ref: International Conference on Learning Representations (ICLR), 2021

  6. arXiv:2006.11585

    stat.ME

    Ignored evident multiplicity harms replicability -- adjusting for it offers a remedy

    Authors: Yoav Zeevi, Sofi Astashenko, Yoav Benjamini

    Abstract: It is a central dogma in science that a result of a study should be replicable. Only 90 of the 190 replication attempts were successful. We attribute a substantial part of the problem to selective inference evident in the paper, which is the practice of selecting some of the results from the many. 100 papers in the Reproducibility Project in Psychology were analyzed. It was evident that the repor…

    Submitted 19 May, 2021; v1 submitted 20 June, 2020; originally announced June 2020.

    Comments: 28 pages, 2 figures, 1 table

  7. arXiv:1912.10472

    stat.ME

    Testing the equality of multivariate means when $p>n$ by combining the Hotelling and Simes tests

    Authors: Tzviel Frostig, Yoav Benjamini

    Abstract: We propose a method of testing the shift between mean vectors of two multivariate Gaussian random variables in a high-dimensional setting incorporating the possible dependency and allowing $p > n$. This method is a combination of two well-known tests: the Hotelling test and the Simes test. The tests are integrated by sampling several dimensions at each iteration, testing each using the Hotelling t…

    Submitted 22 December, 2019; originally announced December 2019.

  8. arXiv:1907.06856

    stat.ME

    Quantifying replicability and consistency in systematic reviews

    Authors: Iman Jaljuli, Yoav Benjamini, Liat Shenhav, Orestis Panagiotou, Ruth Heller

    Abstract: Systematic reviews of interventions are important tools for synthesizing evidence from multiple studies. They serve to increase power and improve precision, in the same way that larger studies can do, but also to establish the consistency of effects and replicability of results across studies which are not identical. In this work we suggest incorporating replicability analysis tools to quantify t…

    Submitted 18 April, 2021; v1 submitted 16 July, 2019; originally announced July 2019.

  9. arXiv:1906.00505

    stat.ME math.ST stat.ML

    Confidence Intervals for Selected Parameters

    Authors: Yoav Benjamini, Yotam Hechtlinger, Philip B. Stark

    Abstract: Practical or scientific considerations often lead to selecting a subset of parameters as "important." Inferences about those parameters often are based on the same data used to select them in the first place. That can make the reported uncertainties deceptively optimistic: confidence intervals that ignore selection generally have less than their nominal coverage probability. Controlling the prob…

    Submitted 2 June, 2019; originally announced June 2019.

    Comments: 36 pages, 11 figures

  10. arXiv:1712.09713

    stat.ML cs.CV cs.LG

    Extrapolating Expected Accuracies for Large Multi-Class Problems

    Authors: Charles Zheng, Rakesh Achanta, Yuval Benjamini

    Abstract: The difficulty of multi-class classification generally increases with the number of classes. Using data from a subset of the classes, can we predict how well a classifier will scale with an increased number of classes? Under the assumptions that the classes are sampled identically and independently from a population, and that the classifier is based on independently learned scoring functions, we s…

    Submitted 27 December, 2017; originally announced December 2017.

    Comments: Submitted to JMLR

  11. arXiv:1705.07529

    stat.ME

    Testing hypotheses on a tree: new error rates and controlling strategies

    Authors: Marina Bogomolov, Christine B. Peterson, Yoav Benjamini, Chiara Sabatti

    Abstract: We introduce a multiple testing procedure (TreeBH) which addresses the challenge of controlling error rates at multiple levels of resolution. Conceptually, we frame this problem as the selection of hypotheses which are organized hierarchically in a tree structure. We describe a fast algorithm for the proposed sequential procedure, and prove that it controls relevant error rates given certain assum…

    Submitted 23 October, 2018; v1 submitted 21 May, 2017; originally announced May 2017.

  12. Better-Than-Chance Classification for Signal Detection

    Authors: Jonathan D. Rosenblatt, Yuval Benjamini, Roee Gilron, Roy Mukamel, Jelle J. Goeman

    Abstract: The estimated accuracy of a classifier is a random quantity with variability. A common practice in supervised machine learning is thus to test if the estimated accuracy is significantly better than chance level. This method of signal detection is particularly popular in neuroimaging and genetics. We provide evidence that using a classifier's accuracy as a test statistic can be an underpowered str…

    Submitted 14 December, 2017; v1 submitted 31 August, 2016; originally announced August 2016.

  13. arXiv:1606.05229

    stat.ML cs.IT

    Estimating mutual information in high dimensions via classification error

    Authors: Charles Y. Zheng, Yuval Benjamini

    Abstract: Multivariate pattern analysis approaches in neuroimaging are fundamentally concerned with investigating the quantity and type of information processed by various regions of the human brain; typically, estimates of classification accuracy are used to quantify information. While an extensive and powerful library of methods can be applied to train and assess classifiers, it is not always clear how to…

    Submitted 10 October, 2016; v1 submitted 16 June, 2016; originally announced June 2016.

  14. arXiv:1606.05228

    stat.ML cs.CV cs.IT cs.LG

    How many faces can be recognized? Performance extrapolation for multi-class classification

    Authors: Charles Y. Zheng, Rakesh Achanta, Yuval Benjamini

    Abstract: The difficulty of multi-class classification generally increases with the number of classes. Using data from a subset of the classes, can we predict how well a classifier will scale with an increased number of classes? Under the assumption that the classes are sampled exchangeably, and under the assumption that the classifier is generative (e.g. QDA or Naive Bayes), we show that the expected accur…

    Submitted 16 June, 2016; originally announced June 2016.

    Comments: Submitted to NIPS 2016

  15. arXiv:1504.00701

    stat.AP

    Many Phenotypes without Many False Discoveries: Error Controlling Strategies for Multi-Traits Association Studies

    Authors: Christine Peterson, Marina Bogomolov, Yoav Benjamini, Chiara Sabatti

    Abstract: The genetic basis of multiple phenotypes such as gene expression, metabolite levels, or imaging features is often investigated by testing a large collection of hypotheses, probing the existence of association between each of the traits and hundreds of thousands of genotyped variants. Appropriate multiplicity adjustment is crucial to guarantee replicability of findings, and False Discovery Rate (FD…

    Submitted 2 April, 2015; originally announced April 2015.

  16. arXiv:1503.02278

    stat.ME

    Testing for replicability in a follow-up study when the primary study hypotheses are two-sided

    Authors: Ruth Heller, Marina Bogomolov, Yoav Benjamini, Tamar Sofer

    Abstract: When testing for replication of results from a primary study with two-sided hypotheses in a follow-up study, we are usually interested in discovering the features with discoveries in the same direction in the two studies. The direction of testing in the follow-up study for each feature can therefore be decided by the primary study. We prove that in this case the methods suggested in Heller, Bogomo…

    Submitted 8 March, 2015; originally announced March 2015.

    Comments: arXiv admin note: text overlap with arXiv:1310.0606

  17. arXiv:1502.00088

    stat.AP

    Quantifying replicability in systematic reviews: the r-value

    Authors: Liat Shenhav, Ruth Heller, Yoav Benjamini

    Abstract: In order to assess the effect of a health care intervention, it is useful to look at an ensemble of relevant studies. The Cochrane Collaboration's admirable goal is to provide systematic reviews of all relevant clinical studies, in order to establish whether or not there is conclusive evidence about a specific intervention. This is done mainly by conducting a meta-analysis: a statistical synthes…

    Submitted 10 May, 2015; v1 submitted 31 January, 2015; originally announced February 2015.

  18. arXiv:1412.3242

    stat.ME

    Selective Correlations - the conditional estimators

    Authors: Yoav Benjamini, Amit Meir

    Abstract: The problem of Voodoo correlations is recognized in neuroimaging as the problem of estimating quantities of interest from the same data that was used to select them as interesting. In statistical terminology, the problem of inference following selection from the same data is that of selective inference. Motivated by the unwelcome side-effects of the recommended remedy: splitting the data. A method…

    Submitted 10 December, 2014; originally announced December 2014.

    Comments: 18 pages, 10 figures

  19. The shuffle estimator for explainable variance in fMRI experiments

    Authors: Yuval Benjamini, Bin Yu

    Abstract: In computational neuroscience, it is important to estimate well the proportion of signal variance in the total variance of neural activity measurements. This explainable variance measure helps neuroscientists assess the adequacy of predictive models that describe how images are encoded in the brain. Complicating the estimation problem are strong noise correlations, which may confound the neural re…

    Submitted 13 January, 2014; originally announced January 2014.

    Comments: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/13-AOAS681 by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS681

    Journal ref: Annals of Applied Statistics 2013, Vol. 7, No. 4, 2007-2033

  20. arXiv:1310.0606

    stat.AP stat.ME

    Deciding whether follow-up studies have replicated findings in a preliminary large-scale "omics" study

    Authors: Ruth Heller, Marina Bogomolov, Yoav Benjamini

    Abstract: We propose a formal method to declare that findings from a primary study have been replicated in a follow-up study. Our proposal is appropriate for primary studies that involve large-scale searches for rare true positives (i.e. needles in a haystack). Our proposal assigns an $r$-value to each finding; this is the lowest false discovery rate at which the finding can be called replicated. Examples a…

    Submitted 10 June, 2014; v1 submitted 2 October, 2013; originally announced October 2013.

    Journal ref: Proceedings of the National Academy of Sciences of the United States of America (PNAS), 2014 vol. 111 no. 46, 16262-16267

  21. Revisiting Multi-Subject Random Effects in fMRI: Advocating Prevalence Estimation

    Authors: Jonathan D. Rosenblatt, Matthijs Vink, Yoav Benjamini

    Abstract: Random Effects analysis has been introduced into fMRI research in order to generalize findings from the study group to the whole population. Generalizing findings is obviously harder than detecting activation in the study group since in order to be significant, an activation has to be larger than the inter-subject variability. Indeed, detected regions are smaller when using random effect analysis…

    Submitted 31 March, 2013; v1 submitted 14 December, 2012; originally announced December 2012.

  22. High-throughput data analysis in behavior genetics

    Authors: Anat Sakov, Ilan Golani, Dina Lipkind, Yoav Benjamini

    Abstract: In recent years, a growing need has arisen in different fields for the development of computational systems for automated analysis of large amounts of data (high-throughput). Dealing with nonstandard noise structure and outliers, that could have been detected and corrected in manual analysis, must now be built into the system with the aid of robust methods. We discuss such problems and present ins…

    Submitted 9 November, 2010; originally announced November 2010.

    Comments: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/09-AOAS304 by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS304

    Journal ref: Annals of Applied Statistics 2010, Vol. 4, No. 2, 743-763

  23. A simple forward selection procedure based on false discovery rate control

    Authors: Yoav Benjamini, Yulia Gavrilov

    Abstract: We propose the use of a new false discovery rate (FDR) controlling procedure as a model selection penalized method, and compare its performance to that of other penalized methods over a wide range of realistic settings: nonorthogonal design matrices, moderate and large pool of explanatory variables, and both sparse and nonsparse models, in the sense that they may include a small and large fracti…

    Submitted 18 May, 2009; originally announced May 2009.

    Comments: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/08-AOAS194 by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS194

    Journal ref: Annals of Applied Statistics 2009, Vol. 3, No. 1, 179-198

  24. Comment: Microarrays, Empirical Bayes and the Two-Groups Model

    Authors: Yoav Benjamini

    Abstract: Comment on "Microarrays, Empirical Bayes and the Two-Groups Model" [arXiv:0808.0572]

    Submitted 5 August, 2008; originally announced August 2008.

    Comments: Published in Statistical Science (http://www.imstat.org/sts/) at http://dx.doi.org/10.1214/07-STS236B by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-STS-STS236B

    Journal ref: Statistical Science 2008, Vol. 23, No. 1, 23-28