
Showing 1–13 of 13 results for author: Barber, R

Searching in archive cs.
  1. arXiv:2405.15107  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Is Algorithmic Stability Testable? A Unified Framework under Computational Constraints

    Authors: Yuetian Luo, Rina Foygel Barber

    Abstract: Algorithmic stability is a central notion in learning theory that quantifies the sensitivity of an algorithm to small changes in the training data. If a learning algorithm satisfies certain stability properties, this leads to many important downstream implications, such as generalization, robustness, and reliable predictive inference. Verifying that stability holds for a particular algorithm is th…

    Submitted 23 May, 2024; originally announced May 2024.

  2. arXiv:2405.14064  [pdf, other]

    stat.ML cs.LG math.ST

    Building a stable classifier with the inflated argmax

    Authors: Jake A. Soloff, Rina Foygel Barber, Rebecca Willett

    Abstract: We propose a new framework for algorithmic stability in the context of multiclass classification. In practice, classification algorithms often operate by first assigning a continuous score (for instance, an estimated probability) to each possible label, then taking the maximizer -- i.e., selecting the class that has the highest score. A drawback of this type of approach is that it is inherently un…

    Submitted 22 May, 2024; originally announced May 2024.

  3. arXiv:2402.07388  [pdf, ps, other]

    math.ST cs.LG stat.ML

    The Limits of Assumption-free Tests for Algorithm Performance

    Authors: Yuetian Luo, Rina Foygel Barber

    Abstract: Algorithm evaluation and comparison are fundamental questions in machine learning and statistics -- how well does an algorithm perform at a given modeling task, and which algorithm performs best? Many methods have been developed to assess algorithm performance, often based around cross-validation type strategies, retraining the algorithm of interest on different subsets of the data and assessing i…

    Submitted 1 March, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

  4. arXiv:2402.01139  [pdf, other]

    stat.ML cs.LG stat.ME

    Online conformal prediction with decaying step sizes

    Authors: Anastasios N. Angelopoulos, Rina Foygel Barber, Stephen Bates

    Abstract: We introduce a method for online conformal prediction with decaying step sizes. Like previous methods, ours possesses a retrospective guarantee of coverage for arbitrary sequences. However, unlike previous methods, we can simultaneously estimate a population quantile when it exists. Our theory and experiments indicate substantially improved practical properties: in particular, when the distributio…

    Submitted 28 May, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  5. arXiv:2301.12600  [pdf, other]

    stat.ML cs.LG math.ST

    Bagging Provides Assumption-free Stability

    Authors: Jake A. Soloff, Rina Foygel Barber, Rebecca Willett

    Abstract: Bagging is an important technique for stabilizing machine learning models. In this paper, we derive a finite-sample guarantee on the stability of bagging for any model. Our result places no assumptions on the distribution of the data, on the properties of the base algorithm, or on the dimensionality of the covariates. Our guarantee applies to many variants of bagging and is optimal up to a constan…

    Submitted 25 April, 2024; v1 submitted 29 January, 2023; originally announced January 2023.

  6. arXiv:2111.15546  [pdf, ps, other]

    cs.LG math.ST

    Black-box tests for algorithmic stability

    Authors: Byol Kim, Rina Foygel Barber

    Abstract: Algorithmic stability is a concept from learning theory that expresses the degree to which changes to the input data (e.g., removal of a single data point) may affect the outputs of a regression algorithm. Knowing an algorithm's stability properties is often useful for many downstream applications -- for example, stability is known to lead to desirable generalization properties and predictive infe…

    Submitted 21 December, 2022; v1 submitted 30 November, 2021; originally announced November 2021.

    Comments: 37 pages. Minor edits to match the journal-submitted version

  7. arXiv:1908.05428  [pdf, other]

    stat.ME cs.CY stat.AP stat.ML

    With Malice Towards None: Assessing Uncertainty via Equalized Coverage

    Authors: Yaniv Romano, Rina Foygel Barber, Chiara Sabatti, Emmanuel J. Candès

    Abstract: An important factor to guarantee a fair use of data-driven recommendation systems is that we should be able to communicate their uncertainty to decision makers. This can be accomplished by constructing prediction intervals, which provide an intuitive measure of the limits of predictive performance. To support equitable treatment, we force the construction of such intervals to be unbiased in the se…

    Submitted 15 August, 2019; originally announced August 2019.

    Comments: 14 pages, 1 figure, 1 table

  8. arXiv:1908.01908  [pdf, other]

    cs.DB

    WiSer: A Highly Available HTAP DBMS for IoT Applications

    Authors: Ronald Barber, Christian Garcia-Arellano, Ronen Grosman, Guy Lohman, C. Mohan, Rene Muller, Hamid Pirahesh, Vijayshankar Raman, Richard Sidle, Adam Storm, Yuanyuan Tian, Pinar Tozun, Yingjun Wu

    Abstract: In a classic transactional distributed database management system (DBMS), write transactions invariably synchronize with a coordinator before final commitment. While enforcing serializability, this model has long been criticized for not satisfying the applications' availability requirements. When entering the era of Internet of Things (IoT), this problem has become more severe, as an increasing nu…

    Submitted 5 August, 2019; originally announced August 2019.

  9. arXiv:1903.11203  [pdf, other]

    cs.DB

    Designing Succinct Secondary Indexing Mechanism by Exploiting Column Correlations (Extended Version)

    Authors: Yingjun Wu, Jia Yu, Yuanyuan Tian, Richard Sidle, Ronald Barber

    Abstract: Database administrators construct secondary indexes on data tables to accelerate query processing in relational database management systems (RDBMSs). These indexes are built on top of the most frequently queried columns according to the data statistics. Unfortunately, maintaining multiple secondary indexes in the same database can be extremely space consuming, causing significant performance degra…

    Submitted 1 April, 2019; v1 submitted 26 March, 2019; originally announced March 2019.

    Comments: To appear in SIGMOD 2019

  10. arXiv:1805.06439  [pdf, other]

    stat.ML cs.LG

    Prediction Rule Reshaping

    Authors: Matt Bonakdarpour, Sabyasachi Chatterjee, Rina Foygel Barber, John Lafferty

    Abstract: Two methods are proposed for high-dimensional shape-constrained regression and classification. These methods reshape pre-trained prediction rules to satisfy shape constraints like monotonicity and convexity. The first method can be applied to any pre-trained prediction rule, while the second method deals specifically with random forests. In both cases, efficient algorithms are developed for comput…

    Submitted 16 May, 2018; originally announced May 2018.

  11. arXiv:1502.07641  [pdf, other]

    math.ST cs.LG

    ROCKET: Robust Confidence Intervals via Kendall's Tau for Transelliptical Graphical Models

    Authors: Rina Foygel Barber, Mladen Kolar

    Abstract: Undirected graphical models are used extensively in the biological and social sciences to encode a pattern of conditional independences between variables, where the absence of an edge between two nodes $a$ and $b$ indicates that the corresponding two variables $X_a$ and $X_b$ are believed to be conditionally independent, after controlling for all other measured variables. In the Gaussian case, con…

    Submitted 1 September, 2017; v1 submitted 26 February, 2015; originally announced February 2015.

  12. arXiv:1412.4451  [pdf, ps, other]

    math.ST cs.IT

    Privacy and Statistical Risk: Formalisms and Minimax Bounds

    Authors: Rina Foygel Barber, John C. Duchi

    Abstract: We explore and compare a variety of definitions for privacy and disclosure limitation in statistical estimation and data analysis, including (approximate) differential privacy, testing-based definitions of privacy, and posterior guarantees on disclosure risk. We give equivalence results between the definitions, shedding light on the relationships between different formalisms for privacy. We also t…

    Submitted 14 December, 2014; originally announced December 2014.

    Comments: 29 pages

  13. arXiv:1107.1974  [pdf]

    cs.SI physics.soc-ph

    On an Efficient Marie Curie Initial Training Network

    Authors: Ali Dinler, Cengis Hasan, Kamil Orucoglu, Robert W. Barber

    Abstract: Collaboration in science is one of the key components of world-class research. The European Commission supports collaboration between institutions and funds young researchers appointed by these partner institutions. In these networks, the mobility of the researchers is enforced in order to enhance the collaboration. In this study, based on a real Marie Curie Initial Training Network, an algorithm…

    Submitted 11 July, 2011; originally announced July 2011.

    Comments: Proceedings of the International Conference on Mathematical Finance and Economics (ICMFE-2011), Istanbul, Turkey, 6-8 July 2011