Showing 1–25 of 25 results for author: Dick, T

  1. arXiv:2406.02797  [pdf, other]

    cs.LG cs.CR

    Auditing Privacy Mechanisms via Label Inference Attacks

    Authors: Róbert István Busa-Fekete, Travis Dick, Claudio Gentile, Andrés Muñoz Medina, Adam Smith, Marika Swanberg

    Abstract: We propose reconstruction advantage measures to audit label privatization mechanisms. A reconstruction advantage measure quantifies the increase in an attacker's ability to infer the true label of an unlabeled example when provided with a private version of the labels in a dataset (e.g., aggregate of labels from different users or noisy labels output by randomized response), compared to an attacke…

    Submitted 4 June, 2024; originally announced June 2024.

  2. arXiv:2306.00920  [pdf, other]

    cs.LG cs.CR

    Better Private Linear Regression Through Better Private Feature Selection

    Authors: Travis Dick, Jennifer Gillenwater, Matthew Joseph

    Abstract: Existing work on differentially private linear regression typically assumes that end users can precisely set data bounds or algorithmic hyperparameters. End users often struggle to meet these requirements without directly examining the data (and violating privacy). Recent work has attempted to develop solutions that shift these burdens from users to algorithms, but they struggle to provide utility…

    Submitted 1 June, 2023; originally announced June 2023.

  3. arXiv:2304.07210  [pdf, other]

    cs.CR cs.LG

    Measuring Re-identification Risk

    Authors: CJ Carey, Travis Dick, Alessandro Epasto, Adel Javanmard, Josh Karlin, Shankar Kumar, Andres Munoz Medina, Vahab Mirrokni, Gabriel Henrique Nunes, Sergei Vassilvitskii, Peilin Zhong

    Abstract: Compact user representations (such as embeddings) form the backbone of personalization services. In this work, we present a new theoretical framework to measure re-identification risk in such user representations. Our framework, based on hypothesis testing, formally bounds the probability that an attacker may be able to obtain the identity of a user from their representation. As an application, we…

    Submitted 31 July, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

  4. arXiv:2303.01262  [pdf, other]

    cs.LG cs.CR cs.IT

    Subset-Based Instance Optimality in Private Estimation

    Authors: Travis Dick, Alex Kulesza, Ziteng Sun, Ananda Theertha Suresh

    Abstract: We propose a new definition of instance optimality for differentially private estimation algorithms. Our definition requires an optimal algorithm to compete, simultaneously for every dataset $D$, with the best private benchmark algorithm that (a) knows $D$ in advance and (b) is evaluated by its worst-case performance on large subsets of $D$. That is, the benchmark algorithm need not perform well w…

    Submitted 28 May, 2024; v1 submitted 1 March, 2023; originally announced March 2023.

  5. arXiv:2302.03115  [pdf, other]

    cs.LG stat.ML

    Easy Learning from Label Proportions

    Authors: Robert Istvan Busa-Fekete, Heejin Choi, Travis Dick, Claudio Gentile, Andres Munoz Medina

    Abstract: We consider the problem of Learning from Label Proportions (LLP), a weakly supervised classification setup where instances are grouped into "bags", and only the frequency of class labels at each bag is available. Albeit, the objective of the learner is to achieve low task loss at an individual instance level. Here we propose Easyllp: a flexible and simple-to-implement debiasing approach based on a…

    Submitted 13 February, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

  6. arXiv:2211.03128  [pdf, other]

    cs.CY cs.CR cs.LG

    Confidence-Ranked Reconstruction of Census Microdata from Published Statistics

    Authors: Travis Dick, Cynthia Dwork, Michael Kearns, Terrance Liu, Aaron Roth, Giuseppe Vietri, Zhiwei Steven Wu

    Abstract: A reconstruction attack on a private dataset $D$ takes as input some publicly accessible information about the dataset and produces a list of candidate elements of $D$. We introduce a new class of data reconstruction attacks based on randomized methods for non-convex optimization. We empirically demonstrate that our attacks can not only reconstruct full rows of $D$ from aggregate query statistics…

    Submitted 6 February, 2023; v1 submitted 6 November, 2022; originally announced November 2022.

  7. arXiv:2210.11222  [pdf, other]

    cs.CR cs.AI cs.DS cs.LG stat.ML

    Learning-Augmented Private Algorithms for Multiple Quantile Release

    Authors: Mikhail Khodak, Kareem Amin, Travis Dick, Sergei Vassilvitskii

    Abstract: When applying differential privacy to sensitive data, we can often improve performance using external information such as other sensitive data, public data, or human priors. We propose to use the learning-augmented algorithms (or algorithms with predictions) framework -- previously applied largely to improve time complexity or competitive ratios -- as a powerful way of designing and analyzing priv…

    Submitted 8 May, 2023; v1 submitted 20 October, 2022; originally announced October 2022.

    Comments: To appear in ICML 2023

  8. arXiv:2208.03291  [pdf]

    stat.AP

    Comparing Unit Trains versus Manifest Trains for the Risk of Rail Transport of Hazardous Materials -- Part II: Application and Case Study

    Authors: Di Kang, Jiaxi Zhao, C. Tyler Dick, Xiang Liu, Zheyong Bian, Steven W. Kirkpatrick, Chen-Yu Lin

    Abstract: Built upon the risk analysis methodology (presented in the part I paper), this part II paper focuses on applying this methodology. Five illustrative scenarios were used to analyze the best or worst cases and compare the transportation risk differences between service options using unit trains and manifest trains. The comparison results indicate that if all tank cars are placed at the positions wit…

    Submitted 4 July, 2022; originally announced August 2022.

  9. arXiv:2207.02113  [pdf]

    stat.AP stat.ME

    Comparing Unit Trains versus Manifest Trains for the Risk of Rail Transport of Hazardous Materials -- Part I: Risk Analysis Methodology

    Authors: Di Kang, Jiaxi Zhao, C. Tyler Dick, Xiang Liu, Zheyong Bian, Steven W. Kirkpatrick, Chen-Yu Lin

    Abstract: Transporting hazardous materials (hazmats) using tank cars has more significant economic benefits than other transportation modes. Although railway transportation is roughly four times more fuel-efficient than roadway transportation, a train derailment has greater potential to cause more disastrous consequences than a truck incident. Train types, such as unit train or manifest train (also called m…

    Submitted 4 July, 2022; originally announced July 2022.

  10. arXiv:2109.15279  [pdf, other]

    math.OC cs.CE

    Combining Sobolev Smoothing with Parameterized Shape Optimization

    Authors: Thomas Dick, Stephan Schmidt, Nicolas R. Gauger

    Abstract: On the one hand, Sobolev gradient smoothing can considerably improve the performance of aerodynamic shape optimization and prevent issues with regularity. On the other hand, Sobolev smoothing can also be interpreted as an approximation for the shape Hessian. This paper demonstrates, how Sobolev smoothing, interpreted as a shape Hessian approximation, offers considerable benefits, although the para…

    Submitted 19 March, 2022; v1 submitted 30 September, 2021; originally announced September 2021.

    Comments: Version 2 of the paper. We included revisions and suggestions from peer review. We corrected spelling and grammar errors throughout the paper

  11. arXiv:2012.10602  [pdf, other]

    cs.LG cs.CR stat.ML

    Scalable and Provably Accurate Algorithms for Differentially Private Distributed Decision Tree Learning

    Authors: Kaiwen Wang, Travis Dick, Maria-Florina Balcan

    Abstract: This paper introduces the first provably accurate algorithms for differentially private, top-down decision tree learning in the distributed setting (Balcan et al., 2012). We propose DP-TopDown, a general privacy preserving decision tree learning algorithm, and present two distributed implementations. Our first method NoisyCounts naturally extends the single machine algorithm by using the Laplace m…

    Submitted 22 February, 2021; v1 submitted 19 December, 2020; originally announced December 2020.

    Comments: In AAAI Workshop on Privacy-Preserving Artificial Intelligence, 2020

  12. arXiv:2006.07281  [pdf, other]

    cs.LG cs.CE cs.GT stat.ML

    Algorithms and Learning for Fair Portfolio Design

    Authors: Emily Diana, Travis Dick, Hadi Elzayn, Michael Kearns, Aaron Roth, Zachary Schutzman, Saeed Sharifi-Malvajerdi, Juba Ziani

    Abstract: We consider a variation on the classical finance problem of optimal portfolio design. In our setting, a large population of consumers is drawn from some distribution over risk tolerances, and each consumer must be assigned to a portfolio of lower risk than her tolerance. The consumers may also belong to underlying groups (for instance, of demographic properties or wealth), and the goal is to desig…

    Submitted 12 June, 2020; originally announced June 2020.

  13. arXiv:2002.03517  [pdf, other]

    cs.LG cs.CR stat.ML

    Random Smoothing Might be Unable to Certify $\ell_\infty$ Robustness for High-Dimensional Images

    Authors: Avrim Blum, Travis Dick, Naren Manoj, Hongyang Zhang

    Abstract: We show a hardness result for random smoothing to achieve certified adversarial robustness against attacks in the $\ell_p$ ball of radius $ε$ when $p>2$. Although random smoothing has been well understood for the $\ell_2$ case using the Gaussian distribution, much remains unknown concerning the existence of a noise distribution that works for the case of $p>2$. This has been posed as an open probl…

    Submitted 5 March, 2020; v1 submitted 9 February, 2020; originally announced February 2020.

    Comments: 20 pages, 2 figures; Code is available at https://github.com/hongyanz/TRADES-smoothing

  14. arXiv:1910.12190  [pdf, other]

    physics.bio-ph cond-mat.mtrl-sci

    Mechanoradicals in tensed tendon collagen as a new source of oxidative stress

    Authors: Christopher Zapp, Agnieszka Obarska-Kosinska, Benedikt Rennekamp, Davide Mercadante, Uladzimir Barayeu, Tobias P. Dick, Vasyl Denysenkov, Thomas Prisner, Marina Bennati, Csaba Daday, Reinhard Kappl, Frauke Gräter

    Abstract: As established nearly a century ago, mechanoradicals originate from homolytic bond scission in polymers. The existence, nature and biological relevance of mechanoradicals in proteins, instead, are unknown. We here show that mechanical stress on collagen produces radicals and subsequently reactive oxygen species, essential biological signaling molecules. Electron-paramagnetic resonance (EPR) spectr…

    Submitted 27 October, 2019; originally announced October 2019.

  15. arXiv:1908.02894  [pdf, other]

    cs.LG stat.ML

    How much data is sufficient to learn high-performing algorithms? Generalization guarantees for data-driven algorithm design

    Authors: Maria-Florina Balcan, Dan DeBlasio, Travis Dick, Carl Kingsford, Tuomas Sandholm, Ellen Vitercik

    Abstract: Algorithms often have tunable parameters that impact performance metrics such as runtime and solution quality. For many algorithms used in practice, no parameter settings admit meaningful worst-case bounds, so the parameters are made available for the user to tune. Alternatively, parameters may be tuned implicitly within the proof of a worst-case approximation ratio or runtime bound. Worst-case in…

    Submitted 25 April, 2021; v1 submitted 7 August, 2019; originally announced August 2019.

  16. arXiv:1907.09137  [pdf, other]

    cs.LG stat.ML

    Learning piecewise Lipschitz functions in changing environments

    Authors: Maria-Florina Balcan, Travis Dick, Dravyansh Sharma

    Abstract: Optimization in the presence of sharp (non-Lipschitz), unpredictable (w.r.t. time and amount) changes is a challenging and largely unexplored problem of great significance. We consider the class of piecewise Lipschitz functions, which is the most general online setting considered in the literature for the problem, and arises naturally in various combinatorial algorithm selection problems where uti…

    Submitted 6 August, 2020; v1 submitted 22 July, 2019; originally announced July 2019.

  17. arXiv:1907.00533  [pdf, other]

    cs.LG cs.DS

    Learning to Link

    Authors: Maria-Florina Balcan, Travis Dick, Manuel Lang

    Abstract: Clustering is an important part of many modern data analysis pipelines, including network analysis and data retrieval. There are many different clustering algorithms developed by various communities, and it is often not clear which algorithm will give the best performance on a specific clustering task. Similarly, we often have multiple ways to measure distances between data points, and the best cl…

    Submitted 2 October, 2019; v1 submitted 1 July, 2019; originally announced July 2019.

  18. arXiv:1904.09014  [pdf, other]

    cs.LG stat.ML

    Semi-bandit Optimization in the Dispersed Setting

    Authors: Maria-Florina Balcan, Travis Dick, Wesley Pegden

    Abstract: The goal of data-driven algorithm design is to obtain high-performing algorithms for specific application domains using machine learning and data. Across many fields in AI, science, and engineering, practitioners will often fix a family of parameterized algorithms and then optimize those parameters to obtain good performance on example instances from the application domain. In the online setting,…

    Submitted 21 December, 2020; v1 submitted 18 April, 2019; originally announced April 2019.

  19. arXiv:1809.08700  [pdf, other]

    cs.LG cs.GT stat.ML

    Envy-Free Classification

    Authors: Maria-Florina Balcan, Travis Dick, Ritesh Noothigattu, Ariel D. Procaccia

    Abstract: In classic fair division problems such as cake cutting and rent division, envy-freeness requires that each individual (weakly) prefer his allocation to anyone else's. On a conceptual level, we argue that envy-freeness also provides a compelling notion of fairness for classification tasks. Our technical focus is the generalizability of envy-free classification, i.e., understanding whether a classif…

    Submitted 24 September, 2020; v1 submitted 23 September, 2018; originally announced September 2018.

    Journal ref: Advances in Neural Information Processing Systems, 2019, pp. 1240-1250

  20. arXiv:1809.06987  [pdf, other]

    cs.DS cs.AI cs.LG

    Data-Driven Clustering via Parameterized Lloyd's Families

    Authors: Maria-Florina Balcan, Travis Dick, Colin White

    Abstract: Algorithms for clustering points in metric spaces is a long-studied area of research. Clustering has seen a multitude of work both theoretically, in understanding the approximation guarantees possible for many objective functions such as k-median and k-means clustering, and experimentally, in finding the fastest algorithms and seeding procedures for Lloyd's algorithm. The performance of a given cl…

    Submitted 24 May, 2019; v1 submitted 18 September, 2018; originally announced September 2018.

  21. arXiv:1803.10150  [pdf, other]

    cs.AI cs.DS

    Learning to Branch

    Authors: Maria-Florina Balcan, Travis Dick, Tuomas Sandholm, Ellen Vitercik

    Abstract: Tree search algorithms, such as branch-and-bound, are the most widely used tools for solving combinatorial and nonconvex problems. For example, they are the foremost method for solving (mixed) integer programs and constraint satisfaction problems. Tree search algorithms recursively partition the search space to find an optimal solution. In order to keep the tree size small, it is crucial to carefu…

    Submitted 16 May, 2018; v1 submitted 27 March, 2018; originally announced March 2018.

  22. arXiv:1711.03091  [pdf, other]

    cs.LG

    Dispersion for Data-Driven Algorithm Design, Online Learning, and Private Optimization

    Authors: Maria-Florina Balcan, Travis Dick, Ellen Vitercik

    Abstract: Data-driven algorithm design, that is, choosing the best algorithm for a specific application, is a crucial problem in modern data science. Practitioners often optimize over a parameterized algorithm family, tuning parameters based on problems from their domain. These procedures have historically come with no guarantees, though a recent line of work studies algorithm selection from a theoretical p…

    Submitted 22 October, 2018; v1 submitted 8 November, 2017; originally announced November 2017.

  23. arXiv:1611.04327  [pdf, other]

    math.OC

    On ideal dynamic climbing ropes

    Authors: Davit Harutyunyan, Graeme W. Milton, Trevor J. Dick, Justin Boyer

    Abstract: We consider the rope climber fall problem in two different settings. The simplest formulation of the problem is when the climber falls from a given altitude and is attached to one end of the rope while the other end of the rope is attached to the rock at a given height. The problem is then finding the properties of the rope for which the peak force felt by the climber during the fall is minimal. T…

    Submitted 14 November, 2016; originally announced November 2016.

    Comments: 21, 7 figures

  24. arXiv:1512.04848  [pdf, other]

    cs.LG cs.DS stat.ML

    Data Driven Resource Allocation for Distributed Learning

    Authors: Travis Dick, Mu Li, Venkata Krishna Pillutla, Colin White, Maria Florina Balcan, Alex Smola

    Abstract: In distributed machine learning, data is dispatched to multiple machines for processing. Motivated by the fact that similar data points often belong to the same or similar classes, and more generally, classification rules of high accuracy tend to be "locally simple but globally complex" (Vapnik & Bottou 1993), we propose data dependent dispatching that takes advantage of such structure. We present…

    Submitted 15 December, 2016; v1 submitted 15 December, 2015; originally announced December 2015.

  25. arXiv:1511.03225  [pdf, other]

    cs.LG

    Label Efficient Learning by Exploiting Multi-class Output Codes

    Authors: Maria Florina Balcan, Travis Dick, Yishay Mansour

    Abstract: We present a new perspective on the popular multi-class algorithmic techniques of one-vs-all and error correcting output codes. Rather than studying the behavior of these techniques for supervised learning, we establish a connection between the success of these methods and the existence of label-efficient learning procedures. We show that in both the realizable and agnostic cases, if output codes…

    Submitted 25 November, 2016; v1 submitted 10 November, 2015; originally announced November 2015.