Showing 1–50 of 114 results for author: Suresh, A

Searching in archive cs.
  1. arXiv:2405.04354  [pdf, other]

    cs.IT eess.SP math.AG

    A transversality theorem for semi-algebraic sets with application to signal recovery from the second moment and cryo-EM

    Authors: Tamir Bendory, Nadav Dym, Dan Edidin, Arun Suresh

    Abstract: Semi-algebraic priors are ubiquitous in signal processing and machine learning. Prevalent examples include a) linear models where the signal lies in a low-dimensional subspace; b) sparse models where the signal can be represented by only a few coefficients under a suitable basis; and c) a large family of neural network generative models. In this paper, we prove a transversality theorem for semi-al…

    Submitted 10 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  2. arXiv:2404.11607  [pdf, other]

    cs.DS

    Private federated discovery of out-of-vocabulary words for Gboard

    Authors: Ziteng Sun, Peter Kairouz, Haicheng Sun, Adria Gascon, Ananda Theertha Suresh

    Abstract: The vocabulary of language models in Gboard, Google's keyboard application, plays a crucial role for improving user experience. One way to improve the vocabulary is to discover frequently typed out-of-vocabulary (OOV) words on user devices. This task requires strong privacy protection due to the sensitive nature of user input data. In this report, we present a private OOV discovery algorithm for G…

    Submitted 18 April, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

  3. arXiv:2404.09221  [pdf, other]

    cs.CL cs.AI cs.LG

    Exploring and Improving Drafts in Blockwise Parallel Decoding

    Authors: Taehyeon Kim, Ananda Theertha Suresh, Kishore Papineni, Michael Riley, Sanjiv Kumar, Adrian Benton

    Abstract: Despite the remarkable strides made by autoregressive language models, their potential is often hampered by the slow inference speeds inherent in sequential token generation. Blockwise parallel decoding (BPD) was proposed by Stern et al. as a method to improve inference speed of language models by simultaneously predicting multiple future tokens, termed block drafts, which are subsequently verifie…

    Submitted 5 June, 2024; v1 submitted 14 April, 2024; originally announced April 2024.

  4. arXiv:2404.01730  [pdf, other]

    cs.LG cs.IT stat.ML

    Asymptotics of Language Model Alignment

    Authors: Joy Qiping Yang, Salman Salamatian, Ziteng Sun, Ananda Theertha Suresh, Ahmad Beirami

    Abstract: Let $p$ denote a generative language model. Let $r$ denote a reward model that returns a scalar that captures the degree at which a draw from $p$ is preferred. The goal of language model alignment is to alter $p$ to a new distribution $φ$ that results in a higher expected reward while keeping $φ$ close to $p.$ A popular alignment method is the KL-constrained reinforcement learning (RL), which choo…

    Submitted 2 April, 2024; originally announced April 2024.

  5. arXiv:2403.10444  [pdf, other]

    cs.LG cs.CL cs.DS cs.IT

    Optimal Block-Level Draft Verification for Accelerating Speculative Decoding

    Authors: Ziteng Sun, Jae Hun Ro, Ahmad Beirami, Ananda Theertha Suresh

    Abstract: Speculative decoding has been shown to be an effective method for lossless acceleration of large language models (LLMs) during inference. In each iteration, the algorithm first uses a smaller model to draft a block of tokens. The tokens are then verified by the large model in parallel and only a subset of tokens will be kept to guarantee that the final output follows the distribution of the large model…

    Submitted 15 March, 2024; originally announced March 2024.

  6. arXiv:2403.08100  [pdf, other]

    cs.LG cs.CR cs.DC

    Efficient Language Model Architectures for Differentially Private Federated Learning

    Authors: Jae Hun Ro, Srinadh Bhojanapalli, Zheng Xu, Yanxiang Zhang, Ananda Theertha Suresh

    Abstract: Cross-device federated learning (FL) is a technique that trains a model on data distributed across typically millions of edge devices without data leaving the devices. SGD is the standard client optimizer for on-device training in cross-device FL, favored for its memory and computational efficiency. However, in centralized training of neural language models, adaptive optimizers are preferred as th…

    Submitted 12 March, 2024; originally announced March 2024.

  7. arXiv:2402.10161  [pdf, other]

    cs.RO cs.IT

    Robotic Exploration using Generalized Behavioral Entropy

    Authors: Aamodh Suresh, Carlos Nieto-Granda, Sonia Martinez

    Abstract: This work presents and evaluates a novel strategy for robotic exploration that leverages human models of uncertainty perception. To do this, we introduce a measure of uncertainty that we term "Behavioral entropy", which builds on Prelec's probability weighting from Behavioral Economics. We show that the new operator is an admissible generalized entropy, analyze its theoretical properties and com…

    Submitted 15 February, 2024; originally announced February 2024.

  8. arXiv:2401.01879  [pdf, other]

    cs.LG cs.CL cs.IT

    Theoretical guarantees on the best-of-n alignment policy

    Authors: Ahmad Beirami, Alekh Agarwal, Jonathan Berant, Alexander D'Amour, Jacob Eisenstein, Chirag Nagpal, Ananda Theertha Suresh

    Abstract: A simple and effective method for the alignment of generative models is the best-of-$n$ policy, where $n$ samples are drawn from a base policy, and ranked based on a reward function, and the highest ranking one is selected. A commonly used analytical expression in the literature claims that the KL divergence between the best-of-$n$ policy and the base policy is equal to $\log (n) - (n-1)/n.$ We di…

    Submitted 3 January, 2024; originally announced January 2024.

  9. arXiv:2312.06658  [pdf, other]

    cs.DS cs.CR cs.IT stat.ML

    Mean estimation in the add-remove model of differential privacy

    Authors: Alex Kulesza, Ananda Theertha Suresh, Yuyan Wang

    Abstract: Differential privacy is often studied under two different models of neighboring datasets: the add-remove model and the swap model. While the swap model is frequently used in the academic literature to simplify analysis, many practical applications rely on the more conservative add-remove model, where obtaining tight results can be difficult. Here, we study the problem of one-dimensional mean estim…

    Submitted 19 February, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

  10. arXiv:2312.03867  [pdf, other]

    cs.LG cs.CY cs.IT stat.ML

    Multi-Group Fairness Evaluation via Conditional Value-at-Risk Testing

    Authors: Lucas Monteiro Paes, Ananda Theertha Suresh, Alex Beutel, Flavio P. Calmon, Ahmad Beirami

    Abstract: Machine learning (ML) models used in prediction and classification tasks may display performance disparities across population groups determined by sensitive attributes (e.g., race, sex, age). We consider the problem of evaluating the performance of a fixed ML model across population groups defined by multiple sensitive attributes (e.g., race and sex and age). Here, the sample complexity for estim…

    Submitted 25 May, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

    Comments: Accepted for publication in the IEEE Journal on Selected Areas in Information Theory (JSAIT)

  11. arXiv:2311.08833  [pdf, ps, other]

    eess.SP cs.IT

    Phase retrieval with semi-algebraic and ReLU neural network priors

    Authors: Tamir Bendory, Nadav Dym, Dan Edidin, Arun Suresh

    Abstract: The key ingredient to retrieving a signal from its Fourier magnitudes, namely, to solve the phase retrieval problem, is an effective prior on the sought signal. In this paper, we study the phase retrieval problem under the prior that the signal lies in a semi-algebraic set. This is a very general prior as semi-algebraic sets include linear models, sparse models, and ReLU neural network generative…

    Submitted 15 November, 2023; originally announced November 2023.

  12. arXiv:2310.15141  [pdf, other]

    cs.LG cs.CL cs.DS cs.IT

    SpecTr: Fast Speculative Decoding via Optimal Transport

    Authors: Ziteng Sun, Ananda Theertha Suresh, Jae Hun Ro, Ahmad Beirami, Himanshu Jain, Felix Yu

    Abstract: Autoregressive sampling from large language models has led to state-of-the-art results in several natural language tasks. However, autoregressive sampling generates tokens one at a time making it slow, and even prohibitive in certain tasks. One way to speed up sampling is $\textit{speculative decoding}$: use a small model to sample a $\textit{draft}$ (block or sequence of tokens), and then score a…

    Submitted 17 January, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023

  13. arXiv:2309.11381  [pdf, other]

    cs.CL cs.CE cs.CY cs.SI

    Studying Lobby Influence in the European Parliament

    Authors: Aswin Suresh, Lazar Radojevic, Francesco Salvi, Antoine Magron, Victor Kristof, Matthias Grossglauser

    Abstract: We present a method based on natural language processing (NLP) for studying the influence of interest groups (lobbies) in the law-making process in the European Parliament (EP). We collect and analyze novel datasets of lobbies' position papers and speeches made by members of the EP (MEPs). By comparing these texts on the basis of semantic similarity and entailment, we are able to discover interpr…

    Submitted 20 September, 2023; originally announced September 2023.

    Comments: 11 pages, 5 figures. Under review for presentation at ICWSM 2024

  14. arXiv:2307.13347  [pdf, other]

    cs.DS cs.CR cs.IT

    Federated Heavy Hitter Recovery under Linear Sketching

    Authors: Adria Gascon, Peter Kairouz, Ziteng Sun, Ananda Theertha Suresh

    Abstract: Motivated by real-life deployments of multi-round federated analytics with secure aggregation, we investigate the fundamental communication-accuracy tradeoffs of the heavy hitter discovery and approximate (open-domain) histogram problems under a linear sketching constraint. We propose efficient algorithms based on local subsampling and invertible bloom look-up tables (IBLTs). We also show that our…

    Submitted 25 July, 2023; originally announced July 2023.

  15. arXiv:2307.11106  [pdf, other]

    cs.LG cs.CR cs.IT

    The importance of feature preprocessing for differentially private linear optimization

    Authors: Ziteng Sun, Ananda Theertha Suresh, Aditya Krishna Menon

    Abstract: Training machine learning models with differential privacy (DP) has received increasing interest in recent years. One of the most popular algorithms for training differentially private models is differentially private stochastic gradient descent (DPSGD) and its variants, where at each step gradients are clipped and combined with some noise. Given the increasing usage of DPSGD, we ask the question:…

    Submitted 19 February, 2024; v1 submitted 19 July, 2023; originally announced July 2023.

  16. arXiv:2307.08139  [pdf, other]

    cs.CL

    It's All Relative: Interpretable Models for Scoring Bias in Documents

    Authors: Aswin Suresh, Chi-Hsuan Wu, Matthias Grossglauser

    Abstract: We propose an interpretable model to score the bias present in web documents, based only on their textual content. Our model incorporates assumptions reminiscent of the Bradley-Terry axioms and is trained on pairs of revisions of the same Wikipedia article, where one version is more biased than the other. While prior approaches based on absolute bias classification have struggled to obtain a high…

    Submitted 16 July, 2023; originally announced July 2023.

    Comments: 12 pages

  17. arXiv:2307.06835  [pdf, ps, other]

    math.FA cs.IT math.AG

    The generic crystallographic phase retrieval problem

    Authors: Dan Edidin, Arun Suresh

    Abstract: In this paper we consider the problem of recovering a signal $x \in \mathbb{R}^N$ from its power spectrum assuming that the signal is sparse with respect to a generic basis for $\mathbb{R}^N$. Our main result is that if the sparsity level is at most $\sim\! N/2$ in this basis then the generic sparse vector is uniquely determined up to sign from its power spectrum. We also prove that if the sparsit…

    Submitted 24 July, 2023; v1 submitted 13 July, 2023; originally announced July 2023.

    Comments: 20 pages

    MSC Class: 42A10; 94A12; 94A15

  18. arXiv:2307.04905  [pdf, other]

    cs.LG cs.DC

    FedYolo: Augmenting Federated Learning with Pretrained Transformers

    Authors: Xuechen Zhang, Mingchen Li, Xiangyu Chang, Jiasi Chen, Amit K. Roy-Chowdhury, Ananda Theertha Suresh, Samet Oymak

    Abstract: The growth and diversity of machine learning applications motivate a rethinking of learning with mobile and edge devices. How can we address diverse client goals and learn with scarce heterogeneous data? While federated learning aims to address these issues, it has challenges hindering a unified solution. Large transformer models have been shown to work across a variety of tasks achieving remarkab…

    Submitted 10 July, 2023; originally announced July 2023.

    Comments: 20 pages, 18 figures

  19. arXiv:2303.08284  [pdf, other]

    cs.RO cs.AI

    Robot Navigation in Risky, Crowded Environments: Understanding Human Preferences

    Authors: Aamodh Suresh, Angelique Taylor, Laurel D. Riek, Sonia Martinez

    Abstract: Risky and crowded environments (RCE) contain abstract sources of risk and uncertainty, which are perceived differently by humans, leading to a variety of behaviors. Thus, robots deployed in RCEs need to exhibit diverse perception and planning capabilities in order to interpret other human agents' behavior and act accordingly in such environments. To understand this problem domain, we conducted a…

    Submitted 14 March, 2023; originally announced March 2023.

    Comments: Under review

  20. arXiv:2303.01262  [pdf, other]

    cs.LG cs.CR cs.IT

    Subset-Based Instance Optimality in Private Estimation

    Authors: Travis Dick, Alex Kulesza, Ziteng Sun, Ananda Theertha Suresh

    Abstract: We propose a new definition of instance optimality for differentially private estimation algorithms. Our definition requires an optimal algorithm to compete, simultaneously for every dataset $D$, with the best private benchmark algorithm that (a) knows $D$ in advance and (b) is evaluated by its worst-case performance on large subsets of $D$. That is, the benchmark algorithm need not perform well w…

    Submitted 28 May, 2024; v1 submitted 1 March, 2023; originally announced March 2023.

  21. arXiv:2302.09904  [pdf, other]

    cs.LG cs.CR cs.DC cs.IT

    WW-FL: Secure and Private Large-Scale Federated Learning

    Authors: Felix Marx, Thomas Schneider, Ajith Suresh, Tobias Wehrle, Christian Weinert, Hossein Yalame

    Abstract: Federated learning (FL) is an efficient approach for large-scale distributed machine learning that promises data privacy by keeping training data on client devices. However, recent research has uncovered vulnerabilities in FL, impacting both security and privacy through poisoning attacks and the potential disclosure of sensitive information in individual model updates as well as the aggregated glo…

    Submitted 30 May, 2024; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: WWFL combines private training and inference with secure aggregation and hierarchical FL to provide end-to-end protection and to facilitate large-scale global deployment

  22. arXiv:2302.06869  [pdf, other]

    stat.ML cs.DM cs.IT cs.LG math.PR

    Concentration Bounds for Discrete Distribution Estimation in KL Divergence

    Authors: Clément L. Canonne, Ziteng Sun, Ananda Theertha Suresh

    Abstract: We study the problem of discrete distribution estimation in KL divergence and provide concentration bounds for the Laplace estimator. We show that the deviation from mean scales as $\sqrt{k}/n$ when $n \ge k$, improving upon the best prior result of $k/n$. We also establish a matching lower bound that shows that our bounds are tight up to polylogarithmic factors.

    Submitted 12 June, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

    Comments: Updated discussion of previous work

  23. arXiv:2211.15913  [pdf, other]

    cs.LO

    Branch-Well-Structured Transition Systems and Extensions

    Authors: Benedikt Bollig, Alain Finkel, Amrita Suresh

    Abstract: We propose a relaxation to the definition of well-structured transition systems (WSTS) while retaining the decidability of boundedness and non-termination. In this class, the well-quasi-ordered (wqo) condition is relaxed such that it is applicable only between states that are reachable one from another. Furthermore, the monotony condition is relaxed in the same way. While this retains the decidab…

    Submitted 11 June, 2024; v1 submitted 28 November, 2022; originally announced November 2022.

  24. arXiv:2211.09727  [pdf, other]

    cond-mat.mtrl-sci cs.CV cs.LG eess.IV

    A Survey on Evaluation Metrics for Synthetic Material Micro-Structure Images from Generative Models

    Authors: Devesh Shah, Anirudh Suresh, Alemayehu Admasu, Devesh Upadhyay, Kalyanmoy Deb

    Abstract: The evaluation of synthetic micro-structure images is an emerging problem as machine learning and materials science research have evolved together. Typical state-of-the-art methods in evaluating synthetic images from generative models have relied on the Fréchet Inception Distance. However, this and other similar methods are limited in the materials domain due to both the unique features that char…

    Submitted 3 November, 2022; originally announced November 2022.

    Comments: Accepted in Neural Information Processing Systems (NeurIPS) 2022 Workshop on AI for Accelerated Materials Design (AI4Mat). Selected as spotlight paper for workshop

    ACM Class: I.2.m; J.2

  25. arXiv:2211.04367  [pdf, other]

    cs.LG cs.CV

    Much Easier Said Than Done: Falsifying the Causal Relevance of Linear Decoding Methods

    Authors: Lucas Hayne, Abhijit Suresh, Hunar Jain, Rahul Kumar, R. McKell Carter

    Abstract: Linear classifier probes are frequently utilized to better understand how neural networks function. Researchers have approached the problem of determining unit importance in neural networks by probing their learned, internal representations. Linear classifier probes identify highly selective units as the most important for network function. Whether or not a network actually relies on high selectiv…

    Submitted 8 November, 2022; originally announced November 2022.

    Comments: 6 pages, 3 figures, to be published in I Can't Believe It's Not Better Workshop at NeurIPS 2022

  26. ScionFL: Efficient and Robust Secure Quantized Aggregation

    Authors: Yaniv Ben-Itzhak, Helen Möllering, Benny Pinkas, Thomas Schneider, Ajith Suresh, Oleksandr Tkachenko, Shay Vargaftik, Christian Weinert, Hossein Yalame, Avishay Yanai

    Abstract: Secure aggregation is commonly used in federated learning (FL) to alleviate privacy concerns related to the central aggregator seeing all parameter updates in the clear. Unfortunately, most existing secure aggregation schemes ignore two critical orthogonal research directions that aim to (i) significantly reduce client-server communication and (ii) mitigate the impact of malicious clients. However…

    Submitted 17 May, 2024; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: Published in 2024 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML)

  27. arXiv:2210.03461  [pdf, other]

    cs.CV cs.AI cs.LG

    FastCLIPstyler: Optimisation-free Text-based Image Style Transfer Using Style Representations

    Authors: Ananda Padhmanabhan Suresh, Sanjana Jain, Pavit Noinongyao, Ankush Ganguly, Ukrit Watchareeruetai, Aubin Samacoits

    Abstract: In recent years, language-driven artistic style transfer has emerged as a new type of style transfer technique, eliminating the need for a reference style image by using natural language descriptions of the style. The first model to achieve this, called CLIPstyler, has demonstrated impressive stylisation results. However, its lengthy optimisation procedure at runtime for each query limits its suit…

    Submitted 14 November, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

    Comments: Accepted at the 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2024)

  28. arXiv:2208.06135  [pdf, other]

    cs.LG cs.CR stat.ML

    Private Domain Adaptation from a Public Source

    Authors: Raef Bassily, Mehryar Mohri, Ananda Theertha Suresh

    Abstract: A key problem in a variety of applications is that of domain adaptation from a public source domain, for which a relatively large amount of labeled data with no privacy constraints is at one's disposal, to a private target domain, for which a private sample is available with very few or no labeled data. In regression problems with no privacy constraints on the source or target data, a discrepancy…

    Submitted 12 August, 2022; originally announced August 2022.

  29. arXiv:2207.11250  [pdf, other]

    cs.CV cs.LG eess.IV

    Rich Feature Distillation with Feature Affinity Module for Efficient Image Dehazing

    Authors: Sai Mitheran, Anushri Suresh, Nisha J. S., Varun P. Gopi

    Abstract: Single-image haze removal is a long-standing hurdle for computer vision applications. Several works have been focused on transferring advances from image classification, detection, and segmentation to the niche of image dehazing, primarily focusing on contrastive learning and knowledge distillation. However, these approaches prove computationally expensive, raising concern regarding their applicab…

    Submitted 13 July, 2022; originally announced July 2022.

    Comments: Preprint version. Accepted at Optik

  30. arXiv:2206.12224  [pdf, other]

    cs.CR cs.DC cs.IT cs.LG

    MPClan: Protocol Suite for Privacy-Conscious Computations

    Authors: Nishat Koti, Shravani Patil, Arpita Patra, Ajith Suresh

    Abstract: The growing volumes of data being collected and its analysis to provide better services are creating worries about digital privacy. To address privacy concerns and give practical solutions, the literature has relied on secure multiparty computation. However, recent research has mostly focused on the small-party honest-majority setting of up to four parties, noting efficiency concerns. In this work…

    Submitted 24 June, 2022; originally announced June 2022.

  31. arXiv:2206.03776  [pdf, other]

    cs.CR

    High-Throughput Secure Multiparty Computation with an Honest Majority in Various Network Settings

    Authors: Christopher Harth-Kitzerow, Ajith Suresh, Yonqing Wang, Hossein Yalame, Georg Carle, Murali Annavaram

    Abstract: In this work, we present novel protocols over rings for semi-honest secure three-party computation (3PC) and malicious four-party computation (4PC) with one corruption. While most existing works focus on improving total communication complexity, challenges such as network heterogeneity and computational complexity, which impact MPC performance in practice, remain underexplored. Our protocols add…

    Submitted 28 June, 2024; v1 submitted 8 June, 2022; originally announced June 2022.

  32. arXiv:2206.03008  [pdf, other]

    cs.LG cs.CR

    Algorithms for bounding contribution for histogram estimation under user-level privacy

    Authors: Yuhan Liu, Ananda Theertha Suresh, Wennan Zhu, Peter Kairouz, Marco Gruteser

    Abstract: We study the problem of histogram estimation under user-level differential privacy, where the goal is to preserve the privacy of all entries of any single user. We consider the heterogeneous scenario where the quantity of data can be different for each user. In this scenario, the amount of noise injected into the histogram to obtain differential privacy is proportional to the maximum user contribu…

    Submitted 30 June, 2023; v1 submitted 7 June, 2022; originally announced June 2022.

    Comments: 32 pages, ICML 2023

  33. arXiv:2206.00539  [pdf, other]

    cs.CR cs.CY cs.SI

    Privacy-Preserving Epidemiological Modeling on Mobile Graphs

    Authors: Daniel Günther, Marco Holz, Benjamin Judkewitz, Helen Möllering, Benny Pinkas, Thomas Schneider, Ajith Suresh

    Abstract: Over the last two years, governments all over the world have used a variety of containment measures to control the spread of COVID-19, such as contact tracing, social distance regulations, and curfews. Epidemiological simulations are commonly used to assess the impact of those policies before they are implemented in actuality. Unfortunately, their predictive accuracy is hampered by the scarcity of…

    Submitted 1 June, 2022; originally announced June 2022.

  34. arXiv:2205.04961  [pdf, other]

    cs.CR

    Privadome: Protecting Citizen Privacy from Delivery Drones

    Authors: Gokulnath Pillai, Eikansh Gupta, Ajith Suresh, Vinod Ganapathy, Arpita Patra

    Abstract: As e-commerce companies begin to consider using delivery drones for customer fulfillment, there are growing concerns around citizen privacy. Drones are equipped with cameras, and the video feed from these cameras is often required as part of routine navigation, be it for semi-autonomous or fully-autonomous drones. Footage of ground-based citizens may be captured in this video feed, thereby leading…

    Submitted 10 May, 2022; originally announced May 2022.

  35. arXiv:2204.10376  [pdf, other]

    cs.LG stat.ML

    Differentially Private Learning with Margin Guarantees

    Authors: Raef Bassily, Mehryar Mohri, Ananda Theertha Suresh

    Abstract: We present a series of new differentially private (DP) algorithms with dimension-independent margin guarantees. For the family of linear hypotheses, we give a pure DP learning algorithm that benefits from relative deviation margin guarantees, as well as an efficient DP learning algorithm with margin guarantees. We also present a new efficient DP learning algorithm with margin guarantees for kernel…

    Submitted 21 April, 2022; originally announced April 2022.

  36. arXiv:2204.09715  [pdf, other]

    cs.CL cs.LG

    Scaling Language Model Size in Cross-Device Federated Learning

    Authors: Jae Hun Ro, Theresa Breiner, Lara McConnaughey, Mingqing Chen, Ananda Theertha Suresh, Shankar Kumar, Rajiv Mathews

    Abstract: Most studies in cross-device federated learning focus on small models, due to the server-client communication and on-device computation bottlenecks. In this work, we leverage various techniques for mitigating these bottlenecks to train larger language models in cross-device federated learning. With systematic applications of partial model training, quantization, efficient transfer learning, and co…

    Submitted 24 June, 2022; v1 submitted 31 March, 2022; originally announced April 2022.

  37. arXiv:2204.09652  [pdf, other]

    cs.CL cs.AI cs.LG

    The TalkMoves Dataset: K-12 Mathematics Lesson Transcripts Annotated for Teacher and Student Discursive Moves

    Authors: Abhijit Suresh, Jennifer Jacobs, Charis Harty, Margaret Perkoff, James H. Martin, Tamara Sumner

    Abstract: Transcripts of teaching episodes can be effective tools to understand discourse patterns in classroom instruction. According to most educational experts, sustained classroom discourse is a critical component of equitable, engaging, and rich learning environments for students. This paper describes the TalkMoves dataset, composed of 567 human-annotated K-12 mathematics lesson transcripts (including…

    Submitted 6 April, 2022; originally announced April 2022.

    Comments: 9 pages, 2 figures, Accepted for a Poster + Demo presentation at the 13th International Conference on Language Resources and Evaluation 2022

  38. arXiv:2203.15037  [pdf, other]

    cs.GT

    Online Algorithms for Matching Platforms with Multi-Channel Traffic

    Authors: Vahideh Manshadi, Scott Rodilitz, Daniela Saban, Akshaya Suresh

    Abstract: Two-sided platforms rely on their recommendation algorithms to help visitors successfully find a match. However, on platforms such as VolunteerMatch (VM) -- which has facilitated millions of connections between volunteers and nonprofits -- a sizable fraction of website traffic arrives directly to a nonprofit's volunteering page via an external link, thus bypassing the platform's recommendation alg…

    Submitted 1 August, 2023; v1 submitted 28 March, 2022; originally announced March 2022.

  39. arXiv:2203.04925  [pdf, other]

    cs.LG cs.DS cs.IT

    Correlated quantization for distributed mean estimation and optimization

    Authors: Ananda Theertha Suresh, Ziteng Sun, Jae Hun Ro, Felix Yu

    Abstract: We study the problem of distributed mean estimation and optimization under communication constraints. We propose a correlated quantization protocol whose leading term in the error guarantee depends on the mean deviation of data points rather than only their absolute range. The design doesn't need any prior knowledge on the concentration property of the dataset, which is required to get such depend…

    Submitted 8 July, 2022; v1 submitted 9 March, 2022; originally announced March 2022.

  40. arXiv:2203.03761  [pdf, other]

    cs.LG stat.ML

    The Fundamental Price of Secure Aggregation in Differentially Private Federated Learning

    Authors: Wei-Ning Chen, Christopher A. Choquette-Choo, Peter Kairouz, Ananda Theertha Suresh

    Abstract: We consider the problem of training a $d$-dimensional model with distributed differential privacy (DP) where secure aggregation (SecAgg) is used to ensure that the server only sees the noisy sum of $n$ model updates in every training round. Taking into account the constraints imposed by SecAgg, we characterize the fundamental communication cost required to obtain the best accuracy achievable under…

    Submitted 7 March, 2022; originally announced March 2022.

  41. arXiv:2202.00388  [pdf]

    cs.RO

    A Novel Method to Estimate Tilt Angle of a Body using a Pendulum

    Authors: Anandhu Suresh, Dr. Karnam Venkata Manga Raju

    Abstract: Most advanced control systems use sensor-based feedback for robust control. Tilt angle estimation is key feedback for many robotics and mechatronics applications in order to stabilize a system. Tilt angle cannot be directly measured when the system in consideration is not attached to a stationary frame. It is usually estimated through indirect measurements in such systems. The precision of…

    Submitted 1 February, 2022; originally announced February 2022.

  42. arXiv:2112.13338  [pdf, other]

    cs.CR cs.LG

    MPCLeague: Robust MPC Platform for Privacy-Preserving Machine Learning

    Authors: Ajith Suresh

    Abstract: In the modern era of computing, machine learning tools have demonstrated their potential in vital sectors, such as healthcare and finance, to derive proper inferences. The sensitive and confidential nature of the data in such sectors raises genuine concerns for data privacy. This motivated the area of Privacy-preserving Machine Learning (PPML), where privacy of data is guaranteed. In this thesis,…

    Submitted 26 December, 2021; originally announced December 2021.

    Comments: PhD thesis

  43. arXiv:2111.05320  [pdf, ps, other]

    cs.DS cs.IT math.ST stat.ML

    Robust Estimation for Random Graphs

    Authors: Jayadev Acharya, Ayush Jain, Gautam Kamath, Ananda Theertha Suresh, Huanyu Zhang

    Abstract: We study the problem of robustly estimating the parameter $p$ of an Erdős-Rényi random graph on $n$ nodes, where a $γ$ fraction of nodes may be adversarially corrupted. After showing the deficiencies of canonical estimators, we design a computationally-efficient spectral algorithm which estimates $p$ up to accuracy $\tilde O(\sqrt{p(1-p)}/n + γ\sqrt{p(1-p)} /\sqrt{n}+ γ/n)$ for $γ< 1/60$. Furtherm… ▽ More

    Submitted 15 February, 2022; v1 submitted 9 November, 2021; originally announced November 2021.

  44. arXiv:2110.15440  [pdf, other

    cs.CR cs.LG

    HD-cos Networks: Efficient Neural Architectures for Secure Multi-Party Computation

    Authors: Wittawat Jitkrittum, Michal Lukasik, Ananda Theertha Suresh, Felix Yu, Gang Wang

    Abstract: Multi-party computation (MPC) is a branch of cryptography where multiple non-colluding parties execute a well-designed protocol to securely compute a function. With the non-colluding party assumption, MPC has a cryptographic guarantee that the parties will not learn sensitive information from the computation process, making it an appealing framework for applications that involve privacy-sensitive… ▽ More

    Submitted 28 October, 2021; originally announced October 2021.

  45. arXiv:2109.04570  [pdf, other

    eess.SY cs.RO

    Risk-perception-aware control design under dynamic spatial risks

    Authors: Aamodh Suresh, Sonia Martinez

    Abstract: This work proposes a novel risk-perception-aware (RPA) control design using non-rational perception of risks associated with uncertain dynamic spatial costs. We use Cumulative Prospect Theory (CPT) to model the risk perception of a decision maker (DM) and use it to construct perceived risk functions that transform the uncertain dynamic spatial cost to deterministic perceived risks of a DM. These r… ▽ More

    Submitted 9 September, 2021; originally announced September 2021.

    Comments: 8 pages

  46. arXiv:2108.02117  [pdf, other

    cs.LG

    FedJAX: Federated learning simulation with JAX

    Authors: Jae Hun Ro, Ananda Theertha Suresh, Ke Wu

    Abstract: Federated learning is a machine learning technique that enables training across decentralized data. Recently, federated learning has become an active area of research due to an increased focus on privacy and security. In light of this, a variety of open source federated learning libraries have been developed and released. We introduce FedJAX, a JAX-based open source library for federated learning… ▽ More

    Submitted 5 November, 2021; v1 submitted 4 August, 2021; originally announced August 2021.

  47. arXiv:2107.06917  [pdf, other

    cs.LG

    A Field Guide to Federated Optimization

    Authors: Jianyu Wang, Zachary Charles, Zheng Xu, Gauri Joshi, H. Brendan McMahan, Blaise Aguera y Arcas, Maruan Al-Shedivat, Galen Andrew, Salman Avestimehr, Katharine Daly, Deepesh Data, Suhas Diggavi, Hubert Eichner, Advait Gadhikar, Zachary Garrett, Antonious M. Girgis, Filip Hanzely, Andrew Hard, Chaoyang He, Samuel Horvath, Zhouyuan Huo, Alex Ingerman, Martin Jaggi, Tara Javidi, Peter Kairouz , et al. (28 additional authors not shown)

    Abstract: Federated learning and analytics are a distributed approach for collaboratively learning models (or statistics) from decentralized data, motivated by and designed for privacy protection. The distributed learning process can be formulated as solving federated optimization problems, which emphasize communication efficiency, data heterogeneity, compatibility with privacy and system requirements, and… ▽ More

    Submitted 14 July, 2021; originally announced July 2021.

  48. arXiv:2106.10370  [pdf, other

    stat.ML cs.AI cs.LG

    On the benefits of maximum likelihood estimation for Regression and Forecasting

    Authors: Pranjal Awasthi, Abhimanyu Das, Rajat Sen, Ananda Theertha Suresh

    Abstract: We advocate for a practical Maximum Likelihood Estimation (MLE) approach towards designing loss functions for regression and forecasting, as an alternative to the typical approach of direct empirical risk minimization on a specific target metric. The MLE approach is better suited to capture inductive biases such as prior domain knowledge in datasets, and can output post-hoc estimators at inference… ▽ More

    Submitted 9 October, 2021; v1 submitted 18 June, 2021; originally announced June 2021.

  49. Tetrad: Actively Secure 4PC for Secure Training and Inference

    Authors: Nishat Koti, Arpita Patra, Rahul Rachuri, Ajith Suresh

    Abstract: Mixing arithmetic and boolean circuits to perform privacy-preserving machine learning has become increasingly popular. Towards this, we propose a framework for the case of four parties with at most one active corruption called Tetrad. Tetrad works over rings and supports two levels of security, fairness and robustness. The fair multiplication protocol costs 5 ring elements, improving over the st… ▽ More

    Submitted 15 February, 2022; v1 submitted 5 June, 2021; originally announced June 2021.

  50. arXiv:2105.07949  [pdf, other

    cs.CY cs.CL

    Using Transformers to Provide Teachers with Personalized Feedback on their Classroom Discourse: The TalkMoves Application

    Authors: Abhijit Suresh, Jennifer Jacobs, Vivian Lai, Chenhao Tan, Wayne Ward, James H. Martin, Tamara Sumner

    Abstract: TalkMoves is an innovative application designed to support K-12 mathematics teachers to reflect on, and continuously improve their instructional practices. This application combines state-of-the-art natural language processing capabilities with automated speech recognition to automatically analyze classroom recordings and provide teachers with personalized feedback on their use of specific types o… ▽ More

    Submitted 29 April, 2021; originally announced May 2021.

    Comments: Presented at the AAAI 2021 Spring Symposium on Artificial Intelligence for K-12 Education