
Showing 1–20 of 20 results for author: Das, R

Searching in archive stat.
  1. arXiv:2406.11206  [pdf, other]

    cs.LG cs.CR stat.ML

    Retraining with Predicted Hard Labels Provably Increases Model Accuracy

    Authors: Rudrajit Das, Inderjit S. Dhillon, Alessandro Epasto, Adel Javanmard, Jieming Mao, Vahab Mirrokni, Sujay Sanghavi, Peilin Zhong

    Abstract: The performance of a model trained with \textit{noisy labels} is often improved by simply \textit{retraining} the model with its own predicted \textit{hard} labels (i.e., $1$/$0$ labels). Yet, a detailed theoretical characterization of this phenomenon is lacking. In this paper, we theoretically analyze retraining in a linearly separable setting with randomly corrupted labels given to us and prove…

    Submitted 17 June, 2024; originally announced June 2024.

  2. arXiv:2402.09020  [pdf, ps, other]

    stat.AP

    Bayesian reliability acceptance sampling plan with optional warranty under hybrid censoring

    Authors: Rathin Das, Biswabrata Pradhan

    Abstract: This work considers design of Bayesian reliability acceptance sampling plan (RASP) under hybrid censored life test for the products sold under optional warranty. The consumer and manufacturer agree on a common lifetime distribution of the product. However, they differ in the assessment of the prior distributions because of the adversarial nature of the consumer and manufacturer. The consumer takes…

    Submitted 14 February, 2024; originally announced February 2024.

  3. arXiv:2402.07114  [pdf, other]

    cs.LG math.NA math.OC stat.ML

    Towards Quantifying the Preconditioning Effect of Adam

    Authors: Rudrajit Das, Naman Agarwal, Sujay Sanghavi, Inderjit S. Dhillon

    Abstract: There is a notable dearth of results characterizing the preconditioning effect of Adam and showing how it may alleviate the curse of ill-conditioning -- an issue plaguing gradient descent (GD). In this work, we perform a detailed analysis of Adam's preconditioning effect for quadratic functions and quantify to what extent Adam can mitigate the dependence on the condition number of the Hessian. Our…

    Submitted 11 February, 2024; originally announced February 2024.

  4. arXiv:2402.07052  [pdf, other]

    cs.LG stat.ML

    Understanding the Training Speedup from Sampling with Approximate Losses

    Authors: Rudrajit Das, Xi Chen, Bertram Ieong, Parikshit Bansal, Sujay Sanghavi

    Abstract: It is well known that selecting samples with large losses/gradients can significantly reduce the number of training steps. However, the selection overhead is often too high to yield any meaningful gains in terms of overall training time. In this work, we focus on the greedy approach of selecting samples with large \textit{approximate losses} instead of exact losses in order to reduce the selection…

    Submitted 10 February, 2024; originally announced February 2024.

  5. arXiv:2310.12145  [pdf, other]

    cs.LG cs.AI cs.CY stat.ML

    Fairer and More Accurate Tabular Models Through NAS

    Authors: Richeek Das, Samuel Dooley

    Abstract: Making models algorithmically fairer in tabular data has been long studied, with techniques typically oriented towards fixes which usually take a neural model with an undesirable outcome and make changes to how the data are ingested, what the model weights are, or how outputs are processed. We employ an emergent and different strategy where we consider updating the model's architecture and trainin…

    Submitted 18 October, 2023; originally announced October 2023.

  6. arXiv:2301.13304  [pdf, other]

    cs.LG cs.AI stat.ML

    Understanding Self-Distillation in the Presence of Label Noise

    Authors: Rudrajit Das, Sujay Sanghavi

    Abstract: Self-distillation (SD) is the process of first training a \enquote{teacher} model and then using its predictions to train a \enquote{student} model with the \textit{same} architecture. Specifically, the student's objective function is $\big(ξ*\ell(\text{teacher's predictions}, \text{ student's predictions}) + (1-ξ)*\ell(\text{given labels}, \text{ student's predictions})\big)$, where $\ell$ is som…

    Submitted 30 January, 2023; originally announced January 2023.

  7. Modelling and classifying joint trajectories of self-reported mood and pain in a large cohort study

    Authors: Rajenki Das, Mark Muldoon, Mark Lunt, John McBeth, Belay Birlie Yimer, Thomas House

    Abstract: It is well-known that mood and pain interact with each other, however individual-level variability in this relationship has been less well quantified than overall associations between low mood and pain. Here, we leverage the possibilities presented by mobile health data, in particular the "Cloudy with a Chance of Pain" study, which collected longitudinal data from the residents of the UK with chro…

    Submitted 1 January, 2024; v1 submitted 30 September, 2022; originally announced September 2022.

  8. arXiv:2206.10713  [pdf, other]

    cs.LG stat.ML

    Beyond Uniform Lipschitz Condition in Differentially Private Optimization

    Authors: Rudrajit Das, Satyen Kale, Zheng Xu, Tong Zhang, Sujay Sanghavi

    Abstract: Most prior results on differentially private stochastic gradient descent (DP-SGD) are derived under the simplistic assumption of uniform Lipschitzness, i.e., the per-sample gradients are uniformly bounded. We generalize uniform Lipschitzness by assuming that the per-sample gradients have sample-dependent upper bounds, i.e., per-sample Lipschitz constants, which themselves may be unbounded. We prov…

    Submitted 5 June, 2023; v1 submitted 21 June, 2022; originally announced June 2022.

    Comments: To appear in ICML 2023

  9. arXiv:2111.05728  [pdf, other]

    stat.AP q-bio.PE

    Diversity of symptom phenotypes in SARS-CoV-2 community infections observed in multiple large datasets

    Authors: Martyn Fyles, Karina-Doris Vihta, Carole H Sudre, Harry Long, Rajenki Das, Caroline Jay, Tom Wingfield, Fergus Cumming, William Green, Pantelis Hadjipantelis, Joni Kirk, Claire J Steves, Sebastien Ourselin, Graham F Medley, Elizabeth Fearon, Thomas House

    Abstract: Through the use of cutting-edge unsupervised classification techniques from statistics and machine learning, we characterise symptom phenotypes among symptomatic SARS-CoV-2 PCR-positive community cases. We first analyse each dataset in isolation and across age bands, before using methods that allow us to compare multiple datasets. While we observe separation due to the total number of symptoms exp…

    Submitted 20 November, 2023; v1 submitted 10 November, 2021; originally announced November 2021.

    Comments: 60 pages; 29 figures

    MSC Class: 62P10

  10. arXiv:2110.07531  [pdf]

    stat.ML cs.LG physics.bio-ph q-bio.BM

    Deep learning models for predicting RNA degradation via dual crowdsourcing

    Authors: Hannah K. Wayment-Steele, Wipapat Kladwang, Andrew M. Watkins, Do Soon Kim, Bojan Tunguz, Walter Reade, Maggie Demkin, Jonathan Romano, Roger Wellington-Oguri, John J. Nicol, Jiayang Gao, Kazuki Onodera, Kazuki Fujikawa, Hanfei Mao, Gilles Vandewiele, Michele Tinti, Bram Steenwinckel, Takuya Ito, Taiga Noumi, Shujun He, Keiichiro Ishi, Youhan Lee, Fatih Öztürk, Anthony Chiu, Emin Öztürk, et al. (4 additional authors not shown)

    Abstract: Messenger RNA-based medicines hold immense potential, as evidenced by their rapid deployment as COVID-19 vaccines. However, worldwide distribution of mRNA molecules has been limited by their thermostability, which is fundamentally limited by the intrinsic instability of RNA molecules to a chemical degradation reaction called in-line hydrolysis. Predicting the degradation of an RNA molecule is a ke…

    Submitted 22 April, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

  11. arXiv:2106.07094  [pdf, other]

    cs.LG cs.DC eess.SP math.OC stat.ML

    On the Convergence of Differentially Private Federated Learning on Non-Lipschitz Objectives, and with Normalized Client Updates

    Authors: Rudrajit Das, Abolfazl Hashemi, Sujay Sanghavi, Inderjit S. Dhillon

    Abstract: There is a dearth of convergence results for differentially private federated learning (FL) with non-Lipschitz objective functions (i.e., when gradient norms are not bounded). The primary reason for this is that the clipping operation (i.e., projection onto an $\ell_2$ ball of a fixed radius called the clipping threshold) for bounding the sensitivity of the average update to each client's update i…

    Submitted 15 April, 2022; v1 submitted 13 June, 2021; originally announced June 2021.

  12. arXiv:2106.03480  [pdf, other]

    stat.ML cs.LG

    A Distance Covariance-based Kernel for Nonlinear Causal Clustering in Heterogeneous Populations

    Authors: Alex Markham, Richeek Das, Moritz Grosse-Wentrup

    Abstract: We consider the problem of causal structure learning in the setting of heterogeneous populations, i.e., populations in which a single causal structure does not adequately represent all population members, as is common in biological and social sciences. To this end, we introduce a distance covariance-based kernel designed specifically to measure the similarity between the underlying nonlinear causa…

    Submitted 18 February, 2022; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: 17 pages, 3 figures; accepted to 1st Conference on Causal Learning and Reasoning (CLeaR 2022)

  13. arXiv:2012.04061  [pdf, other]

    stat.ML cs.DC cs.LG math.OC

    Faster Non-Convex Federated Learning via Global and Local Momentum

    Authors: Rudrajit Das, Anish Acharya, Abolfazl Hashemi, Sujay Sanghavi, Inderjit S. Dhillon, Ufuk Topcu

    Abstract: We propose \texttt{FedGLOMO}, a novel federated learning (FL) algorithm with an iteration complexity of $\mathcal{O}(ε^{-1.5})$ to converge to an $ε$-stationary point (i.e., $\mathbb{E}[\|\nabla f(\bm{x})\|^2] \leq ε$) for smooth non-convex functions -- under arbitrary client heterogeneity and compressed communication -- compared to the $\mathcal{O}(ε^{-2})$ complexity of most prior works. Our key…

    Submitted 24 October, 2021; v1 submitted 7 December, 2020; originally announced December 2020.

  14. arXiv:2011.10643  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    On the Benefits of Multiple Gossip Steps in Communication-Constrained Decentralized Optimization

    Authors: Abolfazl Hashemi, Anish Acharya, Rudrajit Das, Haris Vikalo, Sujay Sanghavi, Inderjit Dhillon

    Abstract: In decentralized optimization, it is common algorithmic practice to have nodes interleave (local) gradient descent iterations with gossip (i.e. averaging over the network) steps. Motivated by the training of large-scale machine learning models, it is also increasingly common to require that messages be {\em lossy compressed} versions of the local parameters. In this paper, we show that, in such co…

    Submitted 20 November, 2020; originally announced November 2020.

  15. arXiv:1909.06930  [pdf, other]

    cs.LG stat.ML

    On the Separability of Classes with the Cross-Entropy Loss Function

    Authors: Rudrajit Das, Subhasis Chaudhuri

    Abstract: In this paper, we focus on the separability of classes with the cross-entropy loss function for classification problems by theoretically analyzing the intra-class distance and inter-class distance (i.e. the distance between any two points belonging to the same class and different classes, respectively) in the feature space, i.e. the space of representations learnt by neural networks. Specifically,…

    Submitted 15 September, 2019; originally announced September 2019.

  16. arXiv:1907.10165  [pdf, other]

    cs.LG cs.CL stat.ML

    Optimal Transport-based Alignment of Learned Character Representations for String Similarity

    Authors: Derek Tam, Nicholas Monath, Ari Kobren, Aaron Traylor, Rajarshi Das, Andrew McCallum

    Abstract: String similarity models are vital for record linkage, entity resolution, and search. In this work, we present STANCE -- a learned model for computing the similarity of two strings. Our approach encodes the characters of each string, aligns the encodings using Sinkhorn Iteration (alignment is posed as an instance of optimal transport) and scores the alignment with a convolutional neural network. W…

    Submitted 23 July, 2019; originally announced July 2019.

    Comments: ACL Long Paper

  17. arXiv:1809.06798  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Generative x-vectors for text-independent speaker verification

    Authors: Longting Xu, Rohan Kumar Das, Emre Yılmaz, Jichen Yang, Haizhou Li

    Abstract: Speaker verification (SV) systems using deep neural network embeddings, so-called the x-vector systems, are becoming popular due to its good performance superior to the i-vector systems. The fusion of these systems provides improved performance benefiting both from the discriminatively trained x-vectors and generative i-vectors capturing distinct speaker characteristics. In this paper, we propose…

    Submitted 17 September, 2018; originally announced September 2018.

    Comments: Accepted for publication at SLT 2018

  18. arXiv:1809.02497  [pdf, other]

    cs.LG stat.ML

    Sparse Kernel PCA for Outlier Detection

    Authors: Rudrajit Das, Aditya Golatkar, Suyash P. Awate

    Abstract: In this paper, we propose a new method to perform Sparse Kernel Principal Component Analysis (SKPCA) and also mathematically analyze the validity of SKPCA. We formulate SKPCA as a constrained optimization problem with elastic net regularization (Hastie et al.) in kernel feature space and solve it. We consider outlier detection (where KPCA is employed) as an application for SKPCA, using the RBF ker…

    Submitted 13 September, 2018; v1 submitted 7 September, 2018; originally announced September 2018.

    Comments: Accepted at IEEE ICMLA 2018 for Oral Presentation

  19. arXiv:1803.03146  [pdf]

    q-bio.QM cs.AI stat.ML

    SentRNA: Improving computational RNA design by incorporating a prior of human design strategies

    Authors: Jade Shi, Rhiju Das, Vijay S. Pande

    Abstract: Solving the RNA inverse folding problem is a critical prerequisite to RNA design, an emerging field in bioengineering with a broad range of applications from reaction catalysis to cancer therapy. Although significant progress has been made in developing machine-based inverse RNA folding algorithms, current approaches still have difficulty designing sequences for large or complex targets. On the ot…

    Submitted 5 March, 2019; v1 submitted 8 March, 2018; originally announced March 2018.

    Comments: 27 pages (not including Supplementary Information), 9 figures, 7 tables

  20. arXiv:1703.07915  [pdf, other]

    stat.ML cond-mat.dis-nn cs.CV cs.LG hep-th

    Perspective: Energy Landscapes for Machine Learning

    Authors: Andrew J. Ballard, Ritankar Das, Stefano Martiniani, Dhagash Mehta, Levent Sagun, Jacob D. Stevenson, David J. Wales

    Abstract: Machine learning techniques are being increasingly used as flexible non-linear fitting and prediction tools in the physical sciences. Fitting functions that exhibit multiple solutions as local minima can be analysed in terms of the corresponding machine learning landscape. Methods to explore and visualise molecular potential energy landscapes can be applied to these machine learning landscapes to…

    Submitted 22 March, 2017; originally announced March 2017.

    Comments: 41 pages, 25 figures. Accepted for publication in Physical Chemistry Chemical Physics, 2017