
Showing 1–14 of 14 results for author: Maity, S

  1. arXiv:2405.15172  [pdf, other]

    stat.ML cs.LG

    Learning the Distribution Map in Reverse Causal Performative Prediction

    Authors: Daniele Bracale, Subha Maity, Moulinath Banerjee, Yuekai Sun

    Abstract: In numerous predictive scenarios, the predictive model affects the sampling distribution; for example, job applicants often meticulously craft their resumes to navigate through a screening system. Such shifts in distribution are particularly prevalent in the realm of social computing, yet the strategies to learn these shifts from data remain remarkably limited. Inspired by a microeconomic model…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 17 pages, 4 figures

  2. arXiv:2312.04601  [pdf, other]

    stat.ML cs.AI cs.LG stat.ME

    Estimating Fréchet bounds for validating programmatic weak supervision

    Authors: Felipe Maia Polo, Mikhail Yurochkin, Moulinath Banerjee, Subha Maity, Yuekai Sun

    Abstract: We develop methods for estimating Fréchet bounds on (possibly high-dimensional) distribution classes in which some variables are continuous-valued. We establish the statistical correctness of the computed bounds under uncertainty in the marginal constraints and demonstrate the usefulness of our algorithms by evaluating the performance of machine learning (ML) models trained with programmatic weak…

    Submitted 7 December, 2023; originally announced December 2023.

  3. arXiv:2310.01583  [pdf, other]

    stat.ML cs.LG

    An Investigation of Representation and Allocation Harms in Contrastive Learning

    Authors: Subha Maity, Mayank Agarwal, Mikhail Yurochkin, Yuekai Sun

    Abstract: The effect of underrepresentation on the performance of minority groups is known to be a serious problem in supervised learning settings; however, it has been underexplored so far in the context of self-supervised learning (SSL). In this paper, we demonstrate that contrastive learning (CL), a popular variant of SSL, tends to collapse representations of minority groups with certain majority groups.…

    Submitted 2 October, 2023; originally announced October 2023.

  4. arXiv:2304.06574  [pdf, other]

    stat.ML cs.LG

    Bayes classifier cannot be learned from noisy responses with unknown noise rates

    Authors: Soham Bakshi, Subha Maity

    Abstract: Training a classifier with noisy labels typically requires the learner to specify the distribution of label noise, which is often unknown in practice. Although there have been some recent attempts to relax that requirement, we show that the Bayes decision rule is unidentified in most classification problems with noisy labels. This suggests it is generally not possible to bypass/relax the requireme…

    Submitted 13 April, 2023; originally announced April 2023.

    Comments: Invited to present in ICLR Tiny Paper 2023

  5. arXiv:2302.09795  [pdf, other]

    cs.LG cs.CV stat.ML

    Simple Disentanglement of Style and Content in Visual Representations

    Authors: Lilian Ngweta, Subha Maity, Alex Gittens, Yuekai Sun, Mikhail Yurochkin

    Abstract: Learning visual representations with interpretable features, i.e., disentangled representations, remains a challenging problem. Existing methods demonstrate some success but are hard to apply to large-scale vision datasets like ImageNet. In this work, we propose a simple post-processing framework to disentangle content and style in learned representations from pre-trained vision models. We model t…

    Submitted 31 May, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: International Conference on Machine Learning (ICML) 2023

  6. arXiv:2205.13577  [pdf, other]

    cs.LG stat.ME stat.ML

    Understanding new tasks through the lens of training data via exponential tilting

    Authors: Subha Maity, Mikhail Yurochkin, Moulinath Banerjee, Yuekai Sun

    Abstract: Deploying machine learning models to new tasks is a major challenge despite the large size of the modern training datasets. However, it is conceivable that the training data can be reweighted to be more representative of the new (target) task. We consider the problem of reweighing the training samples to gain insights into the distribution of the target task. Specifically, we formulate a distribut…

    Submitted 21 February, 2023; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: Accepted in ICLR 2023

  7. arXiv:2205.13575  [pdf, other]

    cs.LG stat.CO

    Predictor-corrector algorithms for stochastic optimization under gradual distribution shift

    Authors: Subha Maity, Debarghya Mukherjee, Moulinath Banerjee, Yuekai Sun

    Abstract: Time-varying stochastic optimization problems frequently arise in machine learning practice (e.g. gradual domain shift, object tracking, strategic classification). Although most problems are solved in discrete time, the underlying process is often continuous in nature. We exploit this underlying continuity by developing predictor-corrector algorithms for time-varying stochastic optimizations. We p…

    Submitted 23 February, 2023; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: Accepted in ICLR 2023

  8. arXiv:2111.10841  [pdf, other]

    stat.ME

    A linear adjustment based approach to posterior drift in transfer learning

    Authors: Subha Maity, Diptavo Dutta, Jonathan Terhorst, Yuekai Sun, Moulinath Banerjee

    Abstract: We present a new model and methods for the posterior drift problem where the regression function in the target domain is modeled as a linear adjustment (on an appropriate scale) of that in the source domain, an idea that inherits the simplicity and the usefulness of generalized linear models and accelerated failure time models from the classical statistics literature, and study the theoretical pro…

    Submitted 12 December, 2021; v1 submitted 21 November, 2021; originally announced November 2021.

  9. arXiv:2103.16714  [pdf, other]

    stat.ML cs.LG

    Statistical inference for individual fairness

    Authors: Subha Maity, Songkai Xue, Mikhail Yurochkin, Yuekai Sun

    Abstract: As we rely on machine learning (ML) models to make more consequential decisions, the issue of ML models perpetuating or even exacerbating undesirable historical biases (e.g., gender and racial biases) has come to the fore of the public's attention. In this paper, we focus on the problem of detecting violations of individual fairness in ML models. We formalize the problem as measuring the susceptib…

    Submitted 30 March, 2021; originally announced March 2021.

  10. arXiv:2011.03173  [pdf, other]

    stat.ML cs.LG

    Does enforcing fairness mitigate biases caused by subpopulation shift?

    Authors: Subha Maity, Debarghya Mukherjee, Mikhail Yurochkin, Yuekai Sun

    Abstract: Many instances of algorithmic bias are caused by subpopulation shifts. For example, ML models often perform worse on demographic groups that are underrepresented in the training data. In this paper, we study whether enforcing algorithmic fairness during training improves the performance of the trained model in the target domain. On one hand, we conceive scenarios in which enforcing fairness…

    Submitted 26 October, 2021; v1 submitted 5 November, 2020; originally announced November 2020.

  11. arXiv:2003.10443  [pdf, other]

    math.ST stat.ML

    Minimax optimal approaches to the label shift problem in non-parametric settings

    Authors: Subha Maity, Yuekai Sun, Moulinath Banerjee

    Abstract: We study the minimax rates of the label shift problem in non-parametric classification. In addition to the unsupervised setting in which the learner only has access to unlabeled examples from the target domain, we also consider the setting in which a small number of labeled examples from the target domain is available to the learner. Our study reveals a difference in the difficulty of the label sh…

    Submitted 22 November, 2022; v1 submitted 23 March, 2020; originally announced March 2020.

  12. arXiv:1912.11928  [pdf, other]

    stat.ME stat.ML

    Meta-analysis of heterogeneous data: integrative sparse regression in high-dimensions

    Authors: Subha Maity, Yuekai Sun, Moulinath Banerjee

    Abstract: We consider the task of meta-analysis in high-dimensional settings in which the data sources are similar but non-identical. To borrow strength across such heterogeneous datasets, we introduce a global parameter that emphasizes interpretability and statistical efficiency in the presence of heterogeneity. We also propose a one-shot estimator of the global parameter that preserves the anonymity of th…

    Submitted 30 June, 2022; v1 submitted 26 December, 2019; originally announced December 2019.

  13. arXiv:1906.09769  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Fault Matters: Sensor Data Fusion for Detection of Faults using Dempster-Shafer Theory of Evidence in IoT-Based Applications

    Authors: Nimisha Ghosh, Rourab Paul, Satyabrata Maity, Krishanu Maity, Sayantan Saha

    Abstract: Fault detection in sensor nodes is a pertinent issue that has been an important area of research for a very long time, but it has not yet been explored much in the context of the Internet of Things. The Internet of Things works with a massive amount of data, so the responsibility for guaranteeing the accuracy of the data also lies with it. Moreover, a lot of important and critical decisions are made based on t…

    Submitted 24 June, 2019; originally announced June 2019.

  14. arXiv:1203.2511  [pdf]

    cs.LG cs.CE cs.NI eess.SY stat.AP

    A Simple Flood Forecasting Scheme Using Wireless Sensor Networks

    Authors: Victor Seal, Arnab Raha, Shovan Maity, Souvik Kr Mitra, Amitava Mukherjee, Mrinal Kanti Naskar

    Abstract: This paper presents a forecasting model designed using WSNs (Wireless Sensor Networks) to predict flood in rivers using simple and fast calculations to provide real-time results and save the lives of people who may be affected by the flood. Our prediction model uses multiple variable robust linear regression which is easy to understand and simple and cost effective in implementation, is speed effi…

    Submitted 9 March, 2012; originally announced March 2012.

    Comments: 16 pages, 4 figures, published in International Journal Of Ad-Hoc, Sensor And Ubiquitous Computing, February 2012; V. seal et al, 'A Simple Flood Forecasting Scheme Using Wireless Sensor Networks', IJASUC, Feb.2012