Showing 1–19 of 19 results for author: Ramaswamy, H

  1. arXiv:2406.13154  [pdf, other]

    stat.ML cs.AI cs.LG

    Conditional score-based diffusion models for solving inverse problems in mechanics

    Authors: Agnimitra Dasgupta, Harisankar Ramaswamy, Javier Murgoitio Esandi, Ken Foo, Runze Li, Qifa Zhou, Brendan Kennedy, Assad Oberai

    Abstract: We propose a framework to perform Bayesian inference using conditional score-based diffusion models to solve a class of inverse problems in mechanics involving the inference of a specimen's spatially varying material properties from noisy measurements of its mechanical response to loading. Conditional score-based diffusion models are generative models that learn to approximate the score function o…

    Submitted 21 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  2. arXiv:2404.04312  [pdf, other]

    cs.LG cs.AI cs.NE

    Half-Space Feature Learning in Neural Networks

    Authors: Mahesh Lorik Yadav, Harish Guruprasad Ramaswamy, Chandrashekar Lakshminarayanan

    Abstract: There currently exist two extreme viewpoints for neural network feature learning -- (i) Neural networks simply implement a kernel method (a la NTK) and hence no features are learned (ii) Neural networks can represent (and hence learn) intricate hierarchical features suitable for the data. We argue in this paper neither interpretation is likely to be correct based on a novel viewpoint. Neural netwo…

    Submitted 5 April, 2024; originally announced April 2024.

  3. On the Learning Dynamics of Attention Networks

    Authors: Rahul Vashisht, Harish G. Ramaswamy

    Abstract: Attention models are typically learned by optimizing one of three standard loss functions that are variously called -- soft attention, hard attention, and latent variable marginal likelihood (LVML) attention. All three paradigms are motivated by the same goal of finding two models -- a 'focus' model that 'selects' the right segment of the input and a 'classification' model that processes…

    Submitted 12 October, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

    Comments: Proceedings at ECAI-2023 IOS Press

  4. arXiv:2212.14776  [pdf, ps, other]

    cs.LG

    On the Interpretability of Attention Networks

    Authors: Lakshmi Narayan Pandey, Rahul Vashisht, Harish G. Ramaswamy

    Abstract: Attention mechanisms form a core component of several successful deep learning architectures, and are based on one key idea: "The output depends only on a small (but unknown) segment of the input." In several practical applications like image captioning and language translation, this is mostly true. In trained models with an attention mechanism, the outputs of an intermediate module that encodes…

    Submitted 14 May, 2023; v1 submitted 30 December, 2022; originally announced December 2022.

    Comments: ACML 2022, PMLR, Volume 189, https://proceedings.mlr.press/v189/pandey23a/pandey23a.pdf

    Journal ref: Proceedings of The 14th Asian Conference on Machine Learning, 832--847, 2023, Volume:189; PMLR

  5. arXiv:2210.09695  [pdf, other]

    stat.ML cs.LG

    Consistent Multiclass Algorithms for Complex Metrics and Constraints

    Authors: Harikrishna Narasimhan, Harish G. Ramaswamy, Shiv Kumar Tavker, Drona Khurana, Praneeth Netrapalli, Shivani Agarwal

    Abstract: We present consistent algorithms for multiclass learning with complex performance metrics and constraints, where the objective and constraints are defined by arbitrary functions of the confusion matrix. This setting includes many common performance metrics such as the multiclass G-mean and micro F1-measure, and constraints such as those on the classifier's precision and recall and more recent meas…

    Submitted 18 October, 2022; v1 submitted 18 October, 2022; originally announced October 2022.

  6. arXiv:2204.07899  [pdf, ps, other]

    cs.HC

    QTBIPOC PD: Exploring the Intersections of Race, Gender, and Sexual Orientation in Participatory Design

    Authors: Naba Rizvi, Reggie Casanova-Perez, Harshini Ramaswamy, Emily Bascom, Lisa Dirks, Nadir Weibel

    Abstract: As Human-Computer Interaction (HCI) research aims to be inclusive and representative of many marginalized identities, there is still a lack of available literature and research on intersectional considerations of race, gender, and sexual orientation, especially when it comes to participatory design. We aim to create a space to generate community recommendations for effectively and appropriately en…

    Submitted 16 April, 2022; originally announced April 2022.

    Comments: 6 pages, 2 tables, CHI 2022 Workshop

  7. arXiv:2204.07897  [pdf, other]

    cs.HC

    Making Hidden Bias Visible: Designing a Feedback Ecosystem for Primary Care Providers

    Authors: Naba Rizvi, Harshini Ramaswamy, Reggie Casanova-Perez, Andrea Hartzler, Nadir Weibel

    Abstract: Implicit bias may perpetuate healthcare disparities for marginalized patient populations. Such bias is expressed in communication between patients and their providers. We design an ecosystem with guidance from providers to make this bias explicit in patient-provider communication. Our end users are providers seeking to improve their quality of care for patients who are Black, Indigenous, People of…

    Submitted 16 April, 2022; originally announced April 2022.

    Comments: 6 pages, 2 figures, 2 tables, CHI 2022 Workshop Publication (Complex Health Ecosystems)

  8. arXiv:2202.07773  [pdf, other]

    stat.ML cs.LG

    The efficacy and generalizability of conditional GANs for posterior inference in physics-based inverse problems

    Authors: Deep Ray, Harisankar Ramaswamy, Dhruv V. Patel, Assad A. Oberai

    Abstract: In this work, we train conditional Wasserstein generative adversarial networks to effectively sample from the posterior of physics-based Bayesian inference problems. The generator is constructed using a U-Net architecture, with the latent information injected using conditional instance normalization. The former facilitates a multiscale inverse map, while the latter enables the decoupling of the la…

    Submitted 17 November, 2022; v1 submitted 15 February, 2022; originally announced February 2022.

    MSC Class: 62F15; 68T07; 65M32

  9. arXiv:2111.13075  [pdf, other]

    cs.LG

    Predicting the success of Gradient Descent for a particular Dataset-Architecture-Initialization (DAI)

    Authors: Umangi Jain, Harish G. Ramaswamy

    Abstract: Despite their massive success, training successful deep neural networks still largely relies on experimentally choosing an architecture, hyper-parameters, initialization, and training mechanism. In this work, we focus on determining the success of standard gradient descent method for training deep neural networks on a specified dataset, architecture, and initialization (DAI) combination. Through e…

    Submitted 25 November, 2021; originally announced November 2021.

    Comments: 10 pages, 9 figures

  10. arXiv:2103.02648  [pdf, other]

    q-bio.PE

    The Effect of Super-spreader Events in Epidemics

    Authors: Harisankar Ramaswamy, Assad A Oberai, Mitul Luhar, Yannis C Yortsos

    Abstract: The spread of infectious epidemics is often accelerated by super-spreader events. Understanding their effect is important, particularly in the context of standard epidemiological models, which require estimates for parameters such as $R_0$. In this letter, we show that the effective value of $R_0$ in super-spreader situations is significantly large, of the order of hundreds, suggesting a delta-fun…

    Submitted 26 March, 2021; v1 submitted 3 March, 2021; originally announced March 2021.

    MSC Class: 92-02 ACM Class: I.6.3

  11. arXiv:2012.08854  [pdf, ps, other]

    cs.LG stat.ML

    Using noise resilience for ranking generalization of deep neural networks

    Authors: Depen Morwani, Rahul Vashisht, Harish G. Ramaswamy

    Abstract: Recent papers have shown that sufficiently overparameterized neural networks can perfectly fit even random labels. Thus, it is crucial to understand the underlying reason behind the generalization performance of a network on real-world data. In this work, we propose several measures to predict the generalization error of a network given the training data and its parameters. Using one of these meas…

    Submitted 16 December, 2020; originally announced December 2020.

    ACM Class: I.5.1

  12. arXiv:2010.12909  [pdf, other]

    cs.LG stat.ML

    Inductive Bias of Gradient Descent for Weight Normalized Smooth Homogeneous Neural Nets

    Authors: Depen Morwani, Harish G. Ramaswamy

    Abstract: We analyze the inductive bias of gradient descent for weight normalized smooth homogeneous neural nets, when trained on exponential or cross-entropy loss. We analyze both standard weight normalization (SWN) and exponential weight normalization (EWN), and show that the gradient flow path with EWN is equivalent to gradient flow on standard networks with an adaptive learning rate. We extend these res…

    Submitted 31 January, 2023; v1 submitted 24 October, 2020; originally announced October 2020.

    Comments: Accepted to ALT 2022

    ACM Class: I.5.1; I.2.6

  13. arXiv:2009.07801  [pdf, other]

    stat.ML cs.LG

    Convex Calibrated Surrogates for the Multi-Label F-Measure

    Authors: Mingyuan Zhang, Harish G. Ramaswamy, Shivani Agarwal

    Abstract: The F-measure is a widely used performance measure for multi-label classification, where multiple labels can be active in an instance simultaneously (e.g. in image tagging, multiple tags can be active in any image). In particular, the F-measure explicitly balances recall (fraction of active labels predicted to be active) and precision (fraction of labels predicted to be active that are actually so…

    Submitted 16 September, 2020; originally announced September 2020.

    Comments: Accepted to ICML 2020

  14. arXiv:2008.12766  [pdf, other]

    q-bio.PE q-bio.QM

    A comprehensive spatial-temporal infection model

    Authors: Harisankar Ramaswamy, Assad A Oberai, Yannis C Yortsos

    Abstract: Motivated by analogies between the spreading of human-to-human infections and of chemical processes, we develop a comprehensive model that accounts both for infection and for transport. In this analogy, the three different populations of infection models correspond to three chemical species. Areal densities emerge as the key variables, thus capturing the effect of spatial density. We derive expres…

    Submitted 4 December, 2020; v1 submitted 28 August, 2020; originally announced August 2020.

    MSC Class: 35Q92

  15. arXiv:1810.11975  [pdf, other]

    cs.LG cs.CL stat.ML

    On Controllable Sparse Alternatives to Softmax

    Authors: Anirban Laha, Saneem A. Chemmengath, Priyanka Agrawal, Mitesh M. Khapra, Karthik Sankaranarayanan, Harish G. Ramaswamy

    Abstract: Converting an n-dimensional vector to a probability distribution over n objects is a commonly used component in many machine learning tasks like multiclass classification, multilabel classification, attention mechanisms etc. For this, several probability mapping functions have been proposed and employed in literature such as softmax, sum-normalization, spherical softmax, and sparsemax, but there i…

    Submitted 30 October, 2018; v1 submitted 29 October, 2018; originally announced October 2018.

    Comments: To appear in NIPS 2018, Total 16 pages including appendix

  16. arXiv:1603.02501  [pdf, other]

    cs.LG stat.ML

    Mixture Proportion Estimation via Kernel Embedding of Distributions

    Authors: Harish G. Ramaswamy, Clayton Scott, Ambuj Tewari

    Abstract: Mixture proportion estimation (MPE) is the problem of estimating the weight of a component distribution in a mixture, given samples from the mixture and component. This problem constitutes a key part in many "weakly supervised learning" problems like learning with positive and unlabelled samples, learning with label noise, anomaly detection and crowdsourcing. While there have been several methods…

    Submitted 31 May, 2016; v1 submitted 8 March, 2016; originally announced March 2016.

  17. arXiv:1505.04137  [pdf, other]

    cs.LG stat.ML

    Consistent Algorithms for Multiclass Classification with a Reject Option

    Authors: Harish G. Ramaswamy, Ambuj Tewari, Shivani Agarwal

    Abstract: We consider the problem of $n$-class classification ($n\geq 2$), where the classifier can choose to abstain from making predictions at a given cost, say, a factor $α$ of the cost of misclassification. Designing consistent algorithms for such $n$-class classification problems with a 'reject option' is the main goal of this paper, thereby extending and generalizing previously known results for…

    Submitted 15 May, 2015; originally announced May 2015.

  18. arXiv:1501.00287  [pdf, ps, other]

    cs.LG stat.ML

    Consistent Classification Algorithms for Multi-class Non-Decomposable Performance Metrics

    Authors: Harish G. Ramaswamy, Harikrishna Narasimhan, Shivani Agarwal

    Abstract: We study consistency of learning algorithms for a multi-class performance metric that is a non-decomposable function of the confusion matrix of a classifier and cannot be expressed as a sum of losses on individual data points; examples of such performance metrics include the macro F-measure popular in information retrieval and the G-mean metric used in class-imbalanced problems. While there has be…

    Submitted 1 January, 2015; originally announced January 2015.

  19. arXiv:1408.2764  [pdf, other]

    cs.LG stat.ML

    Convex Calibration Dimension for Multiclass Loss Matrices

    Authors: Harish G. Ramaswamy, Shivani Agarwal

    Abstract: We study consistency properties of surrogate loss functions for general multiclass learning problems, defined by a general multiclass loss matrix. We extend the notion of classification calibration, which has been studied for binary and multiclass 0-1 classification problems (and for certain other specific learning problems), to the general multiclass setting, and derive necessary and sufficient c…

    Submitted 23 August, 2015; v1 submitted 12 August, 2014; originally announced August 2014.

    Comments: Accepted to JMLR, pending editing