Showing 1–50 of 100 results for author: Thiagarajan, J

  1. arXiv:2407.00356  [pdf, other]

    cs.LG cs.CV

    Enhancing Accuracy and Parameter-Efficiency of Neural Representations for Network Parameterization

    Authors: Hongjun Choi, Jayaraman J. Thiagarajan, Ruben Glatt, Shusen Liu

    Abstract: In this work, we investigate the fundamental trade-off regarding accuracy and parameter efficiency in the parameterization of neural network weights using predictor networks. We present a surprising finding that, when recovering the original model accuracy is the sole objective, it can be achieved effectively through the weight reconstruction objective alone. Additionally, we explore the underlyin…

    Submitted 29 June, 2024; originally announced July 2024.

  2. arXiv:2406.17117  [pdf, other]

    cs.CV

    Speeding Up Image Classifiers with Little Companions

    Authors: Yang Liu, Kowshik Thopalli, Jayaraman Thiagarajan

    Abstract: Scaling up neural networks has been a key recipe to the success of large language and vision models. However, in practice, up-scaled models can be disproportionately costly in terms of computations, providing only marginal improvements in performance; for example, EfficientViT-L3-384 achieves <2% improvement on ImageNet-1K accuracy over the base L1-224 model, while requiring $14\times$ more multip…

    Submitted 26 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

  3. arXiv:2406.00529  [pdf, other]

    cs.LG cs.CV stat.ML

    On the Use of Anchoring for Training Vision Models

    Authors: Vivek Narayanaswamy, Kowshik Thopalli, Rushil Anirudh, Yamen Mubarka, Wesam Sakla, Jayaraman J. Thiagarajan

    Abstract: Anchoring is a recent, architecture-agnostic principle for training deep neural networks that has been shown to significantly improve uncertainty estimation, calibration, and extrapolation capabilities. In this paper, we systematically explore anchoring as a general protocol for training vision models, providing fundamental insights into its training and inference processes and their implications…

    Submitted 1 June, 2024; originally announced June 2024.

  4. arXiv:2404.08761  [pdf, ps, other]

    cs.CV cs.LG

    'Eyes of a Hawk and Ears of a Fox': Part Prototype Network for Generalized Zero-Shot Learning

    Authors: Joshua Feinglass, Jayaraman J. Thiagarajan, Rushil Anirudh, T. S. Jayram, Yezhou Yang

    Abstract: Current approaches in Generalized Zero-Shot Learning (GZSL) are built upon base models which consider only a single class attribute vector representation over the entire image. This is an oversimplification of the process of novel category recognition, where different regions of the image may have properties from different seen classes and thus have different predominant attributes. With this in m…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: Accepted to the CVPR 2024 LIMIT Workshop

  5. arXiv:2401.03350  [pdf, other]

    cs.LG stat.ML

    Accurate and Scalable Estimation of Epistemic Uncertainty for Graph Neural Networks

    Authors: Puja Trivedi, Mark Heimann, Rushil Anirudh, Danai Koutra, Jayaraman J. Thiagarajan

    Abstract: While graph neural networks (GNNs) are widely used for node and graph representation learning tasks, the reliability of GNN uncertainty estimates under distribution shifts remains relatively under-explored. Indeed, while post-hoc calibration strategies can be used to improve in-distribution calibration, they need not also improve calibration under distribution shift. However, techniques which prod…

    Submitted 6 January, 2024; originally announced January 2024.

    Comments: 33 pages; 10 Figures. arXiv admin note: text overlap with arXiv:2309.10976

  6. arXiv:2312.03642  [pdf, other]

    cs.LG

    Transformer-Powered Surrogates Close the ICF Simulation-Experiment Gap with Extremely Limited Data

    Authors: Matthew L. Olson, Shusen Liu, Jayaraman J. Thiagarajan, Bogdan Kustowski, Weng-Keen Wong, Rushil Anirudh

    Abstract: Recent advances in machine learning, specifically the transformer architecture, have led to significant advancements in commercial domains. These powerful models have demonstrated superior capability to learn complex relationships and often generalize better to new data and problems. This paper presents a novel transformer-powered approach for enhancing prediction accuracy in multi-modal output scenar…

    Submitted 28 May, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

    Comments: MLST

  7. arXiv:2309.10977  [pdf, other]

    cs.LG stat.ML

    PAGER: A Framework for Failure Analysis of Deep Regression Models

    Authors: Jayaraman J. Thiagarajan, Vivek Narayanaswamy, Puja Trivedi, Rushil Anirudh

    Abstract: Safe deployment of AI models requires proactive detection of failures to prevent costly errors. To this end, we study the important problem of detecting failures in deep regression models. Existing approaches rely on epistemic uncertainty estimates or inconsistency w.r.t. the training data to identify failure. Interestingly, we find that while uncertainties are necessary, they are insufficient to ac…

    Submitted 1 June, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

    Comments: Published at ICML 2024

  8. arXiv:2309.10976  [pdf, other]

    cs.LG

    Accurate and Scalable Estimation of Epistemic Uncertainty for Graph Neural Networks

    Authors: Puja Trivedi, Mark Heimann, Rushil Anirudh, Danai Koutra, Jayaraman J. Thiagarajan

    Abstract: Safe deployment of graph neural networks (GNNs) under distribution shift requires models to provide accurate confidence indicators (CI). However, while it is well-known in computer vision that CI quality diminishes under distribution shift, this behavior remains understudied for GNNs. Hence, we begin with a case study on CI calibration under controlled structural and feature distribution shifts an…

    Submitted 19 September, 2023; originally announced September 2023.

    Comments: 22 pages, 11 figures

  9. arXiv:2307.04838  [pdf, other]

    cs.CV cs.LG

    CREPE: Learnable Prompting With CLIP Improves Visual Relationship Prediction

    Authors: Rakshith Subramanyam, T. S. Jayram, Rushil Anirudh, Jayaraman J. Thiagarajan

    Abstract: In this paper, we explore the potential of Vision-Language Models (VLMs), specifically CLIP, in predicting visual object relationships, which involves interpreting visual features from images into language-based relations. Current state-of-the-art methods use complex graphical models that utilize language cues and visual features to address this challenge. We hypothesize that the strong language p…

    Submitted 19 July, 2023; v1 submitted 10 July, 2023; originally announced July 2023.

  10. arXiv:2305.13284  [pdf, other]

    cs.CV cs.AI

    Target-Aware Generative Augmentations for Single-Shot Adaptation

    Authors: Kowshik Thopalli, Rakshith Subramanyam, Pavan Turaga, Jayaraman J. Thiagarajan

    Abstract: In this paper, we address the problem of adapting models from a source domain to a target domain, a task that has become increasingly important due to the brittle generalization of deep neural networks. While several test-time adaptation techniques have emerged, they typically rely on synthetic toolbox data augmentations in cases of limited target data availability. We consider the challenging set…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted at the International Conference on Machine Learning (ICML) 2023

  11. arXiv:2303.13589  [pdf, other]

    cs.LG stat.ML

    On the Efficacy of Generalization Error Prediction Scoring Functions

    Authors: Puja Trivedi, Danai Koutra, Jayaraman J. Thiagarajan

    Abstract: Generalization error predictors (GEPs) aim to predict model performance on unseen distributions by deriving dataset-level error estimates from sample-level scores. However, GEPs often utilize disparate mechanisms (e.g., regressors, thresholding functions, calibration datasets, etc.) to derive such error estimates, which can obfuscate the benefits of a particular scoring function. Therefore, in thi…

    Submitted 29 May, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

    Comments: Accepted to ICASSP 2023. (Previous title: A Closer Look at Scoring Functions and Generalization Prediction.)

  12. arXiv:2303.13500  [pdf, other]

    cs.LG

    A Closer Look at Model Adaptation using Feature Distortion and Simplicity Bias

    Authors: Puja Trivedi, Danai Koutra, Jayaraman J. Thiagarajan

    Abstract: Advances in the expressivity of pretrained models have increased interest in the design of adaptation protocols which enable safe and effective transfer learning. Going beyond conventional linear probing (LP) and fine tuning (FT) strategies, protocols that can effectively control feature distortion, i.e., the failure to update features orthogonal to the in-distribution, have been found to achieve…

    Submitted 23 March, 2023; originally announced March 2023.

    Comments: Accepted to ICLR 2023 as notable-25% (spotlight)

  13. arXiv:2303.10774  [pdf, other]

    cs.LG cs.CV

    Cross-GAN Auditing: Unsupervised Identification of Attribute Level Similarities and Differences between Pretrained Generative Models

    Authors: Matthew L. Olson, Shusen Liu, Rushil Anirudh, Jayaraman J. Thiagarajan, Peer-Timo Bremer, Weng-Keen Wong

    Abstract: Generative Adversarial Networks (GANs) are notoriously difficult to train, especially for complex distributions and with limited data. This has driven the need for tools to audit trained networks in human intelligible format, for example, to identify biases or ensure fairness. Existing GAN audit tools are restricted to coarse-grained, model-data comparisons based on summary statistics such as FID o…

    Submitted 2 May, 2023; v1 submitted 19 March, 2023; originally announced March 2023.

    Comments: CVPR 2023. Source code is available at https://github.com/mattolson93/cross_gan_auditing

  14. arXiv:2211.12340  [pdf, other]

    eess.IV cs.CV

    DOLCE: A Model-Based Probabilistic Diffusion Framework for Limited-Angle CT Reconstruction

    Authors: Jiaming Liu, Rushil Anirudh, Jayaraman J. Thiagarajan, Stewart He, K. Aditya Mohan, Ulugbek S. Kamilov, Hyo** Kim

    Abstract: Limited-Angle Computed Tomography (LACT) is a non-destructive evaluation technique used in a variety of applications ranging from security to medicine. The limited angle coverage in LACT is often a dominant source of severe artifacts in the reconstructed images, making it a challenging inverse problem. We present DOLCE, a new deep model-based framework for LACT that uses a conditional diffusion mo…

    Submitted 22 November, 2022; originally announced November 2022.

    Comments: 29 pages, 21 figures

  15. arXiv:2210.16742  [pdf, other]

    cs.CV cs.AI cs.LG

    On-the-fly Object Detection using StyleGAN with CLIP Guidance

    Authors: Yuzhe Lu, Shusen Liu, Jayaraman J. Thiagarajan, Wesam Sakla, Rushil Anirudh

    Abstract: We present a fully automated framework for building object detectors on satellite imagery without requiring any human annotation or intervention. We achieve this by leveraging the combined power of modern generative models (e.g., StyleGAN) and recent advances in multi-modal learning (e.g., CLIP). While deep generative models effectively encode the key semantics pertinent to a data distribution, th…

    Submitted 30 October, 2022; originally announced October 2022.

  16. arXiv:2210.16692  [pdf, other]

    cs.CV cs.LG stat.ML

    Single-Shot Domain Adaptation via Target-Aware Generative Augmentation

    Authors: Rakshith Subramanyam, Kowshik Thopalli, Spring Berman, Pavan Turaga, Jayaraman J. Thiagarajan

    Abstract: The problem of adapting models from a source domain using data from any target domain of interest has gained prominence, thanks to the brittle generalization in deep neural networks. While several test-time adaptation techniques have emerged, they typically rely on synthetic data augmentations in cases of limited target data availability. In this paper, we consider the challenging setting of singl…

    Submitted 29 October, 2022; originally announced October 2022.

  17. arXiv:2208.02810  [pdf, other]

    cs.LG

    Analyzing Data-Centric Properties for Graph Contrastive Learning

    Authors: Puja Trivedi, Ekdeep Singh Lubana, Mark Heimann, Danai Koutra, Jayaraman J. Thiagarajan

    Abstract: Recent analyses of self-supervised learning (SSL) find the following data-centric properties to be critical for learning good representations: invariance to task-irrelevant semantics, separability of classes in some latent space, and recoverability of labels from augmented samples. However, given their discrete, non-Euclidean nature, graph datasets and graph SSL methods are unlikely to satisfy the…

    Submitted 22 January, 2023; v1 submitted 4 August, 2022; originally announced August 2022.

    Comments: Accepted to NeurIPS 2022

  18. arXiv:2207.12615  [pdf, other]

    cs.LG

    Exploring the Design of Adaptation Protocols for Improved Generalization and Machine Learning Safety

    Authors: Puja Trivedi, Danai Koutra, Jayaraman J. Thiagarajan

    Abstract: While directly fine-tuning (FT) large-scale, pretrained models on task-specific data is well-known to induce strong in-distribution task performance, recent works have demonstrated that different adaptation protocols, such as linear probing (LP) prior to FT, can improve out-of-distribution generalization. However, the design space of such adaptation protocols remains under-explored and the evaluat…

    Submitted 25 July, 2022; originally announced July 2022.

    Comments: Principles of Distribution Shift (PODS) Workshop at ICML 2022, 4 pages, 2 figures

  19. arXiv:2207.12346  [pdf, other]

    cs.LG

    Contrastive Knowledge-Augmented Meta-Learning for Few-Shot Classification

    Authors: Rakshith Subramanyam, Mark Heimann, Jayram Thathachar, Rushil Anirudh, Jayaraman J. Thiagarajan

    Abstract: Model agnostic meta-learning algorithms aim to infer priors from several observed tasks that can then be used to adapt to a new task with few examples. Given the inherent diversity of tasks arising in existing benchmarks, recent methods use separate, learnable structure, such as hierarchies or graphs, for enabling task-specific adaptation of the prior. While these approaches have produced signific…

    Submitted 25 July, 2022; originally announced July 2022.

  20. arXiv:2207.07235  [pdf, other]

    cs.LG cs.CV stat.ML

    Single Model Uncertainty Estimation via Stochastic Data Centering

    Authors: Jayaraman J. Thiagarajan, Rushil Anirudh, Vivek Narayanaswamy, Peer-Timo Bremer

    Abstract: We are interested in estimating the uncertainties of deep neural networks, which play an important role in many scientific and engineering problems. In this paper, we present a striking new finding that an ensemble of neural networks with the same weight initialization, trained on datasets that are shifted by a constant bias, gives rise to slightly inconsistent trained models, where the differences…

    Submitted 1 December, 2022; v1 submitted 14 July, 2022; originally announced July 2022.

    Comments: Spotlight at NeurIPS 2022

  21. arXiv:2207.05286  [pdf, other]

    cs.CV cs.LG

    Know Your Space: Inlier and Outlier Construction for Calibrating Medical OOD Detectors

    Authors: Vivek Narayanaswamy, Yamen Mubarka, Rushil Anirudh, Deepta Rajan, Andreas Spanias, Jayaraman J. Thiagarajan

    Abstract: We focus on the problem of producing well-calibrated out-of-distribution (OOD) detectors, in order to enable safe deployment of medical image classifiers. Motivated by the difficulty of curating suitable calibration datasets, synthetic augmentations have become highly prevalent for inlier/outlier specification. While there have been rapid advances in data augmentation techniques, this paper makes…

    Submitted 22 April, 2023; v1 submitted 11 July, 2022; originally announced July 2022.

  22. arXiv:2207.04185  [pdf, other]

    cs.CV cs.LG

    Domain Alignment Meets Fully Test-Time Adaptation

    Authors: Kowshik Thopalli, Pavan Turaga, Jayaraman J. Thiagarajan

    Abstract: A foundational requirement of a deployed ML model is to generalize to data drawn from a testing distribution that is different from training. A popular solution to this problem is to adapt a pre-trained model to novel domains using only unlabeled data. In this paper, we focus on a challenging variant of this problem, where access to the original source data is restricted. While fully test-time ada…

    Submitted 8 July, 2022; originally announced July 2022.

    Comments: 16 Pages including references, 5 figures

  23. arXiv:2207.04125  [pdf, other]

    cs.LG cs.AI cs.CV

    Out of Distribution Detection via Neural Network Anchoring

    Authors: Rushil Anirudh, Jayaraman J. Thiagarajan

    Abstract: Our goal in this paper is to exploit heteroscedastic temperature scaling as a calibration strategy for out of distribution (OOD) detection. Heteroscedasticity here refers to the fact that the optimal temperature parameter for each sample can be different, as opposed to conventional approaches that use the same value for the entire distribution. To enable this, we propose a new training strategy ca…

    Submitted 1 December, 2022; v1 submitted 8 July, 2022; originally announced July 2022.

    Comments: ACML 2022

  24. arXiv:2206.07736  [pdf, other]

    cs.LG cs.CV

    Improving Diversity with Adversarially Learned Transformations for Domain Generalization

    Authors: Tejas Gokhale, Rushil Anirudh, Jayaraman J. Thiagarajan, Bhavya Kailkhura, Chitta Baral, Yezhou Yang

    Abstract: To be successful in single source domain generalization, maximizing diversity of synthesized domains has emerged as one of the most effective strategies. Many of the recent successes have come from methods that pre-specify the types of diversity that a model is exposed to during training, so that it can ultimately generalize well to new domains. However, naïve diversity based augmentations do not…

    Submitted 12 December, 2022; v1 submitted 15 June, 2022; originally announced June 2022.

    Comments: WACV 2023. Code: https://github.com/tejas-gokhale/ALT

  25. 2022 Review of Data-Driven Plasma Science

    Authors: Rushil Anirudh, Rick Archibald, M. Salman Asif, Markus M. Becker, Sadruddin Benkadda, Peer-Timo Bremer, Rick H. S. Budé, C. S. Chang, Lei Chen, R. M. Churchill, Jonathan Citrin, Jim A Gaffney, Ana Gainaru, Walter Gekelman, Tom Gibbs, Satoshi Hamaguchi, Christian Hill, Kelli Humbird, Sören Jalas, Satoru Kawaguchi, Gon-Ho Kim, Manuel Kirchen, Scott Klasky, John L. Kline, Karl Krushelnick, et al. (38 additional authors not shown)

    Abstract: Data science and technology offer transformative tools and methods to science. This review article highlights the latest developments and progress in the interdisciplinary field of data-driven plasma science (DDPS). A large amount of data and machine learning algorithms go hand in hand. Most plasma data, whether experimental, observational or computational, are generated or collected by machines today.…

    Submitted 31 May, 2022; originally announced May 2022.

    Comments: 112 pages (including 700+ references), 44 figures, submitted to IEEE Transactions on Plasma Science as a part of the IEEE Golden Anniversary Special Issue

    Report number: Los Alamos Report number LA-UR-22-24834

    Journal ref: IEEE Transactions on Plasma Science 51, 1750 - 1838 (2023)

  26. arXiv:2201.01806  [pdf, other]

    cs.LG cs.CV

    Revisiting Deep Subspace Alignment for Unsupervised Domain Adaptation

    Authors: Kowshik Thopalli, Jayaraman J Thiagarajan, Rushil Anirudh, Pavan K Turaga

    Abstract: Unsupervised domain adaptation (UDA) aims to transfer and adapt knowledge from a labeled source domain to an unlabeled target domain. Traditionally, subspace-based methods form an important class of solutions to this problem. Despite their mathematical elegance and tractability, these methods are often found to be ineffective at producing domain-invariant features with complex, real-world datasets…

    Submitted 5 January, 2022; originally announced January 2022.

    Comments: arXiv admin note: text overlap with arXiv:1906.04338

  27. arXiv:2112.09802  [pdf, other]

    cs.LG cs.CV

    Automated Domain Discovery from Multiple Sources to Improve Zero-Shot Generalization

    Authors: Kowshik Thopalli, Sameeksha Katoch, Pavan Turaga, Jayaraman J. Thiagarajan

    Abstract: Domain generalization (DG) methods aim to develop models that generalize to settings where the test distribution is different from the training data. In this paper, we focus on the challenging problem of multi-source zero shot DG (MDG), where labeled training data from multiple source domains is available but with no access to data from the target domain. A wide range of solutions have been propos…

    Submitted 3 November, 2022; v1 submitted 17 December, 2021; originally announced December 2021.

  28. arXiv:2111.12798  [pdf, other]

    cs.LG cs.CV

    Geometric Priors for Scientific Generative Models in Inertial Confinement Fusion

    Authors: Ankita Shukla, Rushil Anirudh, Eugene Kur, Jayaraman J. Thiagarajan, Peer-Timo Bremer, Brian K. Spears, Tammy Ma, Pavan Turaga

    Abstract: In this paper, we develop a Wasserstein autoencoder (WAE) with a hyperspherical prior for multimodal data in the application of inertial confinement fusion. Unlike a typical hyperspherical generative model that requires computationally inefficient sampling from distributions like the von Mises-Fisher, we sample from a normal distribution followed by a projection layer before the generator. Finally,…

    Submitted 24 November, 2021; originally announced November 2021.

    Comments: 5 pages, 4 figures, Fourth Workshop on Machine Learning and the Physical Sciences, NeurIPS 2021

  29. arXiv:2110.02197  [pdf, other]

    cs.LG cs.CV stat.ML

    $Δ$-UQ: Accurate Uncertainty Quantification via Anchor Marginalization

    Authors: Rushil Anirudh, Jayaraman J. Thiagarajan

    Abstract: We present $Δ$-UQ -- a novel, general-purpose uncertainty estimator using the concept of anchoring in predictive models. Anchoring works by first transforming the input into a tuple consisting of an anchor point drawn from a prior distribution, and a combination of the input sample with the anchor using a pretext encoding scheme. This encoding is such that the original input can be perfectly recov…

    Submitted 5 October, 2021; originally announced October 2021.

  30. arXiv:2110.01406  [pdf]

    cs.LG cs.DC cs.PF cs.SE

    MedPerf: Open Benchmarking Platform for Medical Artificial Intelligence using Federated Evaluation

    Authors: Alexandros Karargyris, Renato Umeton, Micah J. Sheller, Alejandro Aristizabal, Johnu George, Srini Bala, Daniel J. Beutel, Victor Bittorf, Akshay Chaudhari, Alexander Chowdhury, Cody Coleman, Bala Desinghu, Gregory Diamos, Debo Dutta, Diane Feddema, Grigori Fursin, Junyi Guo, Xinyuan Huang, David Kanter, Satyananda Kashyap, Nicholas Lane, Indranil Mallick, Pietro Mascagni, Virendra Mehta, Vivek Natarajan, et al. (17 additional authors not shown)

    Abstract: Medical AI has tremendous potential to advance healthcare by supporting the evidence-based practice of medicine, personalizing patient treatment, reducing costs, and improving provider and patient experience. We argue that unlocking this potential requires a systematic way to measure the performance of medical AI models on large-scale heterogeneous data. To meet this need, we are building MedPerf,…

    Submitted 28 December, 2021; v1 submitted 29 September, 2021; originally announced October 2021.

  31. arXiv:2109.14274  [pdf, other]

    cs.LG cs.CV

    Designing Counterfactual Generators using Deep Model Inversion

    Authors: Jayaraman J. Thiagarajan, Vivek Narayanaswamy, Deepta Rajan, Jason Liang, Akshay Chaudhari, Andreas Spanias

    Abstract: Explanation techniques that synthesize small, interpretable changes to a given image while producing desired changes in the model prediction have become popular for introspecting black-box models. Commonly referred to as counterfactuals, the synthesized explanations are required to contain discernible changes (for easy interpretability) while also being realistic (consistency to the data manifold)…

    Submitted 5 October, 2021; v1 submitted 29 September, 2021; originally announced September 2021.

    Comments: NeurIPS 2021

  32. arXiv:2104.09684  [pdf, other]

    cs.LG

    Suppressing simulation bias using multi-modal data

    Authors: Bogdan Kustowski, Jim A. Gaffney, Brian K. Spears, Gemma J. Anderson, Rushil Anirudh, Peer-Timo Bremer, Jayaraman J. Thiagarajan, Michael K. G. Kruse, Ryan C. Nora

    Abstract: Many problems in science and engineering require making predictions based on few observations. To build a robust predictive model, these sparse data may need to be augmented with simulated data, especially when the design space is multi-dimensional. Simulations, however, often suffer from an inherent bias. Estimation of this bias may be poorly constrained not only because of data sparsity, but als…

    Submitted 15 March, 2022; v1 submitted 19 April, 2021; originally announced April 2021.

    Report number: LLNL-JRNL-829622

  33. arXiv:2104.07161  [pdf, other]

    cs.SD cs.LG eess.AS

    On the Design of Deep Priors for Unsupervised Audio Restoration

    Authors: Vivek Sivaraman Narayanaswamy, Jayaraman J. Thiagarajan, Andreas Spanias

    Abstract: Unsupervised deep learning methods for solving audio restoration problems extensively rely on carefully tailored neural architectures that carry strong inductive biases for defining priors in the time or spectral domain. In this context, a lot of recent success has been achieved with sophisticated convolutional network constructions that recover audio signals in the spectral domain. However, in prac…

    Submitted 14 April, 2021; originally announced April 2021.

  34. arXiv:2103.03788  [pdf, other]

    cs.LG stat.ML

    Loss Estimators Improve Model Generalization

    Authors: Vivek Narayanaswamy, Jayaraman J. Thiagarajan, Deepta Rajan, Andreas Spanias

    Abstract: With increased interest in adopting AI methods for clinical diagnosis, a vital step towards safe deployment of such tools is to ensure that the models not only produce accurate predictions but also do not generalize to data regimes where the training data provide no meaningful evidence. Existing approaches for ensuring the distribution of model predictions to be similar to that of the true distrib…

    Submitted 5 March, 2021; originally announced March 2021.

  35. arXiv:2102.07660  [pdf, other]

    cs.LG cs.PF

    Comparative Code Structure Analysis using Deep Learning for Performance Prediction

    Authors: Nathan Pinnow, Tarek Ramadan, Tanzima Z. Islam, Chase Phelps, Jayaraman J. Thiagarajan

    Abstract: Performance analysis has always been an afterthought during the application development process, focusing on application correctness first. The learning curve of the existing static and dynamic analysis tools is steep, requiring an understanding of low-level details to interpret the findings for actionable optimizations. Additionally, application performance is a function of an infinite number of…

    Submitted 21 April, 2021; v1 submitted 12 February, 2021; originally announced February 2021.

    Comments: 11 pages, To appear in proceedings of the International Symposium on Performance Analysis of Systems and Software (ISPASS) 2021 conference

  36. arXiv:2012.01806  [pdf, other]

    cs.CV cs.LG

    Attribute-Guided Adversarial Training for Robustness to Natural Perturbations

    Authors: Tejas Gokhale, Rushil Anirudh, Bhavya Kailkhura, Jayaraman J. Thiagarajan, Chitta Baral, Yezhou Yang

    Abstract: While existing work in robust deep learning has focused on small pixel-level norm-based perturbations, this may not account for perturbations encountered in several real-world settings. In many such cases, although test data might not be available, broad specifications about the types of perturbations (such as an unknown degree of rotation) may be known. We consider a setup where robustness is expe…

    Submitted 7 April, 2021; v1 submitted 3 December, 2020; originally announced December 2020.

    Comments: AAAI 2021. Camera Ready version + Appendix

  37. arXiv:2010.13749  [pdf, other]

    stat.ML cs.LG physics.plasm-ph

    Meaningful uncertainties from deep neural network surrogates of large-scale numerical simulations

    Authors: Gemma J. Anderson, Jim A. Gaffney, Brian K. Spears, Peer-Timo Bremer, Rushil Anirudh, Jayaraman J. Thiagarajan

    Abstract: Large-scale numerical simulations are used across many scientific disciplines to facilitate experimental development and provide insights into underlying physical processes, but they come with a significant computational cost. Deep neural networks (DNNs) can serve as highly-accurate surrogate models, with the capacity to handle diverse datatypes, offering tremendous speed-ups for prediction and ma…

    Submitted 26 October, 2020; originally announced October 2020.

  38. arXiv:2010.12046  [pdf, other]

    cs.LG cs.CV

    Using Deep Image Priors to Generate Counterfactual Explanations

    Authors: Vivek Narayanaswamy, Jayaraman J. Thiagarajan, Andreas Spanias

    Abstract: Through the use of carefully tailored convolutional neural network architectures, a deep image prior (DIP) can be used to obtain pre-images from latent representation encodings. Though DIP inversion has been known to be superior to conventional regularized inversion strategies such as total variation, such an over-parameterized generator is able to effectively reconstruct even images that are not…

    Submitted 22 October, 2020; originally announced October 2020.

  39. arXiv:2010.08478  [pdf, other]

    cs.LG cs.CY

    Machine Learning-Powered Mitigation Policy Optimization in Epidemiological Models

    Authors: Jayaraman J. Thiagarajan, Peer-Timo Bremer, Rushil Anirudh, Timothy C. Germann, Sara Y. Del Valle, Frederick H. Streitz

    Abstract: A crucial aspect of managing a public health crisis is to effectively balance prevention and mitigation strategies, while taking their socio-economic impact into account. In particular, determining the influence of different non-pharmaceutical interventions (NPIs) on the effective use of public resources is an important problem, given the uncertainties on when a vaccine will be made available. In…

    Submitted 16 October, 2020; originally announced October 2020.

  40. arXiv:2010.06558  [pdf, other]

    cs.LG

    Accurate Calibration of Agent-based Epidemiological Models with Neural Network Surrogates

    Authors: Rushil Anirudh, Jayaraman J. Thiagarajan, Peer-Timo Bremer, Timothy C. Germann, Sara Y. Del Valle, Frederick H. Streitz

    Abstract: Calibrating complex epidemiological models to observed data is a crucial step to provide both insights into the current disease dynamics, i.e. by estimating a reproductive number, as well as to provide reliable forecasts and scenario explorations. Here we present a new approach to calibrate an agent-based model -- EpiCast -- using a large set of simulation ensembles for different major metropolit…

    Submitted 13 October, 2020; originally announced October 2020.

  41. arXiv:2009.14455  [pdf, other]

    stat.ML cs.CR cs.LG

    Uncertainty-Matching Graph Neural Networks to Defend Against Poisoning Attacks

    Authors: Uday Shankar Shanthamallu, Jayaraman J. Thiagarajan, Andreas Spanias

    Abstract: Graph Neural Networks (GNNs), a generalization of neural networks to graph-structured data, are often implemented using message passes between entities of a graph. While GNNs are effective for node classification, link prediction and graph classification, they are vulnerable to adversarial attacks, i.e., a small perturbation to the structure can lead to a non-trivial performance degradation. In th…

    Submitted 30 September, 2020; originally announced September 2020.

  42. arXiv:2009.14454  [pdf, other]

    stat.ML cs.LG

    Accurate and Robust Feature Importance Estimation under Distribution Shifts

    Authors: Jayaraman J. Thiagarajan, Vivek Narayanaswamy, Rushil Anirudh, Peer-Timo Bremer, Andreas Spanias

    Abstract: With increasing reliance on the outcomes of black-box models in critical applications, post-hoc explainability tools that do not require access to the model internals are often used to enable humans to understand and trust these models. In particular, we focus on the class of methods that can reveal the influence of input features on the predicted outputs. Despite their widespread adoption, existing…

    Submitted 30 September, 2020; originally announced September 2020.

  43. arXiv:2009.14448  [pdf, other]

    stat.ML cs.CV cs.LG

    Ask-n-Learn: Active Learning via Reliable Gradient Representations for Image Classification

    Authors: Bindya Venkatesh, Jayaraman J. Thiagarajan

    Abstract: Deep predictive models rely on human supervision in the form of labeled training data. Obtaining large amounts of annotated training data can be expensive and time-consuming, and this becomes a critical bottleneck while building such models in practice. In such scenarios, active learning (AL) strategies are used to achieve faster convergence in terms of labeling efforts. Existing active learning e…

    Submitted 30 September, 2020; originally announced September 2020.

  44. arXiv:2005.13769  [pdf, other]

    eess.AS cs.SD stat.ML

    Unsupervised Audio Source Separation using Generative Priors

    Authors: Vivek Narayanaswamy, Jayaraman J. Thiagarajan, Rushil Anirudh, Andreas Spanias

    Abstract: State-of-the-art under-determined audio source separation systems rely on supervised end-to-end training of carefully tailored neural network architectures operating either in the time or the spectral domain. However, these methods are severely challenged in terms of requiring access to expensive source-level labeled data and being specific to a given set of sources and the mixing process, which dema…

    Submitted 27 May, 2020; originally announced May 2020.

    Comments: 5 pages, 2 figures

  45. arXiv:2005.02328  [pdf, other]

    stat.ML cs.LG physics.data-an

    Designing Accurate Emulators for Scientific Processes using Calibration-Driven Deep Models

    Authors: Jayaraman J. Thiagarajan, Bindya Venkatesh, Rushil Anirudh, Peer-Timo Bremer, Jim Gaffney, Gemma Anderson, Brian Spears

    Abstract: Predictive models that accurately emulate complex scientific processes can achieve exponential speed-ups over numerical simulators or experiments, and at the same time provide surrogates for improving the subsequent analysis. Consequently, there is a recent surge in utilizing modern machine learning (ML) methods, such as deep neural networks, to build data-driven emulators. While the majority of e…

    Submitted 5 May, 2020; originally announced May 2020.

  46. arXiv:2005.02231  [pdf, other]

    cs.CV cs.LG

    Self-Training with Improved Regularization for Sample-Efficient Chest X-Ray Classification

    Authors: Deepta Rajan, Jayaraman J. Thiagarajan, Alexandros Karargyris, Satyananda Kashyap

    Abstract: Automated diagnostic assistants in healthcare necessitate accurate AI models that can be trained with limited labeled data, can cope with severe class imbalances and can support simultaneous prediction of multiple disease conditions. To this end, we present a deep learning framework that utilizes a number of key components to enable robust modeling in such challenging scenarios. Using an important…

    Submitted 10 February, 2021; v1 submitted 2 May, 2020; originally announced May 2020.

  47. arXiv:2004.14480  [pdf, other]

    cs.LG stat.ML

    Calibrating Healthcare AI: Towards Reliable and Interpretable Deep Predictive Models

    Authors: Jayaraman J. Thiagarajan, Prasanna Sattigeri, Deepta Rajan, Bindya Venkatesh

    Abstract: The widespread adoption of representation learning technologies in clinical decision-making strongly emphasizes the need for characterizing model reliability and enabling rigorous introspection of model behavior. While the former need is often addressed by incorporating uncertainty quantification strategies, the latter challenge is addressed using a broad class of interpretability techniques. In…

    Submitted 27 April, 2020; originally announced April 2020.

  48. arXiv:2002.03875  [pdf, other]

    stat.ML cs.LG

    Calibrate and Prune: Improving Reliability of Lottery Tickets Through Prediction Calibration

    Authors: Bindya Venkatesh, Jayaraman J. Thiagarajan, Kowshik Thopalli, Prasanna Sattigeri

    Abstract: The hypothesis that sub-network initializations (lottery) exist within the initializations of over-parameterized networks, which when trained in isolation produce highly generalizable models, has led to crucial insights into network initialization and has enabled efficient inferencing. Supervised models with uncalibrated confidences tend to be overconfident even when making wrong predictions. In th…

    Submitted 30 September, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

  49. arXiv:1912.08113  [pdf, other]

    cs.LG cs.CV physics.comp-ph stat.ML

    Improved Surrogates in Inertial Confinement Fusion with Manifold and Cycle Consistencies

    Authors: Rushil Anirudh, Jayaraman J. Thiagarajan, Peer-Timo Bremer, Brian K. Spears

    Abstract: Neural networks have become very popular in surrogate modeling because of their ability to characterize arbitrary, high-dimensional functions in a data-driven fashion. This paper advocates for the training of surrogates that are consistent with the physical manifold -- i.e., predictions are always physically meaningful, and are cyclically consistent -- i.e., when the predictions of the surrogate,…

    Submitted 17 December, 2019; originally announced December 2019.

    Comments: 10 pages, 6 figures

  50. arXiv:1912.07748  [pdf, other]

    cs.CV cs.LG stat.ML

    MimicGAN: Robust Projection onto Image Manifolds with Corruption Mimicking

    Authors: Rushil Anirudh, Jayaraman J. Thiagarajan, Bhavya Kailkhura, Timo Bremer

    Abstract: In the past few years, Generative Adversarial Networks (GANs) have dramatically advanced our ability to represent and parameterize high-dimensional, non-linear image manifolds. As a result, they have been widely adopted across a variety of applications, ranging from challenging inverse problems like image completion, to problems such as anomaly detection and adversarial defense. A recurring theme…

    Submitted 30 April, 2020; v1 submitted 16 December, 2019; originally announced December 2019.

    Comments: International Journal of Computer Vision (IJCV) Special Issue on GANs