
Showing 1–14 of 14 results for author: Masoomi, A

  1. arXiv:2311.00858  [pdf, other]

    cs.LG

    SmoothHess: ReLU Network Feature Interactions via Stein's Lemma

    Authors: Max Torop, Aria Masoomi, Davin Hill, Kivanc Kose, Stratis Ioannidis, Jennifer Dy

    Abstract: Several recent methods for interpretability model feature interactions by looking at the Hessian of a neural network. This poses a challenge for ReLU networks, which are piecewise-linear and thus have a zero Hessian almost everywhere. We propose SmoothHess, a method of estimating second-order interactions through Stein's Lemma. In particular, we estimate the Hessian of the network convolved with a…

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: Accepted to NeurIPS 2023 as a conference paper

  2. arXiv:2304.07670  [pdf, other]

    cs.LG

    Explanations of Black-Box Models based on Directional Feature Interactions

    Authors: Aria Masoomi, Davin Hill, Zhonghui Xu, Craig P Hersh, Edwin K. Silverman, Peter J. Castaldi, Stratis Ioannidis, Jennifer Dy

    Abstract: As machine learning algorithms are deployed ubiquitously to a variety of domains, it is imperative to make these often black-box models transparent. Several recent works explain black-box models by capturing the most influential features for prediction per instance; such explanation methods are univariate, as they characterize importance per feature. We extend univariate explanation to a higher-or…

    Submitted 15 April, 2023; originally announced April 2023.

    Journal ref: International Conference on Learning Representations, 2022

  3. arXiv:2302.04411  [pdf, other]

    cs.LG cs.AI

    Geometry of Score Based Generative Models

    Authors: Sandesh Ghimire, Jinyang Liu, Armand Comas, Davin Hill, Aria Masoomi, Octavia Camps, Jennifer Dy

    Abstract: In this work, we look at Score-based generative models (also called diffusion generative models) from a geometric perspective. From a new view point, we prove that both the forward and backward process of adding noise and generating from noise are Wasserstein gradient flow in the space of probability measures. We are the first to prove this connection. Our understanding of Score-based (and Diffusi…

    Submitted 8 February, 2023; originally announced February 2023.

  4. arXiv:2302.02272  [pdf, other]

    cs.CV

    Divide and Compose with Score Based Generative Models

    Authors: Sandesh Ghimire, Armand Comas, Davin Hill, Aria Masoomi, Octavia Camps, Jennifer Dy

    Abstract: While score based generative models, or diffusion models, have found success in image synthesis, they are often coupled with text data or image label to be able to manipulate and conditionally generate images. Even though manipulation of images by changing the text prompt is possible, our understanding of the text embedding and our ability to modify it to edit images is quite limited. Towards the…

    Submitted 4 February, 2023; originally announced February 2023.

  5. arXiv:2211.06780  [pdf, other]

    cs.LG cs.CV

    Inv-SENnet: Invariant Self Expression Network for clustering under biased data

    Authors: Ashutosh Singh, Ashish Singh, Aria Masoomi, Tales Imbiriba, Erik Learned-Miller, Deniz Erdogmus

    Abstract: Subspace clustering algorithms are used for understanding the cluster structure that explains the dataset well. These methods are extensively used for data-exploration tasks in various areas of Natural Sciences. However, most of these methods fail to handle unwanted biases in datasets. For datasets where a data sample represents multiple attributes, naively applying any clustering approach can res…

    Submitted 12 November, 2022; originally announced November 2022.

  6. arXiv:2210.02419  [pdf, other]

    cs.LG

    Boundary-Aware Uncertainty for Feature Attribution Explainers

    Authors: Davin Hill, Aria Masoomi, Max Torop, Sandesh Ghimire, Jennifer Dy

    Abstract: Post-hoc explanation methods have become a critical tool for understanding black-box classifiers in high-stakes applications. However, high-performing classifiers are often highly nonlinear and can exhibit complex behavior around the decision boundary, leading to brittle or misleading local explanations. Therefore there is an impending need to quantify the uncertainty of such explanation methods i…

    Submitted 4 March, 2024; v1 submitted 5 October, 2022; originally announced October 2022.

    Comments: Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS) 2024

  7. arXiv:2206.12481  [pdf, other]

    cs.LG

    Analyzing Explainer Robustness via Probabilistic Lipschitzness of Prediction Functions

    Authors: Zulqarnain Khan, Davin Hill, Aria Masoomi, Joshua Bone, Jennifer Dy

    Abstract: Machine learning methods have significantly improved in their predictive capabilities, but at the same time they are becoming more complex and less transparent. As a result, explainers are often relied on to provide interpretability to these black-box prediction models. As crucial diagnostics tools, it is important that these explainers themselves are robust. In this paper we focus on one particul…

    Submitted 16 April, 2024; v1 submitted 24 June, 2022; originally announced June 2022.

  8. arXiv:2202.01210   

    stat.ML cs.LG math.ST

    Deep Layer-wise Networks Have Closed-Form Weights

    Authors: Chieh Wu, Aria Masoomi, Arthur Gretton, Jennifer Dy

    Abstract: There is currently a debate within the neuroscience community over the likelihood of the brain performing backpropagation (BP). To better mimic the brain, training a network \textit{one layer at a time} with only a "single forward pass" has been proposed as an alternative to bypass BP; we refer to these networks as "layer-wise" networks. We continue the work on layer-wise networks by answering two…

    Submitted 7 February, 2022; v1 submitted 1 February, 2022; originally announced February 2022.

    Comments: Since this version is similar to an older version, I should have updated the older version instead of creating a new version. I will now retract this version, and update a previous version to this. See arXiv:2006.08539

    Journal ref: AIStats 2022

  9. arXiv:2109.14688  [pdf, other]

    cs.LG stat.ML

    Reliable Estimation of KL Divergence using a Discriminator in Reproducing Kernel Hilbert Space

    Authors: Sandesh Ghimire, Aria Masoomi, Jennifer Dy

    Abstract: Estimating Kullback Leibler (KL) divergence from samples of two distributions is essential in many machine learning problems. Variational methods using neural network discriminator have been proposed to achieve this task in a scalable manner. However, we noted that most of these methods using neural network discriminators suffer from high fluctuations (variance) in estimates and instability in tra…

    Submitted 29 September, 2021; originally announced September 2021.

    Comments: 27 pages, 3 figures. arXiv admin note: text overlap with arXiv:2002.11187

    Journal ref: Advances in Neural Information Processing Systems 2021

  10. arXiv:2106.07035  [pdf, other]

    cs.LG stat.ML

    Deep Bayesian Unsupervised Lifelong Learning

    Authors: Tingting Zhao, Zifeng Wang, Aria Masoomi, Jennifer Dy

    Abstract: Lifelong Learning (LL) refers to the ability to continually learn and solve new problems with incremental available information over time while retaining previous knowledge. Much attention has been given lately to Supervised Lifelong Learning (SLL) with a stream of labelled data. In contrast, we focus on resolving challenges in Unsupervised Lifelong Learning (ULL) with streaming unlabelled data wh…

    Submitted 13 June, 2021; originally announced June 2021.

  11. arXiv:2106.02734  [pdf, other]

    cs.LG

    Revisiting Hilbert-Schmidt Information Bottleneck for Adversarial Robustness

    Authors: Zifeng Wang, Tong Jian, Aria Masoomi, Stratis Ioannidis, Jennifer Dy

    Abstract: We investigate the HSIC (Hilbert-Schmidt independence criterion) bottleneck as a regularizer for learning an adversarially robust deep neural network classifier. In addition to the usual cross-entropy loss, we add regularization terms for every intermediate layer to ensure that the latent representations retain useful information for output prediction while reducing redundant information. We show…

    Submitted 25 October, 2021; v1 submitted 4 June, 2021; originally announced June 2021.

    Comments: Published as a conference paper at NeurIPS 2021

  12. arXiv:2011.03320  [pdf, ps, other]

    cs.LG stat.ML

    Kernel Dependence Network

    Authors: Chieh Wu, Aria Masoomi, Arthur Gretton, Jennifer Dy

    Abstract: We propose a greedy strategy to spectrally train a deep network for multi-class classification. Each layer is defined as a composition of linear weights with the feature map of a Gaussian kernel acting as the activation function. At each layer, the linear weights are learned by maximizing the dependence between the layer output and the labels using the Hilbert Schmidt Independence Criterion (HSIC)…

    Submitted 9 November, 2020; v1 submitted 4 November, 2020; originally announced November 2020.

    Comments: arXiv admin note: substantial text overlap with arXiv:2006.08539

    Journal ref: NeurIPS2020 Workshop (Beyond Backprop)

  13. arXiv:2006.08539  [pdf, other]

    stat.ML cs.LG

    Deep Layer-wise Networks Have Closed-Form Weights

    Authors: Chieh Wu, Aria Masoomi, Arthur Gretton, Jennifer Dy

    Abstract: There is currently a debate within the neuroscience community over the likelihood of the brain performing backpropagation (BP). To better mimic the brain, training a network $\textit{one layer at a time}$ with only a "single forward pass" has been proposed as an alternative to bypass BP; we refer to these networks as "layer-wise" networks. We continue the work on layer-wise networks by answering t…

    Submitted 9 February, 2022; v1 submitted 15 June, 2020; originally announced June 2020.

    Comments: This version will be published in AIStats 2022

  14. arXiv:1906.03288  [pdf, other]

    stat.ML cs.LG

    Streaming Adaptive Nonparametric Variational Autoencoder

    Authors: Tingting Zhao, Zifeng Wang, Aria Masoomi, Jennifer G. Dy

    Abstract: We develop a data driven approach to perform clustering and end-to-end feature learning simultaneously for streaming data that can adaptively detect novel clusters in emerging data. Our approach, Adaptive Nonparametric Variational Autoencoder (AdapVAE), learns the cluster membership through a Bayesian Nonparametric (BNP) modeling framework with Deep Neural Networks (DNNs) for feature learning. We…

    Submitted 11 October, 2019; v1 submitted 7 June, 2019; originally announced June 2019.