Search | arXiv e-print repository

Improving the interpretability of GNN predictions through conformal-based graph sparsification

Authors: Pablo Sanchez-Martin, Kinaan Aamir Khan, Isabel Valera

Abstract: Graph Neural Networks (GNNs) have achieved state-of-the-art performance in solving graph classification tasks. However, most GNN architectures aggregate information from all nodes and edges in a graph, regardless of their relevance to the task at hand, thus hindering the interpretability of their predictions. In contrast to prior work, in this paper we propose a GNN training approach that jointly i) finds the most predictive subgraph by removing edges and/or nodes (without making assumptions about the subgraph structure) while ii) optimizing the performance of the graph classification task. To that end, we rely on reinforcement learning to solve the resulting bi-level optimization with a reward function based on conformal predictions to account for the current in-training uncertainty of the classifier. Our empirical results on nine different graph classification datasets show that our method competes in performance with baselines while relying on significantly sparser subgraphs, leading to more interpretable GNN-based predictions.

Submitted 18 April, 2024; originally announced April 2024.

arXiv:2403.17776

Exploring the Boundaries of Ambient Awareness in Twitter

Authors: Pablo Sanchez-Martin, Sonja Utz, Isabel Valera

Abstract: Ambient awareness refers to the ability of social media users to obtain knowledge about who knows what (i.e., users' expertise) in their network, by simply being exposed to other users' content (e.g., tweets on Twitter). Previous work, based on user surveys, reveals that individuals self-report ambient awareness only for parts of their networks. However, it is unclear whether it is their limited cognitive capacity or the limited exposure to diagnostic tweets (i.e., online content) that prevents people from developing ambient awareness for their complete network. In this work, we focus on in-wall ambient awareness (IWAA) in Twitter and conduct a two-step data-driven analysis, that allows us to explore to which extent IWAA is likely, or even possible. First, we rely on reactions (e.g., likes), as strong evidence of users being aware of experts in Twitter. Unfortunately, such strong evidence can be only measured for active users, which represent the minority in the network. Thus to study the boundaries of IWAA to a larger extent, in the second part of our analysis, we instead focus on the passive exposure to content generated by other users -- which we refer to as in-wall visibility. This analysis shows that (in line with \citet{levordashka2016ambient}) only for a subset of users IWAA is plausible, while for the majority it is unlikely, if even possible, to develop IWAA. We hope that our methodology paves the way for the emergence of data-driven approaches for the study of ambient awareness.

Submitted 26 March, 2024; originally announced March 2024.

arXiv:2306.05415

Causal normalizing flows: from theory to practice

Authors: Adrián Javaloy, Pablo Sánchez-Martín, Isabel Valera

Abstract: In this work, we deepen on the use of normalizing flows for causal reasoning. Specifically, we first leverage recent results on non-linear ICA to show that causal models are identifiable from observational data given a causal ordering, and thus can be recovered using autoregressive normalizing flows (NFs). Second, we analyze different design and learning choices for causal normalizing flows to capture the underlying causal data-generating process. Third, we describe how to implement the do-operator in causal NFs, and thus, how to answer interventional and counterfactual questions. Finally, in our experiments, we validate our design and training choices through a comprehensive ablation study; compare causal NFs to other approaches for approximating causal models; and empirically demonstrate that causal NFs can be used to address real-world problems, where the presence of mixed discrete-continuous data and partial knowledge on the causal graph is the norm. The code for this work can be found at https://github.com/psanch21/causal-flows.

Submitted 8 December, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

Comments: 32 pages, 15 figures. Accepted as an Oral presentation at NeurIPS 2023

arXiv:2302.06223

Variational Mixture of HyperGenerators for Learning Distributions Over Functions

Authors: Batuhan Koyuncu, Pablo Sanchez-Martin, Ignacio Peis, Pablo M. Olmos, Isabel Valera

Abstract: Recent approaches build on implicit neural representations (INRs) to propose generative models over function spaces. However, they are computationally costly when dealing with inference tasks, such as missing data imputation, or directly cannot tackle them. In this work, we propose a novel deep generative model, named VAMoH. VAMoH combines the capabilities of modeling continuous functions using INRs and the inference capabilities of Variational Autoencoders (VAEs). In addition, VAMoH relies on a normalizing flow to define the prior, and a mixture of hypernetworks to parametrize the data log-likelihood. This gives VAMoH a high expressive capability and interpretability. Through experiments on a diverse range of data types, such as images, voxels, and climate data, we show that VAMoH can effectively learn rich distributions over continuous functions. Furthermore, it can perform inference-related tasks, such as conditional super-resolution generation and in-painting, as well or better than previous approaches, while being less computationally demanding.

Submitted 20 July, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

Comments: Accepted at ICML 2023. Camera ready version

arXiv:2211.11853

Learnable Graph Convolutional Attention Networks

Authors: Adrián Javaloy, Pablo Sanchez-Martin, Amit Levi, Isabel Valera

Abstract: Existing Graph Neural Networks (GNNs) compute the message exchange between nodes by either aggregating uniformly (convolving) the features of all the neighboring nodes, or by applying a non-uniform score (attending) to the features. Recent works have shown the strengths and weaknesses of the resulting GNN architectures, respectively, GCNs and GATs. In this work, we aim at exploiting the strengths of both approaches to their full extent. To this end, we first introduce the graph convolutional attention layer (CAT), which relies on convolutions to compute the attention scores. Unfortunately, as in the case of GCNs and GATs, we show that there exists no clear winner between the three (neither theoretically nor in practice) as their performance directly depends on the nature of the data (i.e., of the graph and features). This result brings us to the main contribution of our work, the learnable graph convolutional attention network (L-CAT): a GNN architecture that automatically interpolates between GCN, GAT and CAT in each layer, by adding only two scalar parameters. Our results demonstrate that L-CAT is able to efficiently combine different GNN layers along the network, outperforming competing methods in a wide range of datasets, and resulting in a more robust model that reduces the need of cross-validating.

Submitted 28 February, 2023; v1 submitted 21 November, 2022; originally announced November 2022.

Comments: Published as a conference paper at ICLR 2023. 35 pages, 5 figures

arXiv:2110.14690

VACA: Design of Variational Graph Autoencoders for Interventional and Counterfactual Queries

Authors: Pablo Sanchez-Martin, Miriam Rateike, Isabel Valera

Abstract: In this paper, we introduce VACA, a novel class of variational graph autoencoders for causal inference in the absence of hidden confounders, when only observational data and the causal graph are available. Without making any parametric assumptions, VACA mimics the necessary properties of a Structural Causal Model (SCM) to provide a flexible and practical framework for approximating interventions (do-operator) and abduction-action-prediction steps. As a result, and as shown by our empirical results, VACA accurately approximates the interventional and counterfactual distributions on diverse SCMs. Finally, we apply VACA to evaluate counterfactual fairness in fair classification problems, as well as to learn fair classifiers without compromising performance.

Submitted 27 October, 2021; originally announced October 2021.

arXiv:1911.01425

Improved BiGAN training with marginal likelihood equalization

Authors: Pablo Sánchez-Martín, Pablo M. Olmos, Fernando Perez-Cruz

Abstract: We propose a novel training procedure for improving the performance of generative adversarial networks (GANs), especially to bidirectional GANs. First, we enforce that the empirical distribution of the inverse inference network matches the prior distribution, which favors the generator network reproducibility on the seen samples. Second, we have found that the marginal log-likelihood of the samples shows a severe overrepresentation of a certain type of samples. To address this issue, we propose to train the bidirectional GAN using a non-uniform sampling for the mini-batch selection, resulting in improved quality and variety in generated samples measured quantitatively and by visual inspection. We illustrate our new procedure with the well-known CIFAR10, Fashion MNIST and CelebA datasets.

Submitted 23 May, 2020; v1 submitted 4 November, 2019; originally announced November 2019.

arXiv:1901.09557

Out-of-Sample Testing for GANs

Authors: Pablo Sánchez-Martín, Pablo M. Olmos, Fernando Pérez-Cruz

Abstract: We propose a new method to evaluate GANs, namely EvalGAN. EvalGAN relies on a test set to directly measure the reconstruction quality in the original sample space (no auxiliary networks are necessary), and it also computes the (log)likelihood for the reconstructed samples in the test set. Further, EvalGAN is agnostic to the GAN algorithm and the dataset. We decided to test it on three state-of-the-art GANs over the well-known CIFAR-10 and CelebA datasets.

Submitted 28 January, 2019; originally announced January 2019.

Showing 1–8 of 8 results for author: Sanchez-Martin, P