Neural Networks with Causal Graph Constraints: A New Approach for Treatment Effects Estimation

Abstract: In recent years, there has been a growing interest in using machine learning techniques for the estimation of treatment effects. Most of the best-performing methods rely on representation learning strategies that encourage shared behavior among potential outcomes to increase the precision of treatment effect estimates. In this paper we discuss and classify these models in terms of their algorithmic inductive biases and present a new model, NN-CGC, that considers additional information from the causal graph. NN-CGC tackles bias resulting from spurious variable interactions by implementing novel constraints on models, and it can be integrated with other representation learning methods. We test the effectiveness of our method using three different base models on common benchmarks. Our results indicate that our model constraints lead to significant improvements, achieving new state-of-the-art results in treatment effects estimation. We also show that our method is robust to imperfect causal graphs and that using partial causal information is preferable to ignoring it.

Submitted 18 April, 2024; originally announced April 2024.

arXiv:2309.02139 [pdf, other]

doi 10.23919/MVA57639.2023.10216191

Self-Supervised Pre-Training Boosts Semantic Scene Segmentation on LiDAR Data

Authors: Mariona Carós, Ariadna Just, Santi Seguí, Jordi Vitrià

Abstract: Airborne LiDAR systems have the capability to capture the Earth's surface by generating extensive point cloud data comprised of points mainly defined by 3D coordinates. However, labeling such points for supervised learning tasks is time-consuming. As a result, there is a need to investigate techniques that can learn from unlabeled data to significantly reduce the number of annotated samples. In this work, we propose to train a self-supervised encoder with Barlow Twins and use it as a pre-trained network in the task of semantic scene segmentation. The experimental results demonstrate that our unsupervised pre-training boosts performance once fine-tuned on the supervised task, especially for under-represented categories.

Submitted 22 December, 2023; v1 submitted 5 September, 2023; originally announced September 2023.

Comments: International conference Machine Vision Applications 2023

arXiv:2210.16081 [pdf, other]

doi 10.3233/FAIA220347

Object Segmentation of Cluttered Airborne LiDAR Point Clouds

Authors: Mariona Caros, Ariadna Just, Santi Segui, Jordi Vitria

Abstract: Airborne topographic LiDAR is an active remote sensing technology that emits near-infrared light to map objects on the Earth's surface. Derived products of LiDAR are suitable to service a wide range of applications because of their rich three-dimensional spatial information and their capacity to obtain multiple returns. However, processing point cloud data still requires a significant effort in manual editing. Certain human-made objects are difficult to detect because of their variety of shapes, irregularly-distributed point clouds, and low number of class samples. In this work, we propose an efficient end-to-end deep learning framework to automatize the detection and segmentation of objects defined by an arbitrary number of LiDAR points surrounded by clutter. Our method is based on a light version of PointNet that achieves good performance on both object recognition and segmentation tasks. The results are tested against manually delineated power transmission towers and show promising accuracy.

Submitted 6 February, 2023; v1 submitted 28 October, 2022; originally announced October 2022.

Comments: proceedings of the 24th International Conference of the Catalan Association for Artificial Intelligence (CCIA 2022)

Journal ref: Artificial Intelligence Research and Development. 356 (2022) 259-268

arXiv:2204.09773 [pdf, other]

doi 10.1016/j.compbiomed.2022.105631

Time-based Self-supervised Learning for Wireless Capsule Endoscopy

Authors: Guillem Pascual, Pablo Laiz, Albert García, Hagen Wenzek, Jordi Vitrià, Santi Seguí

Abstract: State-of-the-art machine learning models, and especially deep learning ones, are significantly data-hungry; they require vast amounts of manually labeled samples to function correctly. However, in most medical imaging fields, obtaining said data can be challenging. Not only the volume of data is a problem, but also the imbalances within its classes; it is common to have many more images of healthy patients than of those with pathology. Computer-aided diagnostic systems suffer from these issues, usually over-designing their models to perform accurately. This work proposes using self-supervised learning for wireless endoscopy videos by introducing a custom-tailored method that does not initially need labels or appropriate balance. We prove that using the inferred inherent structure learned by our method, extracted from the temporal axis, improves the detection rate on several domain-specific applications even under severe imbalance.

Submitted 20 April, 2022; originally announced April 2022.

arXiv:2201.12848 [pdf, other]

Deep Non-Crossing Quantiles through the Partial Derivative

Authors: Axel Brando, Joan Gimeno, Jose A. Rodríguez-Serrano, Jordi Vitrià

Abstract: Quantile Regression (QR) provides a way to approximate a single conditional quantile. To have a more informative description of the conditional distribution, QR can be merged with deep learning techniques to simultaneously estimate multiple quantiles. However, the minimisation of the QR-loss function does not guarantee non-crossing quantiles, which affects the validity of such predictions and introduces a critical issue in certain scenarios. In this article, we propose a generic deep learning algorithm for predicting an arbitrary number of quantiles that ensures the quantile monotonicity constraint up to the machine precision and maintains its modelling performance with respect to alternative models. The presented method is evaluated over several real-world datasets obtaining state-of-the-art results as well as showing that it scales to large-size data sets.

Submitted 30 January, 2022; originally announced January 2022.

Comments: In the Proceedings of the 25th International Conference on Artificial Intelligence and Statistics (AISTATS)
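
As a hedged illustration of the QR loss the abstract above refers to, the sketch below computes the standard pinball (quantile) loss for a single observation and checks whether a list of predicted quantiles crosses. The names `pinball_loss` and `has_crossing` are hypothetical and do not come from the paper, and the paper's own mechanism for enforcing monotonicity is not reproduced here.

```python
def pinball_loss(y, q, tau):
    """Standard pinball loss for a target y, a predicted quantile q, and a level tau in (0, 1)."""
    diff = y - q
    # Under-prediction (diff > 0) is weighted by tau, over-prediction by (1 - tau).
    return max(tau * diff, (tau - 1.0) * diff)


def has_crossing(quantile_preds):
    """Return True if predictions, ordered by increasing quantile level, ever decrease."""
    return any(lo > hi for lo, hi in zip(quantile_preds, quantile_preds[1:]))


if __name__ == "__main__":
    # Illustrative values only: predicted 0.9-quantile below and above the target.
    print(pinball_loss(10.0, 8.0, 0.9))
    print(pinball_loss(10.0, 12.0, 0.9))
    # Predictions for levels 0.1, 0.5, 0.9; the second list crosses.
    print(has_crossing([1.0, 2.0, 3.0]))
    print(has_crossing([1.0, 2.0, 1.5]))
```

Minimising this loss independently for each quantile level places no ordering constraint on the outputs, which is why a check such as `has_crossing` can come back true for an unconstrained model.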

arXiv:2107.04632 [pdf, other]

Algorithmic Causal Effect Identification with causaleffect

Authors: Martí Pedemonte, Jordi Vitrià, Álvaro Parafita

Abstract: Our evolution as a species made a huge step forward when we understood the relationships between causes and effects. These associations may be trivial for some events, but they are not in complex scenarios. To rigorously prove that some occurrences are caused by others, causal theory and causal inference were formalized, introducing the $do$-operator and its associated rules. The main goal of this report is to review and implement in Python some algorithms to compute conditional and non-conditional causal queries from observational data. To this end, we first present some basic background knowledge on probability and graph theory, before introducing important results on causal theory, used in the construction of the algorithms. We then thoroughly study the identification algorithms presented by Shpitser and Pearl in 2006, explaining our implementation in Python alongside. The main identification algorithm can be seen as a repeated application of the rules of $do$-calculus, and it eventually either returns an expression for the causal query from experimental probabilities or fails to identify the causal effect, in which case the effect is non-identifiable. We introduce our newly developed Python library and give some usage examples.

Submitted 9 July, 2021; originally announced July 2021.

Comments: 40 pages, 27 figures

MSC Class: 62D20 (Primary); 62H22 (Secondary)

ACM Class: G.3; G.4

arXiv:2103.03587 [pdf, other]

Graph Convolutional Embeddings for Recommender Systems

Authors: Paula Gómez Duran, Alexandros Karatzoglou, Jordi Vitrià, Xin Xin, Ioannis Arapakis

Abstract: Modern recommender systems (RS) work by processing a number of signals that can be inferred from large sets of user-item interaction data. The main signal to analyze stems from the raw matrix that represents interactions. However, we can increase the performance of RS by considering other kinds of signals like the context of interactions, which could be, for example, the time or date of the interaction, the user location, or sequential data corresponding to the historical interactions of the user with the system. These complex, context-based interaction signals are characterized by a rich relational structure that can be represented by a multi-partite graph. Graph Convolutional Networks (GCNs) have been used successfully in collaborative filtering with simple user-item interaction data. In this work, we generalize the use of GCNs for N-partite graphs by considering N multiple context dimensions and propose a simple way for their seamless integration in modern deep learning RS architectures. More specifically, we define a graph convolutional embedding layer for N-partite graphs that processes user-item-context interactions, and constructs node embeddings by leveraging their relational structure. Experiments on several datasets from recommender systems to drug re-purposing show the benefits of the introduced GCN embedding layer by measuring the performance of different context-enriched tasks.

Submitted 5 March, 2021; originally announced March 2021.

Comments: 10 pages, 4 figures, SIGIR July 2021

arXiv:2006.08380 [pdf, other]

Causal Inference with Deep Causal Graphs

Authors: Álvaro Parafita, Jordi Vitrià

Abstract: Parametric causal modelling techniques rarely provide functionality for counterfactual estimation, often at the expense of modelling complexity. Since causal estimations depend on the family of functions used to model the data, simplistic models could entail imprecise characterizations of the generative mechanism, and, consequently, unreliable results. This limits their applicability to real-life datasets, with non-linear relationships and high interaction between variables. We propose Deep Causal Graphs, an abstract specification of the required functionality for a neural network to model causal distributions, and provide a model that satisfies this contract: Normalizing Causal Flows. We demonstrate its expressive power in modelling complex interactions and showcase applications of the method to machine learning explainability and fairness, using true causal counterfactuals.

Submitted 15 June, 2020; originally announced June 2020.

Comments: Supplementary material can be found in https://github.com/aparafita/dcg-paper

arXiv:1912.12628 [pdf, other]

Dirichlet uncertainty wrappers for actionable algorithm accuracy accountability and auditability

Authors: José Mena, Oriol Pujol, Jordi Vitrià

Abstract: Nowadays, the use of machine learning models is becoming a utility in many applications. Companies deliver pre-trained models encapsulated as application programming interfaces (APIs) that developers combine with third party components and their own models and data to create complex data products to solve specific problems. The complexity of such products and the lack of control and knowledge of the internals of each component used cause unavoidable effects, such as lack of transparency, difficulty in auditability, and emergence of potential uncontrolled risks. They are effectively black-boxes. Accountability of such solutions is a challenge for the auditors and the machine learning community. In this work, we propose a wrapper that given a black-box model enriches its output prediction with a measure of uncertainty. By using this wrapper, we make the black-box auditable for the accuracy risk (risk derived from low quality or uncertain decisions) and at the same time we provide an actionable mechanism to mitigate that risk in the form of decision rejection; we can choose not to issue a prediction when the risk or uncertainty in that decision is significant. Based on the resulting uncertainty measure, we advocate for a rejection system that selects the more confident predictions, discarding those more uncertain, leading to an improvement in the trustability of the resulting system.
We showcase the proposed technique and methodology in a practical scenario where a simulated sentiment analysis API based on natural language processing is applied to different domains. Results demonstrate the effectiveness of the uncertainty computed by the wrapper and its high correlation to bad quality predictions and misclassifications.

Submitted 29 December, 2019; originally announced December 2019.

Comments: 13 pages, 5 figures and 1 table

MSC Class: 68T37

ACM Class: I.2.7; I.2.3

arXiv:1912.04643 [pdf, other]

WCE Polyp Detection with Triplet based Embeddings

Authors: Pablo Laiz, Jordi Vitrià, Hagen Wenzek, Carolina Malagelada, Fernando Azpiroz, Santi Seguí

Abstract: Wireless capsule endoscopy is a medical procedure used to visualize the entire gastrointestinal tract and to diagnose intestinal conditions, such as polyps or bleeding. Current analyses are performed by manually inspecting nearly every frame of the video, a tedious and error-prone task. Automatic image analysis methods can be used to reduce the time needed for physicians to evaluate a capsule endoscopy video; however, these methods are still in a research phase. In this paper we focus on computer-aided polyp detection in capsule endoscopy images. This is a challenging problem because of the diversity of polyp appearance, the imbalanced dataset structure and the scarcity of data. We have developed a new polyp computer-aided decision system that combines a deep convolutional neural network and metric learning. The key point of the method is the use of the triplet loss function with the aim of improving feature extraction from the images when the dataset is small. The triplet loss function makes it possible to train robust detectors by forcing images from the same category to be represented by similar embedding vectors while ensuring that images from different categories are represented by dissimilar vectors. Empirical results show a meaningful increase of AUC values compared to baseline methods. Good performance is not the only requirement when considering the adoption of this technology in clinical practice. Trust and explainability of decisions are as important as performance.
With this purpose, we also provide a method to generate visual explanations of the outcome of our polyp detector. These explanations can be used to build a physician's trust in the system and also to convey information about the inner working of the method to the designer for debugging purposes.

Submitted 2 October, 2020; v1 submitted 10 December, 2019; originally announced December 2019.

Comments: 19 pages, 13 figures, 9 tables, Accepted in Computerized Medical Imaging and Graphics
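
The triplet loss mentioned in the abstract above can be sketched as follows. This is a minimal, generic formulation assuming Euclidean distance and a margin of 1.0; the function name `triplet_loss`, the margin value, and the toy embeddings are assumptions for illustration, not details taken from the paper.

```python
import math


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet loss on embedding vectors given as sequences of floats.

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin`; otherwise it grows with the violation.
    """
    d_pos = math.dist(anchor, positive)  # distance to a same-category sample
    d_neg = math.dist(anchor, negative)  # distance to a different-category sample
    return max(0.0, d_pos - d_neg + margin)


if __name__ == "__main__":
    anchor = [0.0, 0.0]
    positive = [0.0, 1.0]
    # A far-away negative already satisfies the margin.
    print(triplet_loss(anchor, positive, [3.0, 4.0]))
    # A negative as close as the positive is penalised by the full margin.
    print(triplet_loss(anchor, positive, [1.0, 0.0]))
```

Minimising this quantity over many triplets pulls same-category embeddings together and pushes different-category embeddings apart, which is the behaviour the abstract describes.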

arXiv:1910.12288 [pdf, other]

Modelling heterogeneous distributions with an Uncountable Mixture of Asymmetric Laplacians

Authors: Axel Brando, Jose A. Rodríguez-Serrano, Jordi Vitrià, Alberto Rubio

Abstract: In regression tasks, aleatoric uncertainty is commonly addressed by considering a parametric distribution of the output variable, which is based on strong assumptions such as symmetry, unimodality or by supposing a restricted shape. These assumptions are too limited in scenarios where complex shapes, strong skews or multiple modes are present. In this paper, we propose a generic deep learning framework that learns an Uncountable Mixture of Asymmetric Laplacians (UMAL), which will allow us to estimate heterogeneous distributions of the output variable and shows its connections to quantile regression. Despite having a fixed number of parameters, the model can be interpreted as an infinite mixture of components, which yields a flexible approximation for heterogeneous distributions. Apart from synthetic cases, we apply this model to room price forecasting and to predict financial operations in personal bank accounts. We demonstrate that UMAL produces proper distributions, which allows us to extract richer insights and to sharpen decision-making.

Submitted 29 October, 2019; v1 submitted 27 October, 2019; originally announced October 2019.

Comments: 12 pages, 4 figures, Paper accepted as poster at the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada

arXiv:1909.08891 [pdf, other]

Explaining Visual Models by Causal Attribution

Authors: Álvaro Parafita, Jordi Vitrià

Abstract: Model explanations based on pure observational data cannot compute the effects of features reliably, due to their inability to estimate how each factor alteration could affect the rest. We argue that explanations should be based on the causal model of the data and the derived intervened causal models, that represent the data distribution subject to interventions. With these models, we can compute counterfactuals, new samples that will inform us how the model reacts to feature changes on our input. We propose a novel explanation methodology based on Causal Counterfactuals and identify the limitations of current Image Generative Models in their application to counterfactual creation.

Submitted 19 September, 2019; originally announced September 2019.

Comments: 2019 ICCV Workshop on Interpreting and Explaining Visual Artificial Intelligence Models

arXiv:1807.09011 [pdf, other]

Uncertainty Modelling in Deep Networks: Forecasting Short and Noisy Series

Authors: Axel Brando, Jose A. Rodríguez-Serrano, Mauricio Ciprian, Roberto Maestre, Jordi Vitrià

Abstract: Deep Learning is a consolidated, state-of-the-art Machine Learning tool to fit a function when provided with large data sets of examples. However, in regression tasks, the straightforward application of Deep Learning models provides a point estimate of the target. In addition, the model does not take into account the uncertainty of a prediction. This represents a great limitation for tasks where communicating an erroneous prediction carries a risk. In this paper we tackle a real-world problem of forecasting impending financial expenses and incomings of customers, while displaying predictable monetary amounts on a mobile app. In this context, we investigate if we would obtain an advantage by applying Deep Learning models with a Heteroscedastic model of the variance of a network's output. Experimentally, we achieve a higher accuracy than non-trivial baselines. More importantly, we introduce a mechanism to discard low-confidence predictions, which means that they will not be visible to users. This should help enhance the user experience of our product.

Submitted 24 July, 2018; originally announced July 2018.

Comments: 17 pages, 5 figures, Applied Data Science Track of The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2018)

arXiv:1805.11348 [pdf, other]

Uncertainty Gated Network for Land Cover Segmentation

Authors: Guillem Pascual, Santi Seguí, Jordi Vitrià

Abstract: The production of thematic maps depicting land cover is one of the most common applications of remote sensing. To this end, several semantic segmentation approaches, based on deep learning, have been proposed in the literature, but land cover segmentation is still considered an open problem due to some specific problems related to remote sensing imaging. In this paper we propose a novel approach to deal with the problem of modelling multiscale contexts surrounding pixels of different land cover categories. The approach leverages the computation of a heteroscedastic measure of uncertainty when classifying individual pixels in an image. This classification uncertainty measure is used to define a set of memory gates between layers that allow a principled method to select the optimal decision for each pixel.

Submitted 29 May, 2018; originally announced May 2018.

Comments: Accepted in CVPR18 workshop: "DeepGlobe: A Challenge for Parsing the Earth through Satellite Images"

arXiv:1607.07604 [pdf, other]

Generic Feature Learning for Wireless Capsule Endoscopy Analysis

Authors: Santi Seguí, Michal Drozdzal, Guillem Pascual, Petia Radeva, Carolina Malagelada, Fernando Azpiroz, Jordi Vitrià

Abstract: The interpretation and analysis of the wireless capsule endoscopy recording is a complex task which requires sophisticated computer aided decision (CAD) systems in order to help physicians with the video screening and, finally, with the diagnosis. Most of the CAD systems in capsule endoscopy share a common system design, but use very different image and video representations. As a result, each time a new clinical application of WCE appears, a new CAD system has to be designed from scratch. This characteristic makes the design of new CAD systems very time-consuming. Therefore, in this paper we introduce a system for small intestine motility characterization, based on Deep Convolutional Neural Networks, which avoids the laborious step of designing specific features for individual motility events. Experimental results show the superiority of the learned features over alternative classifiers constructed by using state-of-the-art hand-crafted features. In particular, it reaches a mean classification accuracy of 96% for six intestinal motility events, outperforming the other classifiers by a large margin (a 14% relative performance increase).

Submitted 26 July, 2016; originally announced July 2016.

arXiv:1505.08082 [pdf, other]

Learning to count with deep object features

Authors: Santi Seguí, Oriol Pujol, Jordi Vitrià

Abstract: Learning to count is a learning strategy that has been recently proposed in the literature for dealing with problems where estimating the number of object instances in a scene is the final objective. In this framework, the task of learning to detect and localize individual object instances is seen as a harder task that can be evaded by casting the problem as that of computing a regression value from hand-crafted image features. In this paper we explore the features that are learned when training a counting convolutional neural network in order to understand their underlying representation. To this end we define a counting problem for MNIST data and show that the internal representation of the network is able to classify digits in spite of the fact that no direct supervision was provided for them during training. We also present preliminary results about a deep network that is able to count the number of pedestrians in a scene.

Submitted 29 May, 2015; originally announced May 2015.

Comments: This paper has been accepted at Deep Vision Workshop at CVPR 2015

Showing 1–16 of 16 results for author: Vitrià, J