Learning to Transfer with von Neumann Conditional Divergence

Authors: Ammar Shaker, Shujian Yu, Daniel Oñoro-Rubio

Abstract: The similarity of feature representations plays a pivotal role in the success of problems related to domain adaptation. Feature similarity includes both the invariance of marginal distributions and the closeness of conditional distributions given the desired response $y$ (e.g., class labels). Unfortunately, traditional methods always learn such features without fully taking into consideration the information in $y$, which in turn may lead to a mismatch of the conditional distributions or the mix-up of discriminative structures underlying data distributions. In this work, we introduce the recently proposed von Neumann conditional divergence to improve the transferability across multiple domains. We show that this new divergence is differentiable and eligible to easily quantify the functional dependence between features and $y$. Given multiple source tasks, we integrate this divergence to capture discriminative information in $y$ and design novel learning objectives assuming those source tasks are observed either simultaneously or sequentially. In both scenarios, we obtain favorable performance against state-of-the-art methods in terms of smaller generalization error on new tasks and less catastrophic forgetting on source tasks (in the sequential setup).

Submitted 6 January, 2022; v1 submitted 7 August, 2021; originally announced August 2021.

Comments: Accepted at AAAI2022

arXiv:2103.06220

A Relational-learning Perspective to Multi-label Chest X-ray Classification

Authors: Anjany Sekuboyina, Daniel Oñoro-Rubio, Jens Kleesiek, Brandon Malone

Abstract: Multi-label classification of chest X-ray images is frequently performed using discriminative approaches, i.e. learning to map an image directly to its binary labels. Such approaches make it challenging to incorporate auxiliary information such as annotation uncertainty or a dependency among the labels. Building towards this, we propose a novel knowledge graph reformulation of multi-label classification, which not only readily increases predictive performance of an encoder but also serves as a general framework for introducing new domain knowledge. Specifically, we construct a multi-modal knowledge graph out of the chest X-ray images and their labels and pose multi-label classification as a link prediction problem. Incorporating auxiliary information can then simply be achieved by adding additional nodes and relations among them. When tested on a publicly available radiograph dataset (CheXpert), our relational reformulation using a naive knowledge graph outperforms the state of the art by achieving an area-under-ROC curve of 83.5%, an improvement of $\sim$1% over a purely discriminative approach.

Submitted 10 March, 2021; originally announced March 2021.

arXiv:1903.05485

MMKG: Multi-Modal Knowledge Graphs

Authors: Ye Liu, Hui Li, Alberto Garcia-Duran, Mathias Niepert, Daniel Onoro-Rubio, David S. Rosenblum

Abstract: We present MMKG, a collection of three knowledge graphs that contain both numerical features and (links to) images for all entities as well as entity alignments between pairs of KGs. Therefore, multi-relational link prediction and entity matching communities can benefit from this resource. We believe this data set has the potential to facilitate the development of novel multi-modal learning approaches for knowledge graphs. We validate the utility of MMKG in the sameAs link prediction task with an extensive set of experiments. These experiments show that the task at hand benefits from learning of multiple feature types.

Submitted 13 March, 2019; originally announced March 2019.

Comments: ESWC 2019

arXiv:1806.04009

Contextual Hourglass Networks for Segmentation and Density Estimation

Authors: Daniel Oñoro-Rubio, Mathias Niepert

Abstract: Hourglass networks such as the U-Net and V-Net are popular neural architectures for medical image segmentation and counting problems. Typical instances of hourglass networks contain shortcut connections between mirroring layers. These shortcut connections improve the performance and it is hypothesized that this is due to mitigating effects on the vanishing gradient problem and the ability of the model to combine feature maps from earlier and later layers. We propose a method for not only combining feature maps of mirroring layers but also feature maps of layers with different spatial dimensions. For instance, the method enables the integration of the bottleneck feature map with those of the reconstruction layers. The proposed approach is applicable to any hourglass architecture. We evaluated the contextual hourglass networks on image segmentation and object counting problems in the medical domain. We achieve competitive results outperforming popular hourglass networks by up to 17 percentage points.

Submitted 8 June, 2018; originally announced June 2018.

arXiv:1805.02919

Learning Short-Cut Connections for Object Counting

Authors: Daniel Oñoro-Rubio, Mathias Niepert, Roberto J. López-Sastre

Abstract: Object counting is an important task in computer vision due to its growing demand in applications such as traffic monitoring or surveillance. In this paper, we consider object counting as a learning problem of a joint feature extraction and pixel-wise object density estimation with Convolutional-Deconvolutional networks. We introduce a novel counting model, named Gated U-Net (GU-Net). Specifically, we propose to enrich the U-Net architecture with the concept of learnable short-cut connections. Standard short-cut connections are connections between layers in deep neural networks which skip at least one intermediate layer. Instead of simply setting short-cut connections, we propose to learn these connections from data. Therefore, our short-cuts can work as gating units, which optimize the flow of information between convolutional and deconvolutional layers in the U-Net architecture. We evaluate the introduced GU-Net architecture on three commonly used benchmark data sets for object counting. GU-Nets consistently outperform the base U-Net architecture, and achieve state-of-the-art performance.

Submitted 15 November, 2018; v1 submitted 8 May, 2018; originally announced May 2018.

arXiv:1801.10095

TransRev: Modeling Reviews as Translations from Users to Items

Authors: Alberto Garcia-Duran, Roberto Gonzalez, Daniel Onoro-Rubio, Mathias Niepert, Hui Li

Abstract: The text of a review expresses the sentiment a customer has towards a particular product. This is exploited in sentiment analysis where machine learning models are used to predict the review score from the text of the review. Furthermore, the products customers have purchased in the past are indicative of the products they will purchase in the future. This is what recommender systems exploit by learning models from purchase information to predict the items a customer might be interested in. We propose TransRev, an approach to the product recommendation problem that integrates ideas from recommender systems, sentiment analysis, and multi-relational learning into a joint learning objective. TransRev learns vector representations for users, items, and reviews. The embedding of a review is learned such that (a) it performs well as an input feature of a regression model for sentiment prediction; and (b) it always translates the reviewer embedding to the embedding of the reviewed items. This allows TransRev to approximate a review embedding at test time as the difference of the embedding of each item and the user embedding. The approximated review embedding is then used with the regression model to predict the review score for each item. TransRev outperforms state-of-the-art recommender systems on a large number of benchmark data sets. Moreover, it is able to retrieve, for each user and item, the review text from the training set whose embedding is most similar to the approximated review embedding.

Submitted 18 April, 2018; v1 submitted 30 January, 2018; originally announced January 2018.

arXiv:1801.08110

doi 10.1016/j.imavis.2018.09.013

The challenge of simultaneous object detection and pose estimation: a comparative study

Authors: Daniel Oñoro-Rubio, Roberto J. López-Sastre, Carolina Redondo-Cabrera, Pedro Gil-Jiménez

Abstract: Detecting objects and estimating their pose remains one of the major challenges of the computer vision research community. There exists a compromise between localizing the objects and estimating their viewpoints. The detector ideally needs to be view-invariant, while the pose estimation process should be able to generalize towards the category level. This work is an exploration of using deep learning models for solving both problems simultaneously. To do so, we propose three novel deep learning architectures, which are able to perform joint detection and pose estimation, where we gradually decouple the two tasks. We also investigate whether the pose estimation problem should be solved as a classification or regression problem, which is still an open question in the computer vision community. We detail a comparative analysis of all our solutions and the methods that currently define the state of the art for this problem. We use the PASCAL3D+ and ObjectNet3D datasets to present a thorough experimental evaluation and the main results. With the proposed models we achieve state-of-the-art performance on both datasets.

Submitted 24 January, 2018; originally announced January 2018.

Journal ref: Image and Vision Computing, 2018

arXiv:1709.02314

Answering Visual-Relational Queries in Web-Extracted Knowledge Graphs

Authors: Daniel Oñoro-Rubio, Mathias Niepert, Alberto García-Durán, Roberto González, Roberto J. López-Sastre

Abstract: A visual-relational knowledge graph (KG) is a multi-relational graph whose entities are associated with images. We explore novel machine learning approaches for answering visual-relational queries in web-extracted knowledge graphs. To this end, we have created ImageGraph, a KG with 1,330 relation types, 14,870 entities, and 829,931 images crawled from the web. With visual-relational KGs such as ImageGraph one can introduce novel probabilistic query types in which images are treated as first-class citizens. Both the prediction of relations between unseen images as well as multi-relational image retrieval can be expressed with specific families of visual-relational queries. We introduce novel combinations of convolutional networks and knowledge graph embedding methods to answer such queries. We also explore a zero-shot learning scenario where an image of an entirely new entity is linked with multiple relations to entities of an existing KG. The resulting multi-relational grounding of unseen entity images into a knowledge graph serves as a semantic entity representation. We conduct experiments to demonstrate that the proposed methods can answer these visual-relational queries efficiently and accurately.

Submitted 3 May, 2019; v1 submitted 7 September, 2017; originally announced September 2017.

Journal ref: AKBC2019

Showing 1–8 of 8 results for author: Onoro-Rubio, D