On Leveraging Variational Graph Embeddings for Open World Compositional Zero-Shot Learning

Authors: Muhammad Umer Anwaar, Zhihui Pan, Martin Kleinsteuber

Abstract: Humans are able to identify and categorize novel compositions of known concepts. The task in Compositional Zero-Shot Learning (CZSL) is to learn compositions of primitive concepts, i.e. objects and states, in such a way that even their novel compositions can be zero-shot classified. In this work, we do not assume any prior knowledge of the feasibility of novel compositions, i.e. the open-world setting, where infeasible compositions dominate the search space. We propose a Compositional Variational Graph Autoencoder (CVGAE) approach for learning the variational embeddings of the primitive concepts (nodes) as well as the feasibility of their compositions (via edges). Such modelling makes CVGAE scalable to real-world application scenarios. This is in contrast to the SOTA method, CGE, which is computationally very expensive: e.g. for the benchmark C-GQA dataset, CGE requires 3.94 x 10^5 nodes, whereas CVGAE requires only 1323 nodes. We learn a mapping of the graph and image embeddings onto a common embedding space. CVGAE adopts a deep metric learning approach and learns a similarity metric in this space via a bi-directional contrastive loss between projected graph and image embeddings. We validate the effectiveness of our approach on three benchmark datasets. We also demonstrate via an image retrieval task that the representations learnt by CVGAE are better suited for compositional generalization.

Submitted 23 April, 2022; originally announced April 2022.

Comments: Submitted to a conference
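
The bi-directional contrastive loss mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a generic symmetric, InfoNCE-style formulation in which row i of the image batch is paired with row i of the projected graph (composition) embeddings, and the function names and the temperature value are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit L2 norm so that dot() gives cosine similarity."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct column for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def bidirectional_contrastive_loss(image_emb, graph_emb, temperature=0.1):
    """Average of image-to-graph and graph-to-image contrastive losses.

    Assumption: image_emb[i] and graph_emb[i] form the matching pair, and
    both already live in the common embedding space.
    """
    imgs = [normalize(v) for v in image_emb]
    comps = [normalize(v) for v in graph_emb]
    logits = [[dot(i, c) / temperature for c in comps] for i in imgs]
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits)
                  + cross_entropy_diagonal(transposed))
```

With matching pairs on the diagonal the loss is close to zero, and it grows when the pairing is shuffled, which is the behaviour a similarity metric in the shared space is trained towards.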

arXiv:2101.03885 [pdf, other]

Variational Embeddings for Community Detection and Node Representation

Authors: Rayyan Ahmad Khan, Muhammad Umer Anwaar, Omran Kaddah, Martin Kleinsteuber

Abstract: In this paper, we study how to simultaneously learn two highly correlated tasks of graph analysis, i.e., community detection and node representation learning. We propose an efficient generative model called VECoDeR for jointly learning Variational Embeddings for Community Detection and node Representation. VECoDeR assumes that every node can be a member of one or more communities. The node embeddings are learned in such a way that connected nodes are not only "closer" to each other but also share similar community assignments. A joint learning framework leverages community-aware node embeddings for better community detection. We demonstrate on several graph datasets that VECoDeR effectively outperforms many competitive baselines on all three tasks, i.e. node classification, overlapping community detection and non-overlapping community detection. We also show that VECoDeR is computationally efficient and has quite robust performance with varying hyperparameters.

Submitted 11 January, 2021; originally announced January 2021.

arXiv:2010.11793 [pdf, other]

Metapath- and Entity-aware Graph Neural Network for Recommendation

Authors: Muhammad Umer Anwaar, Zhiwei Han, Shyam Arumugaswamy, Rayyan Ahmad Khan, Thomas Weber, Tianming Qiu, Hao Shen, Yuanting Liu, Martin Kleinsteuber

Abstract: In graph neural networks (GNNs), message passing iteratively aggregates nodes' information from their direct neighbors while neglecting the sequential nature of multi-hop node connections. Such sequential node connections, e.g. metapaths, capture critical insights for downstream tasks. Concretely, in recommender systems (RSs), disregarding these insights leads to inadequate distillation of collaborative signals. In this paper, we employ collaborative subgraphs (CSGs) and metapaths to form metapath-aware subgraphs, which explicitly capture sequential semantics in graph structures. We propose metaPath- and Entity-Aware Graph Neural Network (PEAGNN), which trains multilayer GNNs to perform metapath-aware information aggregation on such subgraphs. This aggregated information from different metapaths is then fused using an attention mechanism. Finally, PEAGNN gives us the representations for node and subgraph, which can be used to train an MLP for predicting the score for target user-item pairs. To leverage the local structure of CSGs, we present entity-awareness that acts as a contrastive regularizer on node embeddings. Moreover, PEAGNN can be combined with prominent layers such as GAT, GCN and GraphSAGE. Our empirical evaluation shows that our proposed technique outperforms competitive baselines on several datasets for recommendation tasks. Further analysis demonstrates that PEAGNN also learns meaningful metapath combinations from a given set of metapaths.

Submitted 1 April, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

arXiv:2006.11149 [pdf, other]

Compositional Learning of Image-Text Query for Image Retrieval

Authors: Muhammad Umer Anwaar, Egor Labintcev, Martin Kleinsteuber

Abstract: In this paper, we investigate the problem of retrieving images from a database based on a multi-modal (image-text) query. Specifically, the query text prompts some modification in the query image and the task is to retrieve images with the desired modifications. For instance, a user of an E-Commerce platform is interested in buying a dress, which should look similar to her friend's dress, but the dress should be of white color with a ribbon sash. In this case, we would like the algorithm to retrieve some dresses with the desired modifications to the query dress. We propose an autoencoder-based model, ComposeAE, to learn the composition of image and text query for retrieving images. We adopt a deep metric learning approach and learn a metric that pushes the composition of source image and text query closer to the target images. We also propose a rotational symmetry constraint on the optimization problem. Our approach is able to outperform the state-of-the-art method TIRG on three benchmark datasets, namely: MIT-States, Fashion200k and Fashion IQ. In order to ensure a fair comparison, we introduce strong baselines by enhancing the TIRG method. To ensure reproducibility of the results, we publish our code here: https://github.com/ecom-research/ComposeAE.

Submitted 31 May, 2021; v1 submitted 19 June, 2020; originally announced June 2020.

Comments: Published at IEEE WACV 2021

arXiv:2004.01468 [pdf, other]

Epitomic Variational Graph Autoencoder

Authors: Rayyan Ahmad Khan, Muhammad Umer Anwaar, Martin Kleinsteuber

Abstract: Variational autoencoder (VAE) is a widely used generative model for learning latent representations. Burda et al. in their seminal paper showed that the learning capacity of VAE is limited by over-pruning. It is a phenomenon where a significant number of latent variables fail to capture any information about the input data and the corresponding hidden units become inactive. This adversely affects learning diverse and interpretable latent representations. As variational graph autoencoder (VGAE) extends VAE for graph-structured data, it inherits the over-pruning problem. In this paper, we adopt a model-based approach and propose epitomic VGAE (EVGAE), a generative variational framework for graph datasets which successfully mitigates the over-pruning problem and also boosts the generative ability of VGAE. We consider EVGAE to consist of multiple sparse VGAE models, called epitomes, that are groups of latent variables sharing the latent space. This approach aids in increasing active units as epitomes compete to learn a better representation of the graph data. We verify our claims via experiments on three benchmark datasets. Our experiments show that EVGAE has a better generative ability than VGAE. Moreover, EVGAE outperforms VGAE on the link prediction task in citation networks.

Submitted 7 August, 2020; v1 submitted 3 April, 2020; originally announced April 2020.

arXiv:1907.10409 [pdf, other]

Mend The Learning Approach, Not the Data: Insights for Ranking E-Commerce Products

Authors: Muhammad Umer Anwaar, Dmytro Rybalko, Martin Kleinsteuber

Abstract: Improved search quality enhances users' satisfaction, which directly impacts sales growth of an E-Commerce (E-Com) platform. Traditional Learning to Rank (LTR) algorithms require relevance judgments on products. In E-Com, getting such judgments poses an immense challenge. In the literature, it is proposed to employ user feedback (such as clicks, add-to-basket (AtB) clicks and orders) to generate relevance judgments. It is done in two steps: first, query-product pair data are aggregated from the logs, and then order rate etc. are calculated for each pair in the logs. In this paper, we advocate a counterfactual risk minimization (CRM) approach, which circumvents the need for relevance judgments and data aggregation and is better suited for learning from logged data, i.e. contextual bandit feedback. Due to the unavailability of a public E-Com LTR dataset, we provide the Mercateo dataset from our platform. It contains more than 10 million AtB click logs and 1 million order logs from a catalogue of about 3.5 million products associated with 3060 queries. To the best of our knowledge, this is the first work which examines the effectiveness of the CRM approach in learning a ranking model from real-world logged data. Our empirical evaluation shows that our CRM approach learns effectively from logged data and beats a strong baseline ranker (λ-MART) by a huge margin. Our method outperforms full-information loss (e.g. cross-entropy) on various deep neural network models. These findings demonstrate that by adopting the CRM approach, E-Com platforms can get better product search quality compared to the full-information approach. The code and dataset can be accessed at: https://github.com/ecom-research/CRM-LTR.

Submitted 9 July, 2020; v1 submitted 24 July, 2019; originally announced July 2019.

Comments: Accepted for ECML-PKDD 2020
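
To make the idea of learning from logged contextual bandit feedback in the abstract above more concrete, here is a minimal sketch of a clipped inverse-propensity-scoring (IPS) risk estimate with a variance penalty, the general form of objective used in counterfactual risk minimization. It is an illustration under stated assumptions, not the paper's code: the log format, the function names `ips_terms` and `crm_objective`, and the `clip` and `lam` values are all hypothetical.

```python
import math


def ips_terms(logs, new_policy_prob, clip=10.0):
    """Per-sample importance-weighted losses.

    logs: list of (context, action, loss, logging_propensity) tuples, where
    logging_propensity is the probability with which the logging policy
    chose that action (assumed to be recorded and strictly positive).
    new_policy_prob(context, action): probability under the policy being
    evaluated. Weights are clipped at `clip` to limit variance.
    """
    terms = []
    for context, action, loss, propensity in logs:
        weight = min(new_policy_prob(context, action) / propensity, clip)
        terms.append(loss * weight)
    return terms


def crm_objective(logs, new_policy_prob, lam=0.5, clip=10.0):
    """IPS risk estimate plus lam * sqrt(sample variance / n)."""
    terms = ips_terms(logs, new_policy_prob, clip)
    n = len(terms)
    mean = sum(terms) / n
    variance = sum((t - mean) ** 2 for t in terms) / (n - 1)  # needs n >= 2
    return mean + lam * math.sqrt(variance / n)
```

A learner would search for the policy that minimizes this objective; the variance term discourages policies whose estimated risk rests on a few heavily weighted log entries.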

Showing 1–6 of 6 results for author: Anwaar, M U