grid2vec: Learning Efficient Visual Representations via Flexible Grid-Graphs

Hamdi, Ali; Kim, Du Yong; Salim, Flora D.

arXiv:2007.15444v2 (cs)

[Submitted on 30 Jul 2020 (v1), revised 31 Jul 2020 (this version, v2), latest version 29 Sep 2021 (v6)]

Abstract: We propose $grid2vec$, a novel approach to image representation learning based on a Graph Convolutional Network (GCN). Existing visual representation methods suffer from several issues, such as high computational cost, loss of in-depth structure, and restriction to specific objects. $grid2vec$ converts an image into a low-dimensional feature vector. A key component of $grid2vec$ is Flexible Grid-Graphs, a spatially adaptive method that uses the image key-points as a flexible grid to generate the graph representation. It represents each image as a graph with unique node locations and edge distances. Nodes in Flexible Grid-Graphs describe the most representative patches in the image. We develop a multi-channel Convolutional Neural Network architecture to learn the local features of each patch. We implement a hybrid node-embedding method, i.e., one with spectral and non-spectral components. It aggregates the products of the neighbours' features and the node's eigenvector centrality score. We compare the performance of $grid2vec$ with a set of state-of-the-art representation learning and visual recognition models. $grid2vec$ has only $512$ features, compared with a range from $25,090$ for VGG16 to $487,874$ for NASNet. We show the model's superior accuracy in both binary and multi-class image classification. Although we utilise an imbalanced, small dataset, $grid2vec$ shows stable and superior results against well-known base classifiers.
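
The aggregation step described in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. The function names, the `radius` rule for connecting key-points, and the choice to scale each neighbour's features by that neighbour's centrality are all assumptions made for illustration; the abstract does not specify them. Patch features are given as toy vectors rather than learned by a CNN.

```python
import math

def build_graph(keypoints, radius):
    """Hypothetical graph construction: connect key-points closer than
    `radius` and store the Euclidean distance as the edge weight."""
    n = len(keypoints)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(keypoints[i], keypoints[j])
            if d <= radius:
                adj[i][j] = adj[j][i] = d
    return adj

def eigenvector_centrality(adj, iters=200):
    """Power iteration on the connectivity pattern of `adj`.
    Iterating with A + I (the added x[i] term) keeps the same leading
    eigenvector but avoids oscillation on bipartite graphs."""
    n = len(adj)
    x = [1.0] * n
    for _ in range(iters):
        nxt = [x[i] + sum(x[j] for j in range(n) if adj[i][j] > 0)
               for i in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        x = [v / norm for v in nxt]
    return x

def aggregate(adj, features, centrality):
    """Assumed aggregation rule: each node sums its neighbours' feature
    vectors, each scaled by that neighbour's centrality score.
    Edge distances are stored in `adj` but not used as weights here."""
    n, dim = len(adj), len(features[0])
    out = []
    for i in range(n):
        acc = [0.0] * dim
        for j in range(n):
            if adj[i][j] > 0:
                for k in range(dim):
                    acc[k] += centrality[j] * features[j][k]
        out.append(acc)
    return out

# Toy example: three key-points on a line, so the graph is a path 0-1-2.
keypoints = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # stand-ins for patch features

adj = build_graph(keypoints, radius=1.5)
centrality = eigenvector_centrality(adj)
embeddings = aggregate(adj, features, centrality)
```

In this toy graph the middle key-point receives the highest centrality, so its features contribute most to the embeddings of its two neighbours. A graph-level vector could then be obtained by pooling the node embeddings, which is again an assumption and not something the abstract states.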

Comments:	20 pages, Journal
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2007.15444 [cs.CV]
	(or arXiv:2007.15444v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2007.15444

Submission history

From: Ali Hamdi
[v1] Thu, 30 Jul 2020 13:21:00 UTC (11,238 KB)
[v2] Fri, 31 Jul 2020 21:01:52 UTC (11,238 KB)
[v3] Mon, 21 Sep 2020 11:02:19 UTC (10,263 KB)
[v4] Thu, 24 Sep 2020 20:41:36 UTC (10,263 KB)
[v5] Mon, 26 Apr 2021 12:53:05 UTC (9,805 KB)
[v6] Wed, 29 Sep 2021 09:34:42 UTC (10,291 KB)
