Hyperbolic Vision Transformers: Combining Improvements in Metric Learning

Ermolov, Aleksandr; Mirvakhabova, Leyla; Khrulkov, Valentin; Sebe, Nicu; Oseledets, Ivan

arXiv:2203.10833v1 (cs)

[Submitted on 21 Mar 2022 (this version), latest version 22 Mar 2022 (v2)]

Abstract: Metric learning aims to learn a highly discriminative model that places the embeddings of similar classes close together under the chosen metric and pushes the embeddings of dissimilar classes apart. The common recipe is to use an encoder to extract embeddings and a distance-based loss function to match the representations; usually, the Euclidean distance is used. An emerging interest in learning hyperbolic data embeddings suggests that hyperbolic geometry can be beneficial for natural data. Following this line of work, we propose a new hyperbolic-based model for metric learning. At the core of our method is a vision transformer whose output embeddings are mapped to hyperbolic space. These embeddings are directly optimized using a modified pairwise cross-entropy loss. We evaluate the proposed model with six different formulations on four datasets, achieving new state-of-the-art performance. The source code is available at this https URL.
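
The pipeline described in the abstract (encoder output mapped to hyperbolic space, then a distance-based pairwise cross-entropy loss) can be illustrated with a minimal sketch. The abstract does not state which model of hyperbolic space is used or how the loss is modified, so everything here is an assumption made for illustration: the Poincare ball model with curvature parameter c, the temperature tau, and the helper names expmap0, poincare_dist, and pairwise_cross_entropy are all hypothetical and are not taken from the authors' code. Plain Python lists stand in for the transformer's output embeddings.

```python
import math

def expmap0(v, c=1.0):
    """Map a Euclidean vector onto the Poincare ball (curvature -c) via the
    exponential map at the origin. Assumed mapping, not confirmed by the abstract."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return [0.0 for _ in v]
    scale = math.tanh(math.sqrt(c) * norm) / (math.sqrt(c) * norm)
    return [scale * x for x in v]

def poincare_dist(x, y, c=1.0):
    """Geodesic distance between two points inside the Poincare ball."""
    sq_x = sum(a * a for a in x)
    sq_y = sum(b * b for b in y)
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    # Guard against division by zero for points very close to the boundary.
    denom = max((1.0 - c * sq_x) * (1.0 - c * sq_y), 1e-15)
    return math.acosh(1.0 + 2.0 * c * sq_diff / denom) / math.sqrt(c)

def pairwise_cross_entropy(points, labels, tau=0.2, c=1.0):
    """Generic pairwise cross-entropy over hyperbolic distances (illustrative only).

    For each anchor i and each positive j (same label, j != i), the loss is
    -log( exp(-d(i, j) / tau) / sum_{k != i} exp(-d(i, k) / tau) ).
    """
    n = len(points)
    losses = []
    for i in range(n):
        logits = {k: -poincare_dist(points[i], points[k], c) / tau
                  for k in range(n) if k != i}
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        for j in range(n):
            if j != i and labels[j] == labels[i]:
                losses.append(log_denom - logits[j])
    return sum(losses) / len(losses) if losses else 0.0

# Toy "encoder outputs": two nearby pairs of 2-D vectors.
features = [[0.5, 0.1], [0.6, 0.0], [-0.4, 0.3], [-0.5, 0.2]]
embeddings = [expmap0(f) for f in features]
loss = pairwise_cross_entropy(embeddings, labels=[0, 0, 1, 1])
```

In this sketch the loss is smaller when points sharing a label are also close in hyperbolic distance, which is the behaviour a distance-based metric learning objective is meant to reward. A real implementation would operate on batched tensors and would likely clip embedding norms to stay away from the ball's boundary.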

Comments: CVPR 2022
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as: arXiv:2203.10833 [cs.CV] (or arXiv:2203.10833v1 [cs.CV] for this version)
DOI: https://doi.org/10.48550/arXiv.2203.10833

Submission history

From: Aleksandr Ermolov
[v1] Mon, 21 Mar 2022 09:48:23 UTC (11,363 KB)
[v2] Tue, 22 Mar 2022 09:40:22 UTC (11,363 KB)
