Showing 1–2 of 2 results for author: Alkhalefi, M

Search v0.5.6 released 2020-02-24

arXiv:2403.06813 [pdf, other]

cs.CV

LeOCLR: Leveraging Original Images for Contrastive Learning of Visual Representations

Authors: Mohammad Alkhalefi, Georgios Leontidis, Mingjun Zhong

Abstract: Contrastive instance discrimination outperforms supervised learning in downstream tasks like image classification and object detection. However, this approach heavily relies on data augmentation during representation learning, which may result in inferior results if not properly implemented. Random cropping followed by resizing is a common form of data augmentation used in contrastive learning, but it can lead to degraded representation learning if the two random crops contain distinct semantic content. To address this issue, this paper introduces LeOCLR (Leveraging Original Images for Contrastive Learning of Visual Representations), a framework that employs a new instance discrimination approach and an adapted loss function that ensures the shared region between positive pairs is semantically correct. The experimental results show that our approach consistently improves representation learning across different datasets compared to baseline models. For example, our approach outperforms MoCo-v2 by 5.1% on ImageNet-1K in linear evaluation and several other methods on transfer learning tasks.

Submitted 11 March, 2024; originally announced March 2024.

Comments: 16 pages, 5 figures, 6 tables

arXiv:2306.16122 [pdf, other]

cs.CV cs.LG

Semantic Positive Pairs for Enhancing Visual Representation Learning of Instance Discrimination methods

Authors: Mohammad Alkhalefi, Georgios Leontidis, Mingjun Zhong

Abstract: Self-supervised learning algorithms (SSL) based on instance discrimination have shown promising results, performing competitively or even outperforming supervised learning counterparts in some downstream tasks. Such approaches employ data augmentation to create two views of the same instance (i.e., positive pairs) and encourage the model to learn good representations by attracting these views closer in the embedding space without collapsing to the trivial solution. However, data augmentation is limited in representing positive pairs, and the repulsion process between the instances during contrastive learning may discard important features for instances that have similar categories. To address this issue, we propose an approach to identify those images with similar semantic content and treat them as positive instances, thereby reducing the chance of discarding important features during representation learning and increasing the richness of the latent representation. Our approach is generic and could work with any self-supervised instance discrimination frameworks such as MoCo and SimSiam. To evaluate our method, we run experiments on three benchmark datasets: ImageNet, STL-10 and CIFAR-10 with different instance discrimination SSL approaches. The experimental results show that our approach consistently outperforms the baseline methods across all three datasets; for instance, we improve upon the vanilla MoCo-v2 by 4.1% on ImageNet under a linear evaluation protocol over 800 epochs. We also report results on semi-supervised learning, transfer learning on downstream tasks, and object detection.

Submitted 25 April, 2024; v1 submitted 28 June, 2023; originally announced June 2023.

Comments: 17 pages, 6 figures, 12 tables

Journal ref: TMLR 2024 (https://openreview.net/pdf?id=z5AXLMBWdU)
