Search | arXiv e-print repository

Attention Consistency on Visual Corruptions for Single-Source Domain Generalization

Authors: Ilke Cugu, Massimiliano Mancini, Yanbei Chen, Zeynep Akata

Abstract: Generalizing visual recognition models trained on a single distribution to unseen input distributions (i.e. domains) requires making them robust to superfluous correlations in the training set. In this work, we achieve this goal by altering the training images to simulate new domains and imposing consistent visual attention across the different views of the same sample. We discover that the first objective can be simply and effectively met through visual corruptions. Specifically, we alter the content of the training images using the nineteen corruptions of the ImageNet-C benchmark and three additional transformations based on Fourier transform. Since these corruptions preserve object locations, we propose an attention consistency loss to ensure that class activation maps across original and corrupted versions of the same training sample are aligned. We name our model Attention Consistency on Visual Corruptions (ACVC). We show that ACVC consistently achieves the state of the art on three single-source domain generalization benchmarks, PACS, COCO, and the large-scale DomainNet.

Submitted 27 April, 2022; originally announced April 2022.

Comments: CVPRW 2022 - Camera ready version

arXiv:2102.02804 [pdf, other]

A Deeper Look into Convolutions via Eigenvalue-based Pruning

Authors: Ilke Cugu, Emre Akbas

Abstract: Convolutional neural networks (CNNs) are able to attain better visual recognition performance than fully connected neural networks despite having far fewer parameters, due to their parameter sharing principle. Modern architectures usually contain a small number of fully connected layers, often at the end, after multiple layers of convolutions. In some cases, most of the convolutions can be eliminated without suffering any loss in recognition performance. However, there is no solid recipe to detect the hidden subset of convolutional neurons that is responsible for the majority of the recognition work. In this work, we formulate this as a pruning problem where the aim is to prune as many kernels as possible while preserving the vanilla generalization performance. To this end, we prune using eigenvalue-based matrix characteristics and compare them against the average absolute weight of a kernel, the de facto standard in the literature for assessing the importance of an individual convolutional kernel. We use this comparison to shed light on the internal mechanisms of a widely used family of CNNs, namely residual neural networks (ResNets), for image classification on the CIFAR-10, CIFAR-100, and Tiny ImageNet datasets.

Submitted 18 October, 2022; v1 submitted 4 February, 2021; originally announced February 2021.

Comments: The codes are available at https://github.com/cuguilke/psykedelic
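
The abstract above names the average absolute weight of a kernel as the de facto baseline for scoring kernel importance. A minimal sketch of that baseline only (not the paper's eigenvalue-based criterion, which the abstract does not specify) is given below. The function names `mean_abs_weight` and `prune_mask`, the `keep_ratio` parameter, and the nested-list kernel layout are illustrative assumptions and are not taken from the linked repository.

```python
def mean_abs_weight(kernel):
    """Average absolute weight of one 2-D kernel given as a list of rows."""
    flat = [abs(w) for row in kernel for w in row]
    return sum(flat) / len(flat)


def prune_mask(kernels, keep_ratio):
    """Return a list of booleans: True for kernels kept, False for pruned.

    Kernels are ranked by mean absolute weight and the top `keep_ratio`
    fraction is kept (at least one kernel always survives).
    `keep_ratio` is a hypothetical parameter chosen for this sketch.
    """
    scores = [mean_abs_weight(k) for k in kernels]
    n_keep = max(1, round(keep_ratio * len(kernels)))
    ranked = sorted(range(len(kernels)), key=lambda i: scores[i], reverse=True)
    keep = set(ranked[:n_keep])
    return [i in keep for i in range(len(kernels))]


if __name__ == "__main__":
    toy_kernels = [
        [[0.1, -0.1], [0.0, 0.1]],   # small weights, pruned first
        [[1.0, -2.0], [3.0, -4.0]],  # large weights
        [[0.5, 0.5], [0.5, 0.5]],
    ]
    print(prune_mask(toy_kernels, keep_ratio=0.67))
```

An eigenvalue-based score would replace `mean_abs_weight` with a function of the kernel matrix's spectrum; the ranking and masking steps could stay the same under that assumption.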

arXiv:1711.07011 [pdf, other]

MicroExpNet: An Extremely Small and Fast Model For Expression Recognition From Face Images

Authors: İlke Çuğu, Eren Şener, Emre Akbaş

Abstract: This paper aims to create extremely small and fast convolutional neural networks (CNNs) for the problem of facial expression recognition (FER) from frontal face images. To this end, we employed the popular knowledge distillation (KD) method and identified two major shortcomings with its use: 1) a fine-grained grid search is needed for tuning the temperature hyperparameter and 2) to find the optimal size-accuracy balance, one needs to search for the final network size (or the compression rate). On the other hand, KD proved useful for model compression for the FER problem, and we discovered that its effect becomes more significant as model size decreases. In addition, we hypothesized that translation invariance achieved using max-pooling layers would not be useful for the FER problem, as the expressions are sensitive to small, pixel-wise changes around the eye and the mouth. However, we found an intriguing improvement in generalization when max-pooling is used. We conducted experiments on two widely used FER datasets, CK+ and Oulu-CASIA. Our smallest model (MicroExpNet), obtained using knowledge distillation, is less than 1MB in size and runs at 1851 frames per second on an Intel i7 CPU. Despite being less accurate than the state of the art, MicroExpNet still provides significant insights for designing a microarchitecture for the FER problem.

Submitted 24 December, 2019; v1 submitted 19 November, 2017; originally announced November 2017.

Comments: International Conference on Image Processing Theory, Tools and Applications (IPTA) 2019 camera ready version. Codes are available at: https://github.com/cuguilke/microexpnet
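
The abstract above refers to knowledge distillation and its temperature hyperparameter. As a rough illustration of why that temperature needs tuning, the sketch below implements the commonly used distillation objective (a temperature-softened KL term between teacher and student plus a hard-label cross-entropy term). It is a generic formulation under stated assumptions, not the exact loss used for MicroExpNet; the names `softmax` and `distillation_loss`, the weighting parameter `alpha`, and the default values are illustrative.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft (teacher-matching) and a hard (label) term.

    `alpha` and the default temperature are assumptions made for this
    sketch. The soft term is scaled by temperature**2, a common choice
    that keeps its gradient magnitude comparable across temperatures.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL divergence from the softened teacher to the softened student.
    soft = sum(pt * math.log(pt / ps)
               for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Standard cross-entropy with the ground-truth label at temperature 1.
    hard = -math.log(softmax(student_logits)[label])
    return alpha * temperature ** 2 * soft + (1 - alpha) * hard


if __name__ == "__main__":
    teacher = [3.0, 0.5, -1.0]
    student = [1.0, 0.8, 0.2]
    for t in (1.0, 4.0, 16.0):
        print(t, distillation_loss(student, teacher, label=0, temperature=t))
```

Running the loop shows the loss value shifting with the temperature, which is consistent with the abstract's remark that a grid search over this hyperparameter is needed.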

arXiv:1701.08291 [pdf, other]

Treelogy: A Novel Tree Classifier Utilizing Deep and Hand-crafted Representations

Authors: İlke Çuğu, Eren Şener, Çağrı Erciyes, Burak Balcı, Emre Akın, Itır Önal, Ahmet Oğuz Akyüz

Abstract: We propose a novel tree classification system called Treelogy that fuses deep representations with hand-crafted features obtained from leaf images to perform leaf-based plant classification. Key to this system are segmentation of the leaf from an untextured background, using convolutional neural networks (CNNs) for learning deep representations, extracting hand-crafted features with a number of image processing techniques, training a linear SVM with feature vectors, merging SVM and CNN results, and identifying the species from a dataset of 57 trees. Our classification results show that fusion of deep representations with hand-crafted features leads to the highest accuracy. The proposed algorithm is embedded in a smartphone application, which is publicly available. Furthermore, our novel dataset comprising 5408 leaf images is also made public for use by other researchers.

Submitted 28 January, 2017; originally announced January 2017.

MSC Class: 68-06

Showing 1–4 of 4 results for author: Çuğu, İ