-
Attention Consistency on Visual Corruptions for Single-Source Domain Generalization
Authors:
Ilke Cugu,
Massimiliano Mancini,
Yanbei Chen,
Zeynep Akata
Abstract:
Generalizing visual recognition models trained on a single distribution to unseen input distributions (i.e. domains) requires making them robust to superfluous correlations in the training set. In this work, we achieve this goal by altering the training images to simulate new domains and imposing consistent visual attention across the different views of the same sample. We discover that the first objective can be simply and effectively met through visual corruptions. Specifically, we alter the content of the training images using the nineteen corruptions of the ImageNet-C benchmark and three additional transformations based on Fourier transform. Since these corruptions preserve object locations, we propose an attention consistency loss to ensure that class activation maps across original and corrupted versions of the same training sample are aligned. We name our model Attention Consistency on Visual Corruptions (ACVC). We show that ACVC consistently achieves the state of the art on three single-source domain generalization benchmarks, PACS, COCO, and the large-scale DomainNet.
Submitted 27 April, 2022;
originally announced April 2022.
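
The abstract above describes aligning class activation maps between the original and the corrupted view of a training sample. The sketch below illustrates one way such a consistency term could look. It is not the ACVC loss itself: the abstract does not state the exact form, so the spatial softmax normalization, the symmetric KL divergence, and the function names `spatial_softmax` and `attention_consistency` are all assumptions made for illustration.

```python
import numpy as np

def spatial_softmax(cam, temperature=1.0):
    """Turn an (H, W) class activation map into a distribution over locations."""
    flat = cam.reshape(-1) / temperature
    flat = flat - flat.max()  # subtract the max for numerical stability
    exp = np.exp(flat)
    return exp / exp.sum()

def attention_consistency(cam_clean, cam_corrupted, eps=1e-12):
    """Symmetric KL divergence between two normalized maps (an assumed choice)."""
    p = spatial_softmax(cam_clean)
    q = spatial_softmax(cam_corrupted)
    kl_pq = np.sum(p * (np.log(p + eps) - np.log(q + eps)))
    kl_qp = np.sum(q * (np.log(q + eps) - np.log(p + eps)))
    return 0.5 * (kl_pq + kl_qp)

# Hypothetical maps: the corrupted view attends to the opposite corner.
cam_original = np.arange(12, dtype=float).reshape(3, 4)
cam_corrupted = cam_original[::-1, ::-1].copy()
penalty = attention_consistency(cam_original, cam_corrupted)
```

The term is zero when both views produce the same map and grows as the attended regions drift apart, which is the behavior the abstract relies on when it notes that the corruptions preserve object locations.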
-
A Deeper Look into Convolutions via Eigenvalue-based Pruning
Authors:
Ilke Cugu,
Emre Akbas
Abstract:
Convolutional neural networks (CNNs) are able to attain better visual recognition performance than fully connected neural networks despite having far fewer parameters due to their parameter sharing principle. Modern architectures usually contain a small number of fully connected layers, often at the end, after multiple layers of convolutions. In some cases, most of the convolutions can be eliminated without suffering any loss in recognition performance. However, there is no solid recipe to detect the hidden subset of convolutional neurons that is responsible for the majority of the recognition work. In this work, we formulate this as a pruning problem where the aim is to prune as many kernels as possible while preserving the vanilla generalization performance. To this end, we prune using matrix characteristics based on eigenvalues and compare them against the average absolute weight of a kernel, which is the de facto standard in the literature for assessing the importance of an individual convolutional kernel. We use this comparison to shed light on the internal mechanisms of a widely used family of CNNs, namely residual neural networks (ResNets), for the image classification problem on the CIFAR-10, CIFAR-100, and Tiny ImageNet datasets.
Submitted 18 October, 2022; v1 submitted 4 February, 2021;
originally announced February 2021.
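
The abstract above contrasts two ways of scoring a convolutional kernel: its average absolute weight and a characteristic derived from its eigenvalues. A minimal sketch of both follows. The average absolute weight is as described in the abstract; the eigenvalue-based score shown here (the spectral radius of each square kernel) is only an assumed example, since the abstract does not say which eigenvalue-based characteristic is used. The function names and the `fraction` parameter are hypothetical.

```python
import numpy as np

def mean_abs_scores(weights):
    """(out_ch, in_ch, k, k) weights -> (out_ch, in_ch) average absolute weight."""
    return np.abs(weights).mean(axis=(-2, -1))

def spectral_radius_scores(weights):
    """Largest eigenvalue magnitude of each k x k kernel (an assumed criterion)."""
    eigenvalues = np.linalg.eigvals(weights)  # shape (out_ch, in_ch, k)
    return np.abs(eigenvalues).max(axis=-1)

def prune_mask(scores, fraction):
    """Boolean mask that drops roughly the lowest-scoring `fraction` of kernels.

    Kernels tied with the threshold score are dropped as well.
    """
    n_prune = int(fraction * scores.size)
    if n_prune == 0:
        return np.ones_like(scores, dtype=bool)
    threshold = np.sort(scores, axis=None)[n_prune - 1]
    return scores > threshold

# Toy layer: one output channel, two input channels, 3x3 kernels.
layer = np.stack([np.eye(3), np.zeros((3, 3))])[np.newaxis]
keep = prune_mask(spectral_radius_scores(layer), fraction=0.5)
```

In this toy layer the identity kernel survives and the all-zero kernel is pruned under either score; the two criteria only diverge on kernels whose weight magnitude and spectrum tell different stories.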
-
MicroExpNet: An Extremely Small and Fast Model For Expression Recognition From Face Images
Authors:
İlke Çuğu,
Eren Şener,
Emre Akbaş
Abstract:
This paper is aimed at creating extremely small and fast convolutional neural networks (CNN) for the problem of facial expression recognition (FER) from frontal face images. To this end, we employed the popular knowledge distillation (KD) method and identified two major shortcomings with its use: 1) a fine-grained grid search is needed for tuning the temperature hyperparameter and 2) to find the optimal size-accuracy balance, one needs to search for the final network size (or the compression rate). On the other hand, KD proved useful for model compression for the FER problem, and we discovered that its effects become more and more significant as the model size decreases. In addition, we hypothesized that translation invariance achieved using max-pooling layers would not be useful for the FER problem as the expressions are sensitive to small, pixel-wise changes around the eye and the mouth. However, we have found an intriguing improvement in generalization when max-pooling is used. We conducted experiments on two widely used FER datasets, CK+ and Oulu-CASIA. Our smallest model (MicroExpNet), obtained using knowledge distillation, is less than 1MB in size and works at 1851 frames per second on an Intel i7 CPU. Despite being less accurate than the state of the art, MicroExpNet still provides significant insights for designing a microarchitecture for the FER problem.
Submitted 24 December, 2019; v1 submitted 19 November, 2017;
originally announced November 2017.
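
The abstract above points to the temperature hyperparameter as a pain point of knowledge distillation. The sketch below shows where that temperature enters a common formulation of the distillation loss: teacher and student logits are both softened by the same temperature before they are compared. This is a generic illustration, not the MicroExpNet training code; the mixing weight `alpha`, the default values, and the function names are assumptions.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Row-wise softmax; higher temperatures give flatter distributions."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term (assumed form)."""
    soft_targets = softmax(teacher_logits, temperature)
    log_student_soft = np.log(softmax(student_logits, temperature))
    soft_term = -(soft_targets * log_student_soft).sum(axis=-1).mean()

    log_student = np.log(softmax(student_logits))
    hard_term = -log_student[np.arange(len(labels)), labels].mean()

    # Scaling the soft term by T^2 keeps its gradient magnitude comparable
    # to the hard term as the temperature changes.
    return alpha * temperature ** 2 * soft_term + (1.0 - alpha) * hard_term

# Hypothetical logits for one sample with three expression classes.
teacher = np.array([[5.0, 0.0, 0.0]])
student = np.array([[2.0, 1.0, 0.0]])
loss = distillation_loss(student, teacher, labels=np.array([0]))
```

Because the loss depends on `temperature` both inside the softmax and in the scaling factor, each candidate value requires a separate training run, which is why the abstract describes tuning it as a fine-grained grid search.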
-
Treelogy: A Novel Tree Classifier Utilizing Deep and Hand-crafted Representations
Authors:
İlke Çuğu,
Eren Şener,
Çağrı Erciyes,
Burak Balcı,
Emre Akın,
Itır Önal,
Ahmet Oğuz Akyüz
Abstract:
We propose a novel tree classification system called Treelogy that fuses deep representations with hand-crafted features obtained from leaf images to perform leaf-based plant classification. Key to this system are segmentation of the leaf from an untextured background, using convolutional neural networks (CNNs) for learning deep representations, extracting hand-crafted features with a number of image processing techniques, training a linear SVM with feature vectors, merging SVM and CNN results, and identifying the species from a dataset of 57 trees. Our classification results show that fusion of deep representations with hand-crafted features leads to the highest accuracy. The proposed algorithm is embedded in a smartphone application, which is publicly available. Furthermore, our novel dataset comprising 5408 leaf images is also made public for use by other researchers.
Submitted 28 January, 2017;
originally announced January 2017.