-
Precision at Scale: Domain-Specific Datasets On-Demand
Authors:
Jesús M Rodríguez-de-Vera,
Imanol G Estepa,
Ignacio Sarasúa,
Bhalaji Nagarajan,
Petia Radeva
Abstract:
In the realm of self-supervised learning (SSL), conventional wisdom has gravitated towards the utility of massive, general-domain datasets for pretraining robust backbones. In this paper, we challenge this idea by exploring whether it is possible to bridge the scale between general-domain datasets and (traditionally smaller) domain-specific datasets to reduce the current performance gap. More specifically, we propose Precision at Scale (PaS), a novel method for the autonomous creation of domain-specific datasets on-demand. The modularity of the PaS pipeline enables leveraging state-of-the-art foundational and generative models to create a collection of images of any given size belonging to any given domain with minimal human intervention. Extensive analysis in two complex domains proves the superiority of PaS datasets over existing traditional domain-specific datasets in terms of diversity, scale, and effectiveness in training visual transformers and convolutional neural networks. Most notably, we prove that automatically generated domain-specific datasets lead to better pretraining than large-scale supervised datasets such as ImageNet-1k and ImageNet-21k. Concretely, models trained on domain-specific datasets constructed by the PaS pipeline beat ImageNet-1k pretrained backbones by at least 12% in all the considered domains and classification tasks, and lead to better food-domain performance than supervised ImageNet-21k pretraining while being 12 times smaller. Code repository: https://github.com/jesusmolrdv/Precision-at-Scale/
Submitted 3 July, 2024;
originally announced July 2024.
-
All4One: Symbiotic Neighbour Contrastive Learning via Self-Attention and Redundancy Reduction
Authors:
Imanol G. Estepa,
Ignacio Sarasúa,
Bhalaji Nagarajan,
Petia Radeva
Abstract:
Nearest-neighbour-based methods have proved to be one of the most successful self-supervised learning (SSL) approaches due to their high generalization capabilities. However, their computational efficiency decreases when more than one neighbour is used. In this paper, we propose a novel contrastive SSL approach, which we call All4One, that reduces the distance between neighbour representations using "centroids" created through a self-attention mechanism. We use a Centroid Contrasting objective along with single Neighbour Contrasting and Feature Contrasting objectives. Centroids help in learning contextual information from multiple neighbours, whereas the neighbour contrast enables learning representations directly from the neighbours and the feature contrast allows learning representations unique to the features. This combination enables All4One to outperform popular instance discrimination approaches by more than 1% on linear classification evaluation for popular benchmark datasets and to obtain state-of-the-art (SoTA) results. Finally, we show that All4One is robust towards embedding dimensionalities and augmentations, surpassing NNCLR and Barlow Twins by more than 5% in low-dimensionality and weak-augmentation settings. The source code will be made available soon.
Submitted 16 March, 2023;
originally announced March 2023.
-
ELFIS: Expert Learning for Fine-grained Image Recognition Using Subsets
Authors:
Pablo Villacorta,
Jesús M. Rodríguez-de-Vera,
Marc Bolaños,
Ignacio Sarasúa,
Bhalaji Nagarajan,
Petia Radeva
Abstract:
Fine-Grained Visual Recognition (FGVR) tackles the problem of distinguishing highly similar categories. One of the main approaches to FGVR, namely subset learning, tries to leverage information from existing class taxonomies to improve the performance of deep neural networks. However, these methods rely on the existence of handcrafted hierarchies that are not necessarily optimal for the models. In this paper, we propose ELFIS, an expert learning framework for FGVR that clusters categories of the dataset into meta-categories using both dataset-inherent lexical and model-specific information. A set of neural network-based experts is trained focusing on the meta-categories and integrated into a multi-task framework. Extensive experimentation shows improvements in the SoTA FGVR benchmarks of up to +1.3% accuracy using both CNNs and transformer-based networks. Overall, the obtained results show that ELFIS can be applied on top of any classification model, enabling SoTA results. The source code will be made public soon.
Submitted 16 March, 2023;
originally announced March 2023.
-
Hyper-Spectral Imaging for Overlapping Plastic Flakes Segmentation
Authors:
Guillem Martinez,
Maya Aghaei,
Martin Dijkstra,
Bhalaji Nagarajan,
Femke Jaarsma,
Jaap van de Loosdrecht,
Petia Radeva,
Klaas Dijkstra
Abstract:
Given the unique potential of hyper-spectral imaging in grasping the polymer characteristics of different materials, it is commonly used in sorting procedures. In a practical plastic sorting scenario, multiple plastic flakes may overlap and, depending on their characteristics, the overlap can be reflected in their spectral signature. In this work, we use hyper-spectral imaging for the segmentation of three types of plastic flakes and their possible overlapping combinations. We propose an intuitive and simple multi-label encoding approach, bitfield encoding, to account for the overlapping regions. With our experiments, we show that the bitfield encoding improves over the baseline single-label approach, and we further demonstrate its potential in predicting multiple labels for overlapping classes even when the model is only trained with non-overlapping classes.
Submitted 23 March, 2022;
originally announced March 2022.
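The bitfield encoding mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, the bit order, and the helper functions are hypothetical, and the sketch only shows the general idea of assigning one bit per material so that an overlap is the union of the bits of the materials involved.

```python
# Hypothetical class names: the abstract only says "three types of plastic flakes".
CLASSES = ("plastic_a", "plastic_b", "plastic_c")


def encode(present):
    """Pack the set of materials present at a pixel into one integer, one bit per class."""
    code = 0
    for name in present:
        code |= 1 << CLASSES.index(name)
    return code


def decode(code):
    """Recover the set of materials from a bitfield code."""
    return {name for i, name in enumerate(CLASSES) if code & (1 << i)}


def to_multi_hot(code):
    """Expand a code into a per-class 0/1 target vector for a multi-label loss."""
    return [(code >> i) & 1 for i in range(len(CLASSES))]


single = encode({"plastic_a"})                # one material: a single bit is set
overlap = encode({"plastic_a", "plastic_c"})  # overlap: two bits are set
print(single, overlap, to_multi_hot(overlap), sorted(decode(overlap)))
# prints: 1 5 [1, 0, 1] ['plastic_a', 'plastic_c']
```

Because each bit is a separate binary target, a model trained only on single-bit labels can in principle still emit a multi-bit code at inference time, which is consistent with the generalization to overlapping classes reported in the abstract.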
-
Introducing Anisotropic Minkowski Functionals and Quantitative Anisotropy Measures for Local Structure Analysis in Biomedical Imaging
Authors:
Axel Wismueller,
Titas De,
Eva Lochmueller,
Felix Eckstein,
Mahesh B. Nagarajan
Abstract:
The ability of Minkowski Functionals to characterize local structure in different biological tissue types has been demonstrated in a variety of medical image processing tasks. We introduce anisotropic Minkowski Functionals (AMFs) as a novel variant that captures the inherent anisotropy of the underlying gray-level structures. To quantify the anisotropy characterized by our approach, we further introduce a method to compute a quantitative measure motivated by a technique utilized in MR diffusion tensor imaging, namely fractional anisotropy. We showcase the applicability of our method in the research context of characterizing the local structure properties of trabecular bone micro-architecture in the proximal femur as visualized on multi-detector CT. To this end, AMFs were computed locally for each pixel of ROIs extracted from the head, neck and trochanter regions. Fractional anisotropy was then used to quantify the local anisotropy of the trabecular structures found in these ROIs and to compare its distribution in different anatomical regions. Our results suggest a significantly greater concentration of anisotropic trabecular structures in the head and neck regions when compared to the trochanter region (p < 10^-4). We also evaluated the ability of such AMFs to predict bone strength in the femoral head of proximal femur specimens obtained from 50 donors. Our results suggest that such AMFs, when used in conjunction with multi-regression models, can outperform more conventional features such as BMD in predicting failure load. We conclude that such anisotropic Minkowski Functionals can capture valuable information regarding directional attributes of local structure, which may be useful in a wide scope of biomedical imaging applications.
Submitted 2 April, 2020;
originally announced April 2020.
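For readers unfamiliar with fractional anisotropy, the following sketch computes the standard diffusion-tensor definition from three eigenvalues. The abstract above only says its measure is "motivated by" this technique, so how the authors adapt it to AMFs is not shown here; the function name and the zero-input convention are assumptions made for illustration.

```python
import math


def fractional_anisotropy(eigenvalues):
    """Standard DTI fractional anisotropy for three eigenvalues.

    Returns 0 for a fully isotropic input and 1 when only one eigenvalue is non-zero.
    """
    l1, l2, l3 = eigenvalues
    mean = (l1 + l2 + l3) / 3.0
    spread = (l1 - mean) ** 2 + (l2 - mean) ** 2 + (l3 - mean) ** 2
    energy = l1 ** 2 + l2 ** 2 + l3 ** 2
    if energy == 0:
        return 0.0  # assumed convention: an all-zero input is treated as isotropic
    return math.sqrt(1.5 * spread / energy)


print(fractional_anisotropy((1.0, 1.0, 1.0)))  # isotropic case
print(fractional_anisotropy((1.0, 0.0, 0.0)))  # fully directional case
```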
-
Using anisotropic 3D Minkowski functionals for trabecular bone characterization and biomechanical strength prediction in proximal femur specimens
Authors:
Mahesh B. Nagarajan,
Titas De,
Eva-Maria Lochmueller,
Felix Eckstein,
Axel Wismueller
Abstract:
The ability of Anisotropic Minkowski Functionals (AMFs) to capture local anisotropy while evaluating topological properties of the underlying gray-level structures has been previously demonstrated. We evaluate the ability of this approach to characterize local structure properties of trabecular bone micro-architecture in ex vivo proximal femur specimens, as visualized on multi-detector CT, for purposes of biomechanical bone strength prediction. To this end, volumetric AMFs were computed locally for each voxel of volumes of interest (VOI) extracted from the femoral head of 146 specimens. The local anisotropy captured by such AMFs was quantified using a fractional anisotropy measure; the magnitude and direction of anisotropy at every pixel were stored in histograms that served as feature vectors characterizing the VOIs. A linear multi-regression analysis algorithm was used to predict the failure load (FL) from the feature sets; the predicted FL was compared to the true FL determined through biomechanical testing. The prediction performance was measured by the root mean square error (RMSE) for each feature set. The best prediction performance was obtained from the fractional anisotropy histogram of AMF Euler Characteristic (RMSE = 1.01 ± 0.13), which was significantly better than MDCT-derived mean BMD (RMSE = 1.12 ± 0.16, p < 0.05). We conclude that such anisotropic Minkowski Functionals can capture valuable information regarding regional trabecular bone quality and contribute to improved bone strength prediction, which is important for improving the clinical assessment of osteoporotic fracture risk.
Submitted 16 February, 2020;
originally announced February 2020.
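The root mean square error used as the performance measure in the abstract above is the square root of the mean squared difference between predicted and true values. A minimal sketch follows; the function name is hypothetical and the numbers are toy values chosen for illustration, not data from the paper.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences of numbers."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy failure-load values (arbitrary units), for illustration only.
print(rmse([1.0, 3.0], [2.0, 2.0]))  # prints: 1.0
```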
-
Deep Net Features for Complex Emotion Recognition
Authors:
Bhalaji Nagarajan,
V Ramana Murthy Oruganti
Abstract:
This paper investigates the influence of different acoustic features, audio-event-based features and automatic speech translation based lexical features in the recognition of complex emotions such as curiosity. Pretrained networks, namely AudioSet Net, VoxCeleb Net and Deep Speech Net, trained extensively for different speech-based applications, are studied for this objective. Information from deep layers of these networks is considered as descriptors and encoded into feature vectors. Experimental results on the EmoReact dataset, which consists of 8 complex emotions, show the effectiveness of the approach, yielding the highest F1 score of 0.85 against the baseline of 0.69 in the literature.
Submitted 2 November, 2018; v1 submitted 31 October, 2018;
originally announced November 2018.
-
Deep Learning as Feature Encoding for Emotion Recognition
Authors:
Bhalaji Nagarajan,
V Ramana Murthy Oruganti
Abstract:
Deep learning is popular as an end-to-end framework that both extracts the prominent features and performs the classification. In this paper, we extensively investigate deep networks as an alternative to feature encoding techniques for low-level descriptors for emotion recognition on the benchmark EmoDB dataset. The fusion performance of the encoded features thus obtained with other available features is also investigated. The highest performance to date in the literature is observed.
Submitted 14 November, 2018; v1 submitted 30 October, 2018;
originally announced October 2018.
-
A Framework for Exploring Non-Linear Functional Connectivity and Causality in the Human Brain: Mutual Connectivity Analysis (MCA) of Resting-State Functional MRI with Convergent Cross-Mapping and Non-Metric Clustering
Authors:
Axel Wismüller,
Xixi Wang,
Adora M. DSouza,
Mahesh B. Nagarajan
Abstract:
We present a computational framework for analysis and visualization of non-linear functional connectivity in the human brain from resting state functional MRI (fMRI) data for purposes of recovering the underlying network community structure and exploring causality between network components. Our proposed methodology of non-linear mutual connectivity analysis (MCA) involves two computational steps. First, the pair-wise cross-prediction performance between resting state fMRI pixel time series within the brain is evaluated. The underlying network structure is subsequently recovered from the affinity matrix constructed through MCA using non-metric network partitioning/clustering with the so-called Louvain method. We demonstrate our methodology in the task of identifying regions of the motor cortex associated with hand movement on resting state fMRI data acquired from eight slice locations in four subjects. For comparison, we also localized regions of the motor cortex through a task-based fMRI sequence involving a finger-tapping stimulus paradigm. Finally, we integrate convergent cross mapping (CCM) into the first step of MCA for investigating causality between regions of the motor cortex. Results regarding causation between regions of the motor cortex revealed a significant directional variability and were not readily interpretable in a consistent manner across all subjects. However, our results on whole-slice fMRI analysis demonstrate that MCA-based model-free recovery of regions associated with the primary motor cortex and supplementary motor area is in close agreement with localization of similar regions achieved with a task-based fMRI acquisition. Thus, we conclude that our computational framework MCA can extract and visualize valuable information concerning the underlying network structure and causation between different regions of the brain in resting state fMRI.
Submitted 14 July, 2014;
originally announced July 2014.