Search | arXiv e-print repository

Precision at Scale: Domain-Specific Datasets On-Demand

Authors: Jesús M Rodríguez-de-Vera, Imanol G Estepa, Ignacio Sarasúa, Bhalaji Nagarajan, Petia Radeva

Abstract: In the realm of self-supervised learning (SSL), conventional wisdom has gravitated towards the utility of massive, general-domain datasets for pretraining robust backbones. In this paper, we challenge this idea by exploring if it is possible to bridge the scale between general-domain datasets and (traditionally smaller) domain-specific datasets to reduce the current performance gap. More specifically, we propose Precision at Scale (PaS), a novel method for the autonomous creation of domain-specific datasets on-demand. The modularity of the PaS pipeline enables leveraging state-of-the-art foundational and generative models to create a collection of images of any given size belonging to any given domain with minimal human intervention. Extensive analysis in two complex domains proves the superiority of PaS datasets over existing traditional domain-specific datasets in terms of diversity, scale, and effectiveness in training visual transformers and convolutional neural networks. Most notably, we prove that automatically generated domain-specific datasets lead to better pretraining than large-scale supervised datasets such as ImageNet-1k and ImageNet-21k. Concretely, models trained on domain-specific datasets constructed by the PaS pipeline beat ImageNet-1k pretrained backbones by at least 12% in all the considered domains and classification tasks and lead to better food domain performance than supervised ImageNet-21k pretraining while being 12 times smaller. Code repository: https://github.com/jesusmolrdv/Precision-at-Scale/

Submitted 3 July, 2024; originally announced July 2024.

ACM Class: I.5.4; I.5.2; I.2.1; I.2.10

arXiv:2303.09417 [pdf, other]

All4One: Symbiotic Neighbour Contrastive Learning via Self-Attention and Redundancy Reduction

Authors: Imanol G. Estepa, Ignacio Sarasúa, Bhalaji Nagarajan, Petia Radeva

Abstract: Nearest neighbour based methods have proved to be one of the most successful self-supervised learning (SSL) approaches due to their high generalization capabilities. However, their computational efficiency decreases when more than one neighbour is used. In this paper, we propose a novel contrastive SSL approach, which we call All4One, that reduces the distance between neighbour representations using "centroids" created through a self-attention mechanism. We use a Centroid Contrasting objective along with single Neighbour Contrasting and Feature Contrasting objectives. Centroids help in learning contextual information from multiple neighbours whereas the neighbour contrast enables learning representations directly from the neighbours and the feature contrast allows learning representations unique to the features. This combination enables All4One to outperform popular instance discrimination approaches by more than 1% on linear classification evaluation for popular benchmark datasets and obtain state-of-the-art (SoTA) results. Finally, we show that All4One is robust towards embedding dimensionalities and augmentations, surpassing NNCLR and Barlow Twins by more than 5% on low dimensionality and weak augmentation settings. The source code will be made available soon.

Submitted 16 March, 2023; originally announced March 2023.

Comments: 14 pages, 9 figures

ACM Class: I.5.4; I.5.1; I.2.10

arXiv:2303.09269 [pdf, other]

ELFIS: Expert Learning for Fine-grained Image Recognition Using Subsets

Authors: Pablo Villacorta, Jesús M. Rodríguez-de-Vera, Marc Bolaños, Ignacio Sarasúa, Bhalaji Nagarajan, Petia Radeva

Abstract: Fine-Grained Visual Recognition (FGVR) tackles the problem of distinguishing highly similar categories. One of the main approaches to FGVR, namely subset learning, tries to leverage information from existing class taxonomies to improve the performance of deep neural networks. However, these methods rely on the existence of handcrafted hierarchies that are not necessarily optimal for the models. In this paper, we propose ELFIS, an expert learning framework for FGVR that clusters categories of the dataset into meta-categories using both dataset-inherent lexical and model-specific information. A set of neural-network-based experts is trained focusing on the meta-categories and integrated into a multi-task framework. Extensive experimentation shows improvements in the SoTA FGVR benchmarks of up to +1.3% in accuracy using both CNNs and transformer-based networks. Overall, the obtained results show that ELFIS can be applied on top of any classification model, enabling it to obtain SoTA results. The source code will be made public soon.

Submitted 16 March, 2023; originally announced March 2023.

Comments: Pablo Villacorta and Jesús M. Rodríguez-de-Vera contributed equally to this work. 16 pages, 10 figures

ACM Class: I.5.4; I.5.1; I.2.10

arXiv:2203.12350 [pdf, other]

Hyper-Spectral Imaging for Overlapping Plastic Flakes Segmentation

Authors: Guillem Martinez, Maya Aghaei, Martin Dijkstra, Bhalaji Nagarajan, Femke Jaarsma, Jaap van de Loosdrecht, Petia Radeva, Klaas Dijkstra

Abstract: Given the unique potential of hyper-spectral imaging in grasping the polymer characteristics of different materials, it is commonly used in sorting procedures. In a practical plastic sorting scenario, multiple plastic flakes may overlap and, depending on their characteristics, the overlap can be reflected in their spectral signature. In this work, we use hyper-spectral imaging for the segmentation of three types of plastic flakes and their possible overlapping combinations. We propose an intuitive and simple multi-label encoding approach, bitfield encoding, to account for the overlapping regions. With our experiments, we show that the bitfield encoding improves over the baseline single-label approach and we further demonstrate its potential in predicting multiple labels for overlapping classes even when the model is only trained with non-overlapping classes.

Submitted 23 March, 2022; originally announced March 2022.

Comments: Submitted to ICIP2022
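
Illustrative sketch (not from the paper): the abstract above does not spell out how bitfield encoding is implemented. The snippet below assumes the simplest reading, one bit per plastic type, so that an overlapping region is the union of the bits of the flakes involved. The class names `type_a`, `type_b` and `type_c` and all function names are hypothetical placeholders.

```python
from typing import Iterable, List

# Hypothetical class names; the abstract only says "three types of plastic flakes".
CLASSES = ["type_a", "type_b", "type_c"]


def encode(labels: Iterable[str]) -> List[int]:
    """Return one bit per class: 1 if that plastic type is present, else 0."""
    present = set(labels)
    unknown = present - set(CLASSES)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in present else 0 for name in CLASSES]


def decode(bits: List[int]) -> List[str]:
    """Return the class names whose bit is set."""
    if len(bits) != len(CLASSES):
        raise ValueError("bit vector length must match the number of classes")
    return [name for name, bit in zip(CLASSES, bits) if bit]


def to_int(bits: List[int]) -> int:
    """Pack the bit vector into a single integer (bit i belongs to CLASSES[i])."""
    return sum(bit << i for i, bit in enumerate(bits))


single = encode(["type_a"])             # a lone flake sets exactly one bit
overlap = encode(["type_a", "type_c"])  # an overlap sets the bits of both flakes
```

Under this assumption a per-pixel classifier needs only three independent outputs rather than one output per combination, which is one plausible reason a model trained on non-overlapping classes could still predict several labels at once.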

arXiv:2004.01185 [pdf]

doi 10.1117/86720I-86720I-8

Introducing Anisotropic Minkowski Functionals and Quantitative Anisotropy Measures for Local Structure Analysis in Biomedical Imaging

Authors: Axel Wismueller, Titas De, Eva Lochmueller, Felix Eckstein, Mahesh B. Nagarajan

Abstract: The ability of Minkowski Functionals to characterize local structure in different biological tissue types has been demonstrated in a variety of medical image processing tasks. We introduce anisotropic Minkowski Functionals (AMFs) as a novel variant that captures the inherent anisotropy of the underlying gray-level structures. To quantify the anisotropy characterized by our approach, we further introduce a method to compute a quantitative measure motivated by a technique utilized in MR diffusion tensor imaging, namely fractional anisotropy. We showcase the applicability of our method in the research context of characterizing the local structure properties of trabecular bone micro-architecture in the proximal femur as visualized on multi-detector CT. To this end, AMFs were computed locally for each pixel of ROIs extracted from the head, neck and trochanter regions. Fractional anisotropy was then used to quantify the local anisotropy of the trabecular structures found in these ROIs and to compare its distribution in different anatomical regions. Our results suggest a significantly greater concentration of anisotropic trabecular structures in the head and neck regions when compared to the trochanter region (p < 10^-4). We also evaluated the ability of such AMFs to predict bone strength in the femoral head of proximal femur specimens obtained from 50 donors. Our results suggest that such AMFs, when used in conjunction with multi-regression models, can outperform more conventional features such as BMD in predicting failure load.
We conclude that such anisotropic Minkowski Functionals can capture valuable information regarding directional attributes of local structure, which may be useful in a wide scope of biomedical imaging applications.

Submitted 2 April, 2020; originally announced April 2020.

Comments: SPIE Medical Imaging 2013. arXiv admin note: text overlap with arXiv:2002.07156
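
Illustrative sketch (not from the paper): the abstract above says its anisotropy measure is motivated by fractional anisotropy from MR diffusion tensor imaging, but does not give the AMF-specific formula. The snippet below shows only the standard diffusion-tensor definition for three eigenvalues, FA = sqrt(3/2) * sqrt(sum_i (l_i - mean)^2) / sqrt(sum_i l_i^2). How the paper adapts this to Minkowski Functionals is not stated here, and the function name is a placeholder.

```python
import math
from typing import Sequence


def fractional_anisotropy(eigenvalues: Sequence[float]) -> float:
    """Standard DTI fractional anisotropy for exactly three eigenvalues."""
    if len(eigenvalues) != 3:
        raise ValueError("expected exactly three eigenvalues")
    l1, l2, l3 = eigenvalues
    mean = (l1 + l2 + l3) / 3.0
    # Sum of squared deviations from the mean eigenvalue.
    numerator = (l1 - mean) ** 2 + (l2 - mean) ** 2 + (l3 - mean) ** 2
    # Sum of squared eigenvalues.
    denominator = l1 ** 2 + l2 ** 2 + l3 ** 2
    if denominator == 0.0:
        return 0.0  # all-zero input: treated as isotropic by convention here
    return math.sqrt(1.5 * numerator / denominator)


isotropic = fractional_anisotropy([1.0, 1.0, 1.0])  # equal eigenvalues
linear = fractional_anisotropy([1.0, 0.0, 0.0])     # a single dominant direction
```

The value is 0 when all eigenvalues are equal and approaches 1 when a single direction dominates, which is what makes it usable as a per-location anisotropy score.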

arXiv:2002.07156 [pdf]

Using anisotropic 3D Minkowski functionals for trabecular bone characterization and biomechanical strength prediction in proximal femur specimens

Authors: Mahesh B. Nagarajan, Titas De, Eva-Maria Lochmueller, Felix Eckstein, Axel Wismueller

Abstract: The ability of Anisotropic Minkowski Functionals (AMFs) to capture local anisotropy while evaluating topological properties of the underlying gray-level structures has been previously demonstrated. We evaluate the ability of this approach to characterize local structure properties of trabecular bone micro-architecture in ex vivo proximal femur specimens, as visualized on multi-detector CT, for purposes of biomechanical bone strength prediction. To this end, volumetric AMFs were computed locally for each voxel of volumes of interest (VOI) extracted from the femoral head of 146 specimens. The local anisotropy captured by such AMFs was quantified using a fractional anisotropy measure; the magnitude and direction of anisotropy at every pixel was stored in histograms that served as feature vectors characterizing the VOIs. A linear multi-regression analysis algorithm was used to predict the failure load (FL) from the feature sets; the predicted FL was compared to the true FL determined through biomechanical testing. The prediction performance was measured by the root mean square error (RMSE) for each feature set. The best prediction performance was obtained from the fractional anisotropy histogram of AMF Euler Characteristic (RMSE = 1.01 ± 0.13), which was significantly better than MDCT-derived mean BMD (RMSE = 1.12 ± 0.16, p < 0.05).
We conclude that such anisotropic Minkowski Functionals can capture valuable information regarding regional trabecular bone quality and contribute to improved bone strength prediction, which is important for improving the clinical assessment of osteoporotic fracture risk.

Submitted 16 February, 2020; originally announced February 2020.

Comments: SPIE Medical Imaging Conference 2014
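
Illustrative sketch (not from the paper): the abstract above scores each feature set by the root mean square error between predicted and measured failure load. The snippet below shows only the generic RMSE computation; the function name and the example numbers are placeholders, and the regression model and cross-validation scheme behind the reported values are not reproduced.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], true: Sequence[float]) -> float:
    """Root mean square error between two equally long, non-empty sequences."""
    if len(predicted) != len(true) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predicted, true)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up numbers purely for illustration, not data from the study.
example = rmse([3.0, 4.0], [0.0, 0.0])
```

A lower RMSE means the predicted failure loads sit closer to the biomechanically measured ones, which is the sense in which one feature set is reported as better than another.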

arXiv:1811.00003

Deep Net Features for Complex Emotion Recognition

Authors: Bhalaji Nagarajan, V Ramana Murthy Oruganti

Abstract: This paper investigates the influence of different acoustic features, audio-events based features and automatic speech translation based lexical features in the recognition of complex emotions such as curiosity. Pretrained networks, namely AudioSet Net, VoxCeleb Net and Deep Speech Net, trained extensively for different speech based applications, are studied for this objective. Information from deep layers of these networks is considered as descriptors and encoded into feature vectors. Experimental results on the EmoReact dataset consisting of 8 complex emotions show the effectiveness of the approach, yielding a highest F1 score of 0.85 against the baseline of 0.69 in the literature.

Submitted 2 November, 2018; v1 submitted 31 October, 2018; originally announced November 2018.

Comments: Conflict of interest

arXiv:1810.12613

Deep Learning as Feature Encoding for Emotion Recognition

Authors: Bhalaji Nagarajan, V Ramana Murthy Oruganti

Abstract: Deep learning is popular as an end-to-end framework that both extracts the prominent features and performs the classification. In this paper, we extensively investigate deep networks as an alternative to feature encoding techniques for low level descriptors for emotion recognition on the benchmark EmoDB dataset. Fusion performance of the encoded features obtained in this way with other available features is also investigated. Highest performance to date in the literature is observed.

Submitted 14 November, 2018; v1 submitted 30 October, 2018; originally announced October 2018.

Comments: Issues pertaining with experimental results reported in paper

arXiv:1407.3809 [pdf]

A Framework for Exploring Non-Linear Functional Connectivity and Causality in the Human Brain: Mutual Connectivity Analysis (MCA) of Resting-State Functional MRI with Convergent Cross-Mapping and Non-Metric Clustering

Authors: Axel Wismüller, Xixi Wang, Adora M. DSouza, Mahesh B. Nagarajan

Abstract: We present a computational framework for analysis and visualization of non-linear functional connectivity in the human brain from resting state functional MRI (fMRI) data for purposes of recovering the underlying network community structure and exploring causality between network components. Our proposed methodology of non-linear mutual connectivity analysis (MCA) involves two computational steps. First, the pair-wise cross-prediction performance between resting state fMRI pixel time series within the brain is evaluated. The underlying network structure is subsequently recovered from the affinity matrix constructed through MCA using non-metric network partitioning/clustering with the so-called Louvain method. We demonstrate our methodology in the task of identifying regions of the motor cortex associated with hand movement on resting state fMRI data acquired from eight slice locations in four subjects. For comparison, we also localized regions of the motor cortex through a task-based fMRI sequence involving a finger-tapping stimulus paradigm. Finally, we integrate convergent cross mapping (CCM) into the first step of MCA for investigating causality between regions of the motor cortex. Results regarding causation between regions of the motor cortex revealed a significant directional variability and were not readily interpretable in a consistent manner across all subjects.
However, our results on whole-slice fMRI analysis demonstrate that MCA-based model-free recovery of regions associated with the primary motor cortex and supplementary motor area is in close agreement with localization of similar regions achieved with a task-based fMRI acquisition. Thus, we conclude that our computational framework MCA can extract and visualize valuable information concerning the underlying network structure and causation between different regions of the brain in resting state fMRI.

Submitted 14 July, 2014; originally announced July 2014.

Comments: Axel Wismüller and Mahesh B. Nagarajan contributed equally to the preparation of this manuscript. Pre-publication draft: 18 pages, 6 figures, 1 table

Showing 1–9 of 9 results for author: Nagarajan, B