-
CosmoCLIP: Generalizing Large Vision-Language Models for Astronomical Imaging
Authors:
Raza Imam,
Mohammed Talha Alam,
Umaima Rahman,
Mohsen Guizani,
Fakhri Karray
Abstract:
Existing vision-text contrastive learning models enhance representation transferability and support zero-shot prediction by matching paired image and caption embeddings while pushing unrelated pairs apart. However, astronomical image-label datasets are significantly smaller than the general image and label datasets available from the internet. We introduce CosmoCLIP, an astronomical image-text contrastive learning framework fine-tuned from the pre-trained CLIP model using SpaceNet and BLIP-based captions. SpaceNet, obtained via FLARE, comprises ~13k optimally distributed images, while BLIP acts as a rich knowledge extractor. The rich semantics derived from the SpaceNet images and BLIP descriptions, when learned contrastively, enable CosmoCLIP to achieve superior generalization across various in-domain and out-of-domain tasks. Our results demonstrate that CosmoCLIP is a straightforward yet powerful framework, significantly outperforming CLIP in zero-shot classification and image-text retrieval tasks.
Submitted 9 July, 2024;
originally announced July 2024.
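The contrastive objective summarized in the CosmoCLIP abstract (pull paired image and caption embeddings together, push unrelated pairs apart) can be illustrated with a minimal sketch of a symmetric CLIP-style loss. This is not the authors' code: the function names, the toy embeddings, and the fixed temperature of 0.07 are assumptions chosen for illustration, and a real implementation would use a tensor library and a learned temperature.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over a list of logits."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_sum for x in row]


def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric cross-entropy over an image-text similarity matrix.

    Row i of image_embs is assumed to be paired with row i of text_embs.
    The temperature value is an illustrative assumption.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j] = scaled cosine similarity between image i and caption j
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]
    # Image-to-text direction: each row should put its mass on the diagonal.
    loss_i2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-image direction: same idea, applied to each column.
    loss_t2i = -sum(
        log_softmax([logits[r][j] for r in range(n)])[j] for j in range(n)
    ) / n
    return (loss_i2t + loss_t2i) / 2
```

Calling `clip_contrastive_loss` on correctly paired embeddings yields a loss near zero, while swapping the captions between two images yields a much larger loss, which is the signal that fine-tuning would minimize.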
-
AstroSpy: On detecting Fake Images in Astronomy via Joint Image-Spectral Representations
Authors:
Mohammed Talha Alam,
Raza Imam,
Mohsen Guizani,
Fakhri Karray
Abstract:
The prevalence of AI-generated imagery has raised concerns about the authenticity of astronomical images, especially with advanced text-to-image models like Stable Diffusion producing highly realistic synthetic samples. Existing detection methods, primarily based on convolutional neural networks (CNNs) or spectral analysis, have limitations when used independently. We present AstroSpy, a hybrid model that integrates both spectral and image features to distinguish real from synthetic astronomical images. Trained on a unique dataset of real NASA images and AI-generated fakes (approximately 18k samples), AstroSpy utilizes a dual-pathway architecture to fuse spatial and spectral information. This approach enables AstroSpy to achieve superior performance in identifying authentic astronomical images. Extensive evaluations demonstrate AstroSpy's effectiveness and robustness, significantly outperforming baseline models in both in-domain and cross-domain tasks, highlighting its potential to combat misinformation in astronomy.
Submitted 9 July, 2024;
originally announced July 2024.
-
FLARE up your data: Diffusion-based Augmentation Method in Astronomical Imaging
Authors:
Mohammed Talha Alam,
Raza Imam,
Mohsen Guizani,
Fakhri Karray
Abstract:
The intersection of Astronomy and AI encounters significant challenges related to issues such as noisy backgrounds, lower resolution (LR), and the intricate process of filtering and archiving images from advanced telescopes like the James Webb. Given the dispersion of raw images in feature space, we propose a two-stage augmentation framework named FLARE, based on Feature Learning and Augmented Resolution Enhancement. We first apply lower-resolution (LR) to higher-resolution (HR) conversion followed by standard augmentations. Secondly, we integrate a diffusion approach to synthetically generate samples using class-concatenated prompts. By merging these two stages using weighted percentiles, we realign the feature space distribution, enabling a classification model to establish a distinct decision boundary and achieve superior generalization on various in-domain and out-of-domain tasks. We conducted experiments on several downstream cosmos datasets and on our optimally distributed SpaceNet dataset across 8-class fine-grained and 4-class macro classification tasks. FLARE attains the highest performance gain of 20.78% for fine-grained tasks compared to similar baselines, while across different classification models, FLARE shows a consistent average improvement of +15%. This outcome underscores the effectiveness of the FLARE method in enhancing the precision of image classification, ultimately bolstering the reliability of astronomical research outcomes. Our code and SpaceNet dataset are available at https://github.com/Razaimam45/PlanetX_Dxb.
Submitted 21 May, 2024;
originally announced May 2024.
-
Practically adaptable CPABE based Health-Records sharing framework
Authors:
Raza Imam,
Faisal Anwer
Abstract:
With the recent rise in adoption of cloud services in almost every major public sector, the health sector emerges as a vulnerable segment, particularly in the exchange of sensitive health records. Determining the retention, exchange, and efficient use of patient records without jeopardizing patient privacy, particularly on mobile applications, remains an open area. In existing cloud-mobile service scenarios, several vulnerabilities can be found, the most significant being the trapping of data within a single cloud service provider and the loss of resource control. In this study, we suggest a CPABE- and OAuth2.0-based framework for efficient access control and authorization, respectively, to improve the practicality of EHR sharing across a single client application. In addition to addressing issues like practicality, data entrapment, and resource control loss, the suggested framework also aims to provide two significant functionalities simultaneously: the specific operation of the client application itself, and straightforward access to data for institutions, governments, and organizations seeking sensitive EHRs. Our implementation of the suggested framework, along with its analytical comparison, indicates its potential in terms of efficient performance and minimal latency. The study aims to contribute to the recent literature by bridging the pragmatic deficit in CPABE-based EHR services.
Submitted 10 March, 2024;
originally announced March 2024.
-
Domain Adaptable Fine-Tune Distillation Framework For Advancing Farm Surveillance
Authors:
Raza Imam,
Muhammad Huzaifa,
Nabil Mansour,
Shaher Bano Mirza,
Fouad Lamghari
Abstract:
In this study, we propose an automated framework for camel farm monitoring, introducing two key contributions: the Unified Auto-Annotation framework and the Fine-Tune Distillation framework. The Unified Auto-Annotation approach combines two models, GroundingDINO (GD) and Segment-Anything-Model (SAM), to automatically annotate raw datasets extracted from surveillance videos. Building upon this foundation, the Fine-Tune Distillation framework conducts fine-tuning of student models using the auto-annotated dataset. This process involves transferring knowledge from a large teacher model to a student model, resembling a variant of Knowledge Distillation. The Fine-Tune Distillation framework aims to be adaptable to specific use cases, enabling the transfer of knowledge from the large models to the small models, making it suitable for domain-specific applications. By leveraging our raw dataset collected from Al-Marmoom Camel Farm in Dubai, UAE, and a pre-trained teacher model, GroundingDINO, the Fine-Tune Distillation framework produces a lightweight deployable model, YOLOv8. This framework demonstrates high performance and computational efficiency, facilitating efficient real-time object detection. Our code is available at https://github.com/Razaimam45/Fine-Tune-Distillation.
Submitted 10 February, 2024;
originally announced February 2024.
-
EVOKE: Emotion Enabled Virtual Avatar Mapping Using Optimized Knowledge Distillation
Authors:
Maryam Nadeem,
Raza Imam,
Rouqaiah Al-Refai,
Meriem Chkir,
Mohamad Hoda,
Abdulmotaleb El Saddik
Abstract:
As virtual environments continue to advance, the demand for immersive and emotionally engaging experiences has grown. Addressing this demand, we introduce Emotion enabled Virtual avatar mapping using Optimized KnowledgE distillation (EVOKE), a lightweight emotion recognition framework designed for the seamless integration of emotion recognition into 3D avatars within virtual environments. Our approach leverages knowledge distillation involving multi-label classification on the publicly available DEAP dataset, which covers valence, arousal, and dominance as primary emotional classes. Remarkably, our distilled model, a CNN with only two convolutional layers and 18 times fewer parameters than the teacher model, achieves competitive results, with an accuracy of 87% while demanding far fewer computational resources. This balance between performance and deployability positions our framework as an ideal choice for virtual environment systems. Furthermore, the multi-label classification outcomes are utilized to map emotions onto custom-designed 3D avatars.
Submitted 12 January, 2024;
originally announced January 2024.
-
SEDA: Self-Ensembling ViT with Defensive Distillation and Adversarial Training for robust Chest X-rays Classification
Authors:
Raza Imam,
Ibrahim Almakky,
Salma Alrashdi,
Baketah Alrashdi,
Mohammad Yaqub
Abstract:
Deep Learning methods have recently seen increased adoption in medical imaging applications. However, elevated vulnerabilities have been explored in recent Deep Learning solutions, which can hinder future adoption. In particular, the vulnerability of Vision Transformers (ViTs) to adversarial, privacy, and confidentiality attacks raises serious concerns about their reliability in medical settings. This work aims to enhance the robustness of self-ensembling ViTs for the tuberculosis chest X-ray classification task. We propose Self-Ensembling ViT with defensive Distillation and Adversarial training (SEDA). SEDA utilizes efficient CNN blocks to learn spatial features with various levels of abstraction from feature representations extracted from intermediate ViT blocks, which are largely unaffected by adversarial perturbations. Furthermore, SEDA leverages adversarial training in combination with defensive distillation for improved robustness against adversaries. Training using adversarial examples leads to better model generalizability and improves its ability to handle perturbations. Distillation using soft probabilities introduces uncertainty and variation into the output probabilities, making adversarial and privacy attacks more difficult. Extensive experiments performed with the proposed architecture and training paradigm on a publicly available tuberculosis X-ray dataset show the SOTA efficacy of SEDA compared to SEViT in terms of computational efficiency, with a 70x lighter framework and enhanced robustness of +9%.
Submitted 15 August, 2023;
originally announced August 2023.
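The "distillation using soft probabilities" mentioned in the SEDA abstract can be illustrated with a generic soft-target distillation loss. This is a minimal sketch of the standard temperature-softened formulation, not SEDA's actual training code: the function names, the example logits, and the temperature of 4.0 are assumptions, and the abstract does not state which temperature or loss weighting the authors use.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits divided by a temperature; higher values flatten the output."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence from the teacher's softened distribution to the student's.

    The temperature**2 factor is the conventional scaling that keeps gradient
    magnitudes comparable across temperatures. The default temperature is an
    illustrative assumption.
    """
    p_teacher = softmax_with_temperature(teacher_logits, temperature)
    p_student = softmax_with_temperature(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0
    )
    return (temperature ** 2) * kl
```

Raising the temperature lowers the peak probability of the teacher's output, which is the added "uncertainty and variation" that the abstract refers to; the loss is zero when the student reproduces the teacher's logits and positive otherwise.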
-
Optimizing Brain Tumor Classification: A Comprehensive Study on Transfer Learning and Imbalance Handling in Deep Learning Models
Authors:
Raza Imam,
Mohammed Talha Alam
Abstract:
Deep learning has emerged as a prominent field in recent literature, showcasing the introduction of models that utilize transfer learning to achieve remarkable accuracies in the classification of brain tumor MRI images. However, the majority of these proposals primarily focus on balanced datasets, neglecting the inherent data imbalance present in real-world scenarios. Consequently, there is a pressing need for approaches that not only address the data imbalance but also prioritize precise classification of brain cancer. In this work, we present a novel deep learning-based approach, called Transfer Learning-CNN, for brain tumor classification using MRI data. The proposed model leverages the predictive capabilities of existing publicly available models by utilizing their pre-trained weights and transferring those weights to the CNN. By leveraging a publicly available Brain MRI dataset, the experiment evaluated various transfer learning models for classifying different tumor types, including meningioma, glioma, and pituitary tumors. We investigate the impact of different loss functions, including focal loss, and oversampling methods, such as SMOTE and ADASYN, in addressing the data imbalance issue. Notably, the proposed strategy, which combines VGG-16 and CNN, achieved an impressive accuracy rate of 96%, surpassing alternative approaches significantly.
Submitted 13 August, 2023;
originally announced August 2023.
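The focal loss that the brain tumor classification abstract investigates for class imbalance can be written in a few lines. The sketch below uses the standard per-example form, FL(p) = -alpha * (1 - p)^gamma * log(p), where p is the predicted probability of the true class; the function name and the default values gamma = 2.0 and alpha = 1.0 are assumptions for illustration, since the abstract does not report the settings used.

```python
import math


def focal_loss(prob_true_class, gamma=2.0, alpha=1.0):
    """Focal loss for one example, given the predicted probability of its true class.

    gamma controls how strongly easy (high-probability) examples are
    down-weighted; gamma = 0 reduces to alpha-weighted cross-entropy.
    Default values are illustrative assumptions.
    """
    # Clamp to avoid log(0) for extremely confident wrong predictions.
    p = min(max(prob_true_class, 1e-12), 1.0)
    return -alpha * (1.0 - p) ** gamma * math.log(p)
```

With gamma = 2.0, an example predicted at probability 0.9 contributes 1% of its cross-entropy value, whereas a hard example at probability 0.1 keeps 81% of it, so training focuses on the harder (often minority-class) samples.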
-
Filter Bubbles in Recommender Systems: Fact or Fallacy -- A Systematic Review
Authors:
Qazi Mohammad Areeb,
Mohammad Nadeem,
Shahab Saquib Sohail,
Raza Imam,
Faiyaz Doctor,
Yassine Himeur,
Amir Hussain,
Abbes Amira
Abstract:
A filter bubble refers to the phenomenon where Internet customization effectively isolates individuals from diverse opinions or materials, resulting in their exposure to only a select set of content. This can lead to the reinforcement of existing attitudes, beliefs, or conditions. In this study, our primary focus is to investigate the impact of filter bubbles in recommender systems. This pioneering research aims to uncover the reasons behind this problem, explore potential solutions, and propose an integrated tool to help users avoid filter bubbles in recommender systems. To achieve this objective, we conduct a systematic literature review on the topic of filter bubbles in recommender systems. The reviewed articles are carefully analyzed and classified, providing valuable insights that inform the development of an integrated approach. Notably, our review reveals evidence of filter bubbles in recommendation systems, highlighting several biases that contribute to their existence. Moreover, we propose mechanisms to mitigate the impact of filter bubbles and demonstrate that incorporating diversity into recommendations can potentially help alleviate this issue. The findings of this timely review will serve as a benchmark for researchers working in interdisciplinary fields such as privacy, artificial intelligence ethics, and recommendation systems. Furthermore, it will open new avenues for future research in related domains, prompting further exploration and advancement in this critical area.
Submitted 2 July, 2023;
originally announced July 2023.
-
On enhancing the robustness of Vision Transformers: Defensive Diffusion
Authors:
Raza Imam,
Muhammad Huzaifa,
Mohammed El-Amine Azz
Abstract:
Privacy and confidentiality of medical data are of utmost importance in healthcare settings. ViTs, the SOTA vision models, rely on large amounts of patient data for training, which raises concerns about data security and the potential for unauthorized access. Adversaries may exploit vulnerabilities in ViTs to extract sensitive patient information and compromise patient privacy. This work addresses these vulnerabilities to ensure the trustworthiness and reliability of ViTs in medical applications. We introduce a defensive diffusion technique as an adversarial purifier to eliminate adversarial noise introduced by attackers in the original image. By utilizing the denoising capabilities of the diffusion model, we employ a reverse diffusion process to remove the adversarial noise from the attack sample, resulting in a cleaner image that is then fed into the ViT blocks. Our findings demonstrate the effectiveness of the diffusion model in eliminating attack-agnostic adversarial noise from images. Additionally, we propose combining knowledge distillation with our framework to obtain a lightweight student model that is both computationally efficient and robust against gray-box attacks. A comparison of our method with a SOTA baseline, SEViT, shows that our method outperforms the baseline. Extensive experiments conducted on a publicly available tuberculosis X-ray dataset validate the computational efficiency and improved robustness achieved by our proposed architecture.
Submitted 13 May, 2023;
originally announced May 2023.