Search | arXiv e-print repository

arXiv:2406.13129 [pdf, other]

M3T: Multi-Modal Medical Transformer to bridge Clinical Context with Visual Insights for Retinal Image Medical Description Generation

Authors: Nagur Shareef Shaik, Teja Krishna Cherukuri, Dong Hye Ye

Abstract: Automated retinal image medical description generation is crucial for streamlining medical diagnosis and treatment planning. Existing challenges include the reliance on learned retinal image representations, difficulties in handling multiple imaging modalities, and the lack of clinical context in visual representations. Addressing these issues, we propose the Multi-Modal Medical Transformer (M3T), a novel deep learning architecture that integrates visual representations with diagnostic keywords. Unlike previous studies focusing on specific aspects, our approach efficiently learns contextual information and semantics from both modalities, enabling the generation of precise and coherent medical descriptions for retinal images. Experimental studies on the DeepEyeNet dataset validate the success of M3T in meeting ophthalmologists' standards, demonstrating a substantial 13.5% improvement in BLEU@4 over the best-performing baseline model.

Submitted 18 June, 2024; originally announced June 2024.

Comments: This paper has been accepted for presentation at the IEEE International Conference on Image Processing (ICIP 2024)

arXiv:2406.13126 [pdf, other]

Guided Context Gating: Learning to leverage salient lesions in retinal fundus images

Authors: Teja Krishna Cherukuri, Nagur Shareef Shaik, Dong Hye Ye

Abstract: Effectively representing medical images, especially retinal images, presents a considerable challenge due to variations in appearance, size, and contextual information of pathological signs called lesions. Precise discrimination of these lesions is crucial for diagnosing vision-threatening issues such as diabetic retinopathy. While visual attention-based neural networks have been introduced to learn spatial context and channel correlations from retinal images, they often fall short in capturing localized lesion context. Addressing this limitation, we propose a novel attention mechanism called Guided Context Gating, a unique approach that integrates Context Formulation, Channel Correlation, and Guided Gating to learn global context, spatial correlations, and localized lesion context. Our qualitative evaluation against existing attention mechanisms emphasizes the superiority of Guided Context Gating in terms of explainability. Notably, experiments on the Zenodo-DR-7 dataset reveal a substantial 2.63% accuracy boost over advanced attention mechanisms and an impressive 6.53% improvement over the state-of-the-art Vision Transformer for assessing the severity grade of retinopathy, even with imbalanced and limited training samples for each class.

Submitted 18 June, 2024; originally announced June 2024.

Comments: This paper has been accepted for presentation at the IEEE International Conference on Image Processing (ICIP 2024)

arXiv:2406.12683 [pdf, other]

Spatial Sequence Attention Network for Schizophrenia Classification from Structural Brain MR Images

Authors: Nagur Shareef Shaik, Teja Krishna Cherukuri, Vince Calhoun, Dong Hye Ye

Abstract: Schizophrenia is a debilitating, chronic mental disorder that significantly impacts an individual's cognitive abilities, behavior, and social interactions. It is characterized by subtle morphological changes in the brain, particularly in the gray matter. These changes are often imperceptible through manual observation, demanding an automated approach to diagnosis. This study introduces a deep learning methodology for the classification of individuals with Schizophrenia. We achieve this by implementing a diversified attention mechanism known as Spatial Sequence Attention (SSA), which is designed to extract and emphasize significant feature representations from structural MRI (sMRI). Initially, we employ the transfer learning paradigm by leveraging a pre-trained DenseNet to extract initial feature maps from the final convolutional block, which contains morphological alterations associated with Schizophrenia. These features are further processed by the proposed SSA to capture and emphasize intricate spatial interactions and relationships across volumes within the brain. Our experimental studies conducted on a clinical dataset have revealed that the proposed attention mechanism outperforms the existing Squeeze & Excitation Network for Schizophrenia classification.

Submitted 18 June, 2024; originally announced June 2024.

Comments: This paper has been accepted for the 21st IEEE International Symposium on Biomedical Imaging (ISBI 2024)

arXiv:2309.16263 [pdf]

Cooperation Dynamics in Multi-Agent Systems: Exploring Game-Theoretic Scenarios with Mean-Field Equilibria

Authors: Vaigarai Sathi, Sabahat Shaik, Jaswanth Nidamanuri

Abstract: Cooperation is fundamental in Multi-Agent Systems (MAS) and Multi-Agent Reinforcement Learning (MARL), often requiring agents to balance individual gains with collective rewards. In this regard, this paper aims to investigate strategies to invoke cooperation in game-theoretic scenarios, namely the Iterated Prisoner's Dilemma, where agents must optimize both individual and group outcomes. Existing cooperative strategies are analyzed for their effectiveness in promoting group-oriented behavior in repeated games. Modifications are proposed where encouraging group rewards will also result in a higher individual gain, addressing real-world dilemmas seen in distributed systems. The study extends to scenarios with exponentially growing agent populations ($N \longrightarrow +\infty$), where traditional computation and equilibrium determination are challenging. Leveraging mean-field game theory, equilibrium solutions and reward structures are established for infinitely large agent sets in repeated games. Finally, practical insights are offered through simulations using the Multi Agent-Posthumous Credit Assignment trainer, and the paper explores adapting simulation algorithms to create scenarios favoring cooperation for group rewards. These practical implementations bridge theoretical concepts with real-world applications.

Submitted 3 May, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

Comments: Accepted for MADGames: Multi-Agent Dynamic Games Workshop at IROS 2023, see details at https://iros2023-madgames.f1tenth.org/proceedings.html

arXiv:2205.00880 [pdf]

The Application of Energy and Laplacian Energy of Hesitancy Fuzzy Graph Based on Similarity Measures in Decision Making Problems

Authors: Rajagopal Reddy N, Sharief Basha Shaik

Abstract: In this article, a new hesitancy fuzzy similarity measure is defined and then used to develop the matrix of hesitancy fuzzy similarity measures, which is subsequently used to classify hesitancy fuzzy graphs using the working procedure. We build a working procedure (algorithm) for estimating the eligible reputation score values of experts by applying hesitancy fuzzy preference relationships (HFPRs) and the usual similarity degree of each HFPR to the others. As the last step, we provide real-time numerical examples to demonstrate and validate our working procedure.

Submitted 28 April, 2022; originally announced May 2022.

arXiv:2012.04153 [pdf, other]

Learning Portrait Style Representations

Authors: Sadat Shaik, Bernadette Bucher, Nephele Agrafiotis, Stephen Phillips, Kostas Daniilidis, William Schmenner

Abstract: Style analysis of artwork in computer vision predominantly focuses on achieving results in target image generation through optimizing understanding of low level style characteristics such as brush strokes. However, fundamentally different techniques are required to computationally understand and control qualities of art which incorporate higher level style characteristics. We study style representations learned by neural network architectures incorporating these higher level characteristics. We find variation in learned style features from incorporating triplets annotated by art historians as supervision for style similarity. Networks leveraging statistical priors or pretrained on photo collections such as ImageNet can also derive useful visual representations of artwork. We align the impact of these expert human knowledge, statistical, and photo realism priors on style representations with art historical research and use these representations to perform zero-shot classification of artists. To facilitate this work, we also present the first large-scale dataset of portraits prepared for computational analysis.

Submitted 7 December, 2020; originally announced December 2020.

Comments: Sadat Shaik and Bernadette Bucher contributed equally

arXiv:1504.01139 [pdf]

Challenges in transforming, engaging and improving m-learning in Higher Educational Institutions: Oman perspective

Authors: Ramkumar Lakshminarayanan, Rajasekar Ramalingam, Shimaz Khan Shaik

Abstract: Nowadays, the student community is growing up with mobile devices, which have become an integral part of their lives. Devices such as smartphones, tablets, and e-book readers let users access information and communicate instantly with others. The enormous growth and affordability of mobile devices have influenced students' learning practices, and mobile technologies are playing a significant role in their academic activities. Factors such as convenience, flexibility, engagement, interactivity, and ease of use make mobile learning more attractive to students. With these trends in mind, it is important for educators to incorporate mobile technologies into effective teaching and learning. Our study explores the challenges that exist in implementing m-learning technologies in the teaching and learning practices of higher educational institutions in Oman. Our study also addresses various issues such as adoption of technology, transition to new technology, and issues related to engaging students. Based on the outcomes of the study, a framework has been formulated to address all the challenges identified for the successful implementation of m-learning.

Submitted 5 April, 2015; originally announced April 2015.

Comments: The Third International Conference of Educational Technology 24-26 March 2015

Showing 1–7 of 7 results for author: Shaik, S