SemioLLM: Assessing Large Language Models for Semiological Analysis in Epilepsy Research

Authors: Meghal Dani, Muthu Jeyanthi Prakash, Zeynep Akata, Stefanie Liebe

Abstract: Large Language Models have shown promising results in their ability to encode general medical knowledge in standard medical question-answering datasets. However, their potential application in clinical practice requires evaluation in domain-specific tasks, where benchmarks are largely missing. In this study semioLLM, we test the ability of state-of-the-art LLMs (GPT-3.5, GPT-4, Mixtral 8x7B, and Qwen-72chat) to leverage their internal knowledge and reasoning for epilepsy diagnosis. Specifically, we obtain likelihood estimates linking unstructured text descriptions of seizures to seizure-generating brain regions, using an annotated clinical database containing 1269 entries. We evaluate the LLM's performance, confidence, reasoning, and citation abilities in comparison to clinical evaluation. Models achieve above-chance classification performance with prompt engineering significantly improving their outcome, with some models achieving close-to-clinical performance and reasoning. However, our analyses also reveal significant pitfalls with several models being overly confident while showing poor performance, as well as exhibiting citation errors and hallucinations. In summary, our work provides the first extensive benchmark comparing current SOTA LLMs in the medical domain of epilepsy and highlights their ability to leverage unstructured texts from patients' medical history to aid diagnostic processes in health care.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.01049 [pdf, other]

SE(3)-Hyena Operator for Scalable Equivariant Learning

Authors: Artem Moskalev, Mangal Prakash, Rui Liao, Tommaso Mansi

Abstract: Modeling global geometric context while maintaining equivariance is crucial for accurate predictions in many fields such as biology, chemistry, or vision. Yet, this is challenging due to the computational demands of processing high-dimensional data at scale. Existing approaches such as equivariant self-attention or distance-based message passing suffer from quadratic complexity with respect to sequence length, while localized methods sacrifice global information. Inspired by the recent success of state-space and long-convolutional models, in this work, we introduce the SE(3)-Hyena operator, an equivariant long-convolutional model based on the Hyena operator. The SE(3)-Hyena captures global geometric context at sub-quadratic complexity while maintaining equivariance to rotations and translations. Evaluated on equivariant associative recall and n-body modeling, SE(3)-Hyena matches or outperforms equivariant self-attention while requiring significantly less memory and computational resources for long sequences. Our model processes the geometric context of 20k tokens 3.5 times faster than the equivariant transformer and allows a 175 times longer context within the same memory budget.

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2311.13878 [pdf, other]

Minimizing Factual Inconsistency and Hallucination in Large Language Models

Authors: Muneeswaran I, Shreya Saxena, Siva Prasad, M V Sai Prakash, Advaith Shankar, Varun V, Vishal Vaddina, Saisubramaniam Gopalakrishnan

Abstract: Large Language Models (LLMs) are widely used in critical fields such as healthcare, education, and finance due to their remarkable proficiency in various language-related tasks. However, LLMs are prone to generating factually incorrect responses or "hallucinations," which can lead to a loss of credibility and trust among users. To address this issue, we propose a multi-stage framework that generates the rationale first, verifies and refines incorrect ones, and uses them as supporting references to generate the answer. The generated rationale enhances the transparency of the answer and our framework provides insights into how the model arrived at this answer, by using this rationale and the references to the context. In this paper, we demonstrate its effectiveness in improving the quality of responses to drug-related inquiries in the life sciences industry. Our framework improves traditional Retrieval Augmented Generation (RAG) by enabling OpenAI GPT-3.5-turbo to be 14-25% more faithful and 16-22% more accurate on two datasets. Furthermore, fine-tuning samples based on our framework improves the accuracy of smaller open-access LLMs by 33-42% and competes with RAG on commercial models.

Submitted 23 November, 2023; originally announced November 2023.

arXiv:2310.03027 [pdf, other]

Synergistic Fusion of Graph and Transformer Features for Enhanced Molecular Property Prediction

Authors: M V Sai Prakash, Siddartha Reddy N, Ganesh Parab, Varun V, Vishal Vaddina, Saisubramaniam Gopalakrishnan

Abstract: Molecular property prediction is a critical task in computational drug discovery. While recent advances in Graph Neural Networks (GNNs) and Transformers have shown to be effective and promising, they face the following limitations: Transformer self-attention does not explicitly consider the underlying molecule structure while GNN feature representation alone is not sufficient to capture granular and hidden interactions and characteristics that distinguish similar molecules. To address these limitations, we propose SYN-FUSION, a novel approach that synergistically combines pre-trained features from GNNs and Transformers. This approach provides a comprehensive molecular representation, capturing both the global molecule structure and the individual atom characteristics. Experimental results on MoleculeNet benchmarks demonstrate superior performance, surpassing previous models in 5 out of 7 classification datasets and 4 out of 6 regression datasets. The performance of SYN-FUSION has been compared with other Graph-Transformer models that have been jointly trained using a combination of transformer and graph features, and it is found that our approach is on par with those models in terms of performance. Extensive analysis of the learned fusion model across aspects such as loss, latent space, and weight distribution further validates the effectiveness of SYN-FUSION. Finally, an ablation study unequivocally demonstrates that the synergy achieved by SYN-FUSION surpasses the performance of its individual model components and their ensemble, offering a substantial improvement in predicting molecular properties.

Submitted 25 August, 2023; originally announced October 2023.

arXiv:2303.03766 [pdf, other]

Benchmarking and Security Considerations of Wi-Fi FTM for Ranging in IoT Devices

Authors: Govind Singh, Anshul Pandey, Monika Prakash, Martin Andreoni, Michael Baddeley

Abstract: The IEEE 802.11mc standard introduces fine time measurement (Wi-Fi FTM), allowing high-precision synchronization between peers and round-trip time calculation (Wi-Fi RTT) for location estimation - typically with a precision of one to two meters. This has considerable advantages over received signal strength (RSS)-based trilateration, which is prone to errors due to multipath reflections. We examine different commercial radios which support Wi-Fi RTT and benchmark Wi-Fi FTM ranging over different spectrums and bandwidths. Importantly, we find that while Wi-Fi FTM supports localization accuracy to within one to two meters in ideal conditions during outdoor line-of-sight experiments, for indoor environments at short ranges similar accuracy was only achievable on chipsets supporting Wi-Fi FTM on wider (VHT80) channel bandwidths rather than narrower (HT20) channel bandwidths. Finally, we explore the security implications of Wi-Fi FTM and use an on-air sniffer to demonstrate that Wi-Fi FTM messages are unprotected. We consequently propose a threat model with possible mitigations and directions for further research.

Submitted 7 March, 2023; originally announced March 2023.

arXiv:2109.10266 [pdf, other]

Comparison of single and multitask learning for predicting cognitive decline based on MRI data

Authors: Vandad Imani, Mithilesh Prakash, Marzieh Zare, Jussi Tohka

Abstract: The Alzheimer's Disease Assessment Scale-Cognitive subscale (ADAS-Cog) is a neuropsychological tool that has been designed to assess the severity of cognitive symptoms of dementia. Personalized prediction of the changes in ADAS-Cog scores could help in timing therapeutic interventions in dementia and at-risk populations. In the present work, we compared single and multitask learning approaches to predict the changes in ADAS-Cog scores based on T1-weighted anatomical magnetic resonance imaging (MRI). In contrast to most machine learning-based prediction methods for ADAS-Cog changes, we stratified the subjects based on their baseline diagnoses and evaluated the prediction performances in each group. Our experiments indicated a positive relationship between the predicted and observed ADAS-Cog score changes in each diagnostic group, suggesting that T1-weighted MRI has a predictive value for evaluating cognitive decline in the entire AD continuum. We further studied whether correction of the differences in the magnetic field strength of MRI would improve the ADAS-Cog score prediction. The partial least square-based domain adaptation slightly improved the prediction performance, but the improvement was marginal. In summary, this study demonstrated that ADAS-Cog change could be, to some extent, predicted based on anatomical MRI. Based on this study, the recommended method for learning the predictive models is a single-task regularized linear regression due to its simplicity and good performance. It appears important to combine the training data across all subject groups for the most effective predictive models.

Submitted 21 September, 2021; originally announced September 2021.

arXiv:2104.01374 [pdf, other]

Interpretable Unsupervised Diversity Denoising and Artefact Removal

Authors: Mangal Prakash, Mauricio Delbracio, Peyman Milanfar, Florian Jug

Abstract: Image denoising and artefact removal are complex inverse problems admitting multiple valid solutions. Unsupervised diversity restoration, that is, obtaining a diverse set of possible restorations given a corrupted image, is important for ambiguity removal in many applications such as microscopy where paired data for supervised training are often unobtainable. In real world applications, imaging noise and artefacts are typically hard to model, leading to unsatisfactory performance of existing unsupervised approaches. This work presents an interpretable approach for unsupervised and diverse image restoration. To this end, we introduce a capable architecture called Hierarchical DivNoising (HDN) based on hierarchical Variational Autoencoder. We show that HDN learns an interpretable multi-scale representation of artefacts and we leverage this interpretability to remove imaging artefacts commonly occurring in microscopy data. Our method achieves state-of-the-art results on twelve benchmark image denoising datasets while providing access to a whole distribution of sensibly restored solutions. Additionally, we demonstrate on three real microscopy datasets that HDN removes artefacts without supervision, being the first method capable of doing so while generating multiple plausible restorations all consistent with the given corrupted image.

Submitted 24 February, 2022; v1 submitted 3 April, 2021; originally announced April 2021.

arXiv:2102.01530 [pdf, other]

doi 10.3390/jimaging7040066

Transfer Learning in Magnetic Resonance Brain Imaging: a Systematic Review

Authors: Juan Miguel Valverde, Vandad Imani, Ali Abdollahzadeh, Riccardo De Feo, Mithilesh Prakash, Robert Ciszek, Jussi Tohka

Abstract: Transfer learning refers to machine learning techniques that focus on acquiring knowledge from related tasks to improve generalization in the tasks of interest. In MRI, transfer learning is important for developing strategies that address the variation in MR images. Additionally, transfer learning is beneficial to re-utilize machine learning models that were trained to solve related tasks to the task of interest. Our goal is to identify research directions, gaps of knowledge, applications, and widely used strategies among the transfer learning approaches applied in MR brain imaging. We performed a systematic literature search for articles that applied transfer learning to MR brain imaging. We screened 433 studies and we categorized and extracted relevant information, including task type, application, and machine learning methods. Furthermore, we closely examined brain MRI-specific transfer learning approaches and other methods that tackled privacy, unseen target domains, and unlabeled data. We found 129 articles that applied transfer learning to brain MRI tasks. The most frequent applications were dementia related classification tasks and brain tumor segmentation. A majority of articles utilized transfer learning on convolutional neural networks (CNNs). Only few approaches were clearly brain MRI specific, considered privacy issues, unseen target domains or unlabeled data. We proposed a new categorization to group specific, widely-used approaches. There is an increasing interest in transfer learning within brain MRI. Public datasets have contributed to the popularity of Alzheimer's diagnostics/prognostics and tumor segmentation. Likewise, the availability of pretrained CNNs has promoted their utilization. Finally, the majority of the surveyed studies did not examine in detail the interpretation of their strategies after applying transfer learning, and did not compare to other approaches.

Submitted 1 April, 2021; v1 submitted 2 February, 2021; originally announced February 2021.

Comments: Accepted in Journal of Imaging

arXiv:2006.06072 [pdf, other]

Fully Unsupervised Diversity Denoising with Convolutional Variational Autoencoders

Authors: Mangal Prakash, Alexander Krull, Florian Jug

Abstract: Deep Learning based methods have emerged as the indisputable leaders for virtually all image restoration tasks. Especially in the domain of microscopy images, various content-aware image restoration (CARE) approaches are now used to improve the interpretability of acquired data. Naturally, there are limitations to what can be restored in corrupted images, and like for all inverse problems, many potential solutions exist, and one of them must be chosen. Here, we propose DivNoising, a denoising approach based on fully convolutional variational autoencoders (VAEs), overcoming the problem of having to choose a single solution by predicting a whole distribution of denoised images. First we introduce a principled way of formulating the unsupervised denoising problem within the VAE framework by explicitly incorporating imaging noise models into the decoder. Our approach is fully unsupervised, only requiring noisy images and a suitable description of the imaging noise distribution. We show that such a noise model can either be measured, bootstrapped from noisy data, or co-learned during training. If desired, consensus predictions can be inferred from a set of DivNoising predictions, leading to competitive results with other unsupervised methods and, on occasion, even with the supervised state-of-the-art. DivNoising samples from the posterior enable a plethora of useful applications. We are (i) showing denoising results for 13 datasets, (ii) discussing how optical character recognition (OCR) applications can benefit from diverse predictions, and are (iii) demonstrating how instance cell segmentation improves when using diverse DivNoising predictions.

Submitted 1 March, 2021; v1 submitted 10 June, 2020; originally announced June 2020.

Comments: 44 pages including supplement

arXiv:2005.02987 [pdf, other]

DenoiSeg: Joint Denoising and Segmentation

Authors: Tim-Oliver Buchholz, Mangal Prakash, Alexander Krull, Florian Jug

Abstract: Microscopy image analysis often requires the segmentation of objects, but training data for this task is typically scarce and hard to obtain. Here we propose DenoiSeg, a new method that can be trained end-to-end on only a few annotated ground truth segmentations. We achieve this by extending Noise2Void, a self-supervised denoising scheme that can be trained on noisy images alone, to also predict dense 3-class segmentations. The reason for the success of our method is that segmentation can profit from denoising, especially when performed jointly within the same network. The network becomes a denoising expert by seeing all available raw data, while co-learning to segment, even if only a few segmentation labels are available. This hypothesis is additionally fueled by our observation that the best segmentation results on high quality (very low noise) raw data are obtained when moderate amounts of synthetic noise are added. This renders the denoising-task non-trivial and unleashes the desired co-learning effect. We believe that DenoiSeg offers a viable way to circumvent the tremendous hunger for high quality training data and effectively enables few-shot learning of dense segmentations.

Submitted 10 June, 2020; v1 submitted 6 May, 2020; originally announced May 2020.

Comments: 10 pages, 4 figures, 2 pages supplement (4 figures)

arXiv:2004.06375 [pdf, other]

A Primal-Dual Solver for Large-Scale Tracking-by-Assignment

Authors: Stefan Haller, Mangal Prakash, Lisa Hutschenreiter, Tobias Pietzsch, Carsten Rother, Florian Jug, Paul Swoboda, Bogdan Savchynskyy

Abstract: We propose a fast approximate solver for the combinatorial problem known as tracking-by-assignment, which we apply to cell tracking. The latter plays a key role in discovery in many life sciences, especially in cell and developmental biology. So far, in the most general setting this problem was addressed by off-the-shelf solvers like Gurobi, whose run time and memory requirements rapidly grow with the size of the input. In contrast, for our method this growth is nearly linear. Our contribution consists of a new (1) decomposable compact representation of the problem; (2) dual block-coordinate ascent method for optimizing the decomposition-based dual; and (3) primal heuristics that reconstructs a feasible integer solution based on the dual information. Compared to solving the problem with Gurobi, we observe an up to 60 times speed-up, while reducing the memory footprint significantly. We demonstrate the efficacy of our method on real-world tracking problems.

Submitted 14 April, 2020; originally announced April 2020.

Comments: 23rd International Conference on Artificial Intelligence and Statistics (AISTATS), 2020

arXiv:1911.12291 [pdf, other]

Fully Unsupervised Probabilistic Noise2Void

Authors: Mangal Prakash, Manan Lalit, Pavel Tomancak, Alexander Krull, Florian Jug

Abstract: Image denoising is the first step in many biomedical image analysis pipelines and Deep Learning (DL) based methods are currently best performing. A new category of DL methods such as Noise2Void or Noise2Self can be used fully unsupervised, requiring nothing but the noisy data. However, this comes at the price of reduced reconstruction quality. The recently proposed Probabilistic Noise2Void (PN2V) improves results, but requires an additional noise model for which calibration data needs to be acquired. Here, we present improvements to PN2V that (i) replace histogram based noise models by parametric noise models, and (ii) show how suitable noise models can be created even in the absence of calibration data. This is a major step since it actually renders PN2V fully unsupervised. We demonstrate that all proposed improvements are not only academic but indeed relevant.

Submitted 19 March, 2020; v1 submitted 27 November, 2019; originally announced November 2019.

Comments: Accepted at ISBI 2020

arXiv:1911.12239 [pdf, other]

Leveraging Self-supervised Denoising for Image Segmentation

Authors: Mangal Prakash, Tim-Oliver Buchholz, Manan Lalit, Pavel Tomancak, Florian Jug, Alexander Krull

Abstract: Deep learning (DL) has arguably emerged as the method of choice for the detection and segmentation of biological structures in microscopy images. However, DL typically needs copious amounts of annotated training data that is for biomedical projects typically not available and excessively expensive to generate. Additionally, tasks become harder in the presence of noise, requiring even more high-quality training data. Hence, we propose to use denoising networks to improve the performance of other DL-based image segmentation methods. More specifically, we present ideas on how state-of-the-art self-supervised CARE networks can improve cell/nuclei segmentation in microscopy data. Using two state-of-the-art baseline methods, U-Net and StarDist, we show that our ideas consistently improve the quality of resulting segmentations, especially when only limited training data for noisy micrographs are available.

Submitted 19 March, 2020; v1 submitted 27 November, 2019; originally announced November 2019.

Comments: accepted at ISBI 2020

arXiv:1312.4634 [pdf]

Implementation of WSN which can simultaneously monitor Temperature conditions and control robot for positional accuracy

Authors: Sharul Agrawal, Mr. Ravi Prakash, Prof. Zunnun Narmawala

Abstract: Sensor networks and robots are both quickly evolving fields, and the union of the two seems inherently symbiotic. Collecting data from stationary sensors can be a time-consuming task and thus can be automated by adding wireless communication capabilities to the sensors. This proposed project takes advantage of wireless sensor networks in a remote handling environment, which can send signals over long distances using a mesh topology, transfer data wirelessly, and consume little power. In this paper, a testbed is created for a wireless sensor network using custom-built sensor nodes for temperature monitoring in labs and for controlling a robot moving in another lab. Each of the two temperature sensor nodes consists of an Arduino microcontroller and an XBee wireless communication module based on the IEEE 802.15.4 standard, while the robot has a built-in FPGA board as its processing unit, with an XBee module connected via an RS-232 cable for serial communication between the ZigBee device and the FPGA. A simple custom packet is designed so that uniformity is maintained while collecting data from the temperature nodes and the moving robot and passing it to a remote terminal. The coordinator ZigBee is connected to the remote terminal (PC) through its USB port, where a graphical user interface (GUI) can be run to monitor temperature readings and the robot's position dynamically and save those readings in a database.

Submitted 16 December, 2013; originally announced December 2013.

Showing 1–14 of 14 results for author: Prakash, M