-
gnss_lib_py: Analyzing GNSS Data with Python
Authors:
Derek Knowles,
Ashwin Vivek Kanhere,
Daniel Neamati,
Grace Gao
Abstract:
This paper presents gnss_lib_py, a Python library used to parse, analyze, and visualize data from a variety of GNSS (Global Navigation Satellite Systems) data sources. The gnss_lib_py library's ease of use, modular capabilities, testing coverage, and extensive documentation make it an attractive tool not only for scientific and industry users wanting a quick, out-of-the-box solution but also for a…
▽ More
This paper presents gnss_lib_py, a Python library used to parse, analyze, and visualize data from a variety of GNSS (Global Navigation Satellite Systems) data sources. The gnss_lib_py library's ease of use, modular capabilities, testing coverage, and extensive documentation make it an attractive tool not only for scientific and industry users wanting a quick, out-of-the-box solution but also for advanced GNSS users develo** new GNSS algorithms. gnss_lib_py has already demonstrated its usefulness and impact through presentation in academic conferences, use in research papers, and adoption in graduate-level university course curricula.
△ Less
Submitted 12 April, 2024;
originally announced April 2024.
-
Improving Multi-Center Generalizability of GAN-Based Fat Suppression using Federated Learning
Authors:
Pranav Kulkarni,
Adway Kanhere,
Harshita Kukreja,
Vivian Zhang,
Paul H. Yi,
Vishwa S. Parekh
Abstract:
Generative Adversarial Network (GAN)-based synthesis of fat suppressed (FS) MRIs from non-FS proton density sequences has the potential to accelerate acquisition of knee MRIs. However, GANs trained on single-site data have poor generalizability to external data. We show that federated learning can improve multi-center generalizability of GANs for synthesizing FS MRIs, while facilitating privacy-pr…
▽ More
Generative Adversarial Network (GAN)-based synthesis of fat suppressed (FS) MRIs from non-FS proton density sequences has the potential to accelerate acquisition of knee MRIs. However, GANs trained on single-site data have poor generalizability to external data. We show that federated learning can improve multi-center generalizability of GANs for synthesizing FS MRIs, while facilitating privacy-preserving multi-institutional collaborations.
△ Less
Submitted 10 April, 2024;
originally announced April 2024.
-
Anytime, Anywhere, Anyone: Investigating the Feasibility of Segment Anything Model for Crowd-Sourcing Medical Image Annotations
Authors:
Pranav Kulkarni,
Adway Kanhere,
Dharmam Savani,
Andrew Chan,
Devina Chatterjee,
Paul H. Yi,
Vishwa S. Parekh
Abstract:
Curating annotations for medical image segmentation is a labor-intensive and time-consuming task that requires domain expertise, resulting in "narrowly" focused deep learning (DL) models with limited translational utility. Recently, foundation models like the Segment Anything Model (SAM) have revolutionized semantic segmentation with exceptional zero-shot generalizability across various domains, i…
▽ More
Curating annotations for medical image segmentation is a labor-intensive and time-consuming task that requires domain expertise, resulting in "narrowly" focused deep learning (DL) models with limited translational utility. Recently, foundation models like the Segment Anything Model (SAM) have revolutionized semantic segmentation with exceptional zero-shot generalizability across various domains, including medical imaging, and hold a lot of promise for streamlining the annotation process. However, SAM has yet to be evaluated in a crowd-sourced setting to curate annotations for training 3D DL segmentation models. In this work, we explore the potential of SAM for crowd-sourcing "sparse" annotations from non-experts to generate "dense" segmentation masks for training 3D nnU-Net models, a state-of-the-art DL segmentation model. Our results indicate that while SAM-generated annotations exhibit high mean Dice scores compared to ground-truth annotations, nnU-Net models trained on SAM-generated annotations perform significantly worse than nnU-Net models trained on ground-truth annotations ($p<0.001$, all).
△ Less
Submitted 22 March, 2024;
originally announced March 2024.
-
Spoofing-Resilient LiDAR-GPS Factor Graph Localization with Chimera Authentication
Authors:
Adam Dai,
Tara Minda,
Ashwin Kanhere,
Grace Gao
Abstract:
Many vehicle platforms typically use sensors such as LiDAR or camera for locally-referenced navigation with GPS for globally-referenced navigation. However, due to the unencrypted nature of GPS signals, all civilian users are vulner-able to spoofing attacks, where a malicious spoofer broadcasts fabricated signals and causes the user to track a false position fix. To protect against such GPS spoofi…
▽ More
Many vehicle platforms typically use sensors such as LiDAR or camera for locally-referenced navigation with GPS for globally-referenced navigation. However, due to the unencrypted nature of GPS signals, all civilian users are vulner-able to spoofing attacks, where a malicious spoofer broadcasts fabricated signals and causes the user to track a false position fix. To protect against such GPS spoofing attacks, Chips-Message Robust Authentication (Chimera) has been developed and will be tested on the Navigation Technology Satellite 3 (NTS-3) satellite being launched later this year. However, Chimera authentication is not continuously available and may not provide sufficient protection for vehicles which rely on more frequent GPS measurements. In this paper, we propose a factor graph-based state estimation framework which integrates LiDAR and GPS while simultaneously detecting and mitigating spoofing attacks experienced between consecutive Chimera authentications. Our proposed framework combines GPS pseudorange measurements with LiDAR odometry to provide a robust navigation solution. A chi-squared detector, based on pseudorange residuals, is used to detect and mitigate any potential GPS spoofing attacks. We evaluate our method using real-world LiDAR data from the KITTI dataset and simulated GPS measurements, both nominal and with spoofing. Across multiple trajectories and Monte Carlo runs, our method consistently achieves position errors under 5 m during nominal conditions, and successfully bounds positioning error to within odometry drift levels during spoofed conditions.
△ Less
Submitted 10 July, 2023;
originally announced July 2023.
-
One Copy Is All You Need: Resource-Efficient Streaming of Medical Imaging Data at Scale
Authors:
Pranav Kulkarni,
Adway Kanhere,
Eliot Siegel,
Paul H. Yi,
Vishwa S. Parekh
Abstract:
Large-scale medical imaging datasets have accelerated development of artificial intelligence tools for clinical decision support. However, the large size of these datasets is a bottleneck for users with limited storage and bandwidth. Many users may not even require such large datasets as AI models are often trained on lower resolution images. If users could directly download at their desired resol…
▽ More
Large-scale medical imaging datasets have accelerated development of artificial intelligence tools for clinical decision support. However, the large size of these datasets is a bottleneck for users with limited storage and bandwidth. Many users may not even require such large datasets as AI models are often trained on lower resolution images. If users could directly download at their desired resolution, storage and bandwidth requirements would significantly decrease. However, it is impossible to anticipate every users' requirements and impractical to store the data at multiple resolutions. What if we could store images at a single resolution but send them at different ones? We propose MIST, an open-source framework to operationalize progressive resolution for streaming medical images at multiple resolutions from a single high-resolution copy. We demonstrate that MIST can dramatically reduce imaging infrastructure inefficiencies for hosting and streaming medical images by >90%, while maintaining diagnostic quality for deep learning applications.
△ Less
Submitted 1 July, 2023;
originally announced July 2023.
-
ISLE: An Intelligent Streaming Framework for High-Throughput AI Inference in Medical Imaging
Authors:
Pranav Kulkarni,
Sean Garin,
Adway Kanhere,
Eliot Siegel,
Paul H. Yi,
Vishwa S. Parekh
Abstract:
As the adoption of Artificial Intelligence (AI) systems within the clinical environment grows, limitations in bandwidth and compute can create communication bottlenecks when streaming imaging data, leading to delays in patient care and increased cost. As such, healthcare providers and AI vendors will require greater computational infrastructure, therefore dramatically increasing costs. To that end…
▽ More
As the adoption of Artificial Intelligence (AI) systems within the clinical environment grows, limitations in bandwidth and compute can create communication bottlenecks when streaming imaging data, leading to delays in patient care and increased cost. As such, healthcare providers and AI vendors will require greater computational infrastructure, therefore dramatically increasing costs. To that end, we developed ISLE, an intelligent streaming framework for high-throughput, compute- and bandwidth- optimized, and cost effective AI inference for clinical decision making at scale. In our experiments, ISLE on average reduced data transmission by 98.02% and decoding time by 98.09%, while increasing throughput by 2,730%. We show that ISLE results in faster turnaround times, and reduced overall cost of data, transmission, and compute, without negatively impacting clinical decision making using AI systems.
△ Less
Submitted 25 November, 2023; v1 submitted 24 May, 2023;
originally announced May 2023.
-
Text2Cohort: Facilitating Intuitive Access to Biomedical Data with Natural Language Cohort Discovery
Authors:
Pranav Kulkarni,
Adway Kanhere,
Paul H. Yi,
Vishwa S. Parekh
Abstract:
The Imaging Data Commons (IDC) is a cloud-based database that provides researchers with open access to cancer imaging data, with the goal of facilitating collaboration. However, cohort discovery within the IDC database has a significant technical learning curve. Recently, large language models (LLM) have demonstrated exceptional utility for natural language processing tasks. We developed Text2Coho…
▽ More
The Imaging Data Commons (IDC) is a cloud-based database that provides researchers with open access to cancer imaging data, with the goal of facilitating collaboration. However, cohort discovery within the IDC database has a significant technical learning curve. Recently, large language models (LLM) have demonstrated exceptional utility for natural language processing tasks. We developed Text2Cohort, a LLM-powered toolkit to facilitate user-friendly natural language cohort discovery in the IDC. Our method translates user input into IDC queries using grounding techniques and returns the query's response. We evaluate Text2Cohort on 50 natural language inputs, from information extraction to cohort discovery. Our toolkit successfully generated responses with an 88% accuracy and 0.94 F1 score. We demonstrate that Text2Cohort can enable researchers to discover and curate cohorts on IDC with high levels of accuracy using natural language in a more intuitive and user-friendly way.
△ Less
Submitted 25 November, 2023; v1 submitted 12 May, 2023;
originally announced May 2023.
-
Optimizing Federated Learning for Medical Image Classification on Distributed Non-iid Datasets with Partial Labels
Authors:
Pranav Kulkarni,
Adway Kanhere,
Paul H. Yi,
Vishwa S. Parekh
Abstract:
Numerous large-scale chest x-ray datasets have spearheaded expert-level detection of abnormalities using deep learning. However, these datasets focus on detecting a subset of disease labels that could be present, thus making them distributed and non-iid with partial labels. Recent literature has indicated the impact of batch normalization layers on the convergence of federated learning due to doma…
▽ More
Numerous large-scale chest x-ray datasets have spearheaded expert-level detection of abnormalities using deep learning. However, these datasets focus on detecting a subset of disease labels that could be present, thus making them distributed and non-iid with partial labels. Recent literature has indicated the impact of batch normalization layers on the convergence of federated learning due to domain shift associated with non-iid data with partial labels. To that end, we propose FedFBN, a federated learning framework that draws inspiration from transfer learning by using pretrained networks as the model backend and freezing the batch normalization layers throughout the training process. We evaluate FedFBN with current FL strategies using synthetic iid toy datasets and large-scale non-iid datasets across scenarios with partial and complete labels. Our results demonstrate that FedFBN outperforms current aggregation strategies for training global models using distributed and non-iid data with partial labels.
△ Less
Submitted 10 March, 2023;
originally announced March 2023.
-
SegViz: A federated-learning based framework for multi-organ segmentation on heterogeneous data sets with partial annotations
Authors:
Adway U. Kanhere,
Pranav Kulkarni,
Paul H. Yi,
Vishwa S. Parekh
Abstract:
Segmentation is one of the most primary tasks in deep learning for medical imaging, owing to its multiple downstream clinical applications. However, generating manual annotations for medical images is time-consuming, requires high skill, and is an expensive effort, especially for 3D images. One potential solution is to aggregate knowledge from partially annotated datasets from multiple groups to c…
▽ More
Segmentation is one of the most primary tasks in deep learning for medical imaging, owing to its multiple downstream clinical applications. However, generating manual annotations for medical images is time-consuming, requires high skill, and is an expensive effort, especially for 3D images. One potential solution is to aggregate knowledge from partially annotated datasets from multiple groups to collaboratively train global models using Federated Learning. To this end, we propose SegViz, a federated learning-based framework to train a segmentation model from distributed non-i.i.d datasets with partial annotations. The performance of SegViz was compared against training individual models separately on each dataset as well as centrally aggregating all the datasets in one place and training a single model. The SegViz framework using FedBN as the aggregation strategy demonstrated excellent performance on the external BTCV set with dice scores of 0.93, 0.83, 0.55, and 0.75 for segmentation of liver, spleen, pancreas, and kidneys, respectively, significantly ($p<0.05$) better (except spleen) than the dice scores of 0.87, 0.83, 0.42, and 0.48 for the baseline models. In contrast, the central aggregation model significantly ($p<0.05$) performed poorly on the test dataset with dice scores of 0.65, 0, 0.55, and 0.68. Our results demonstrate the potential of the SegViz framework to train multi-task models from distributed datasets with partial labels. All our implementations are open-source and available at https://anonymous.4open.science/r/SegViz-B746
△ Less
Submitted 13 March, 2023; v1 submitted 17 January, 2023;
originally announced January 2023.
-
Surgical Aggregation: Federated Class-Heterogeneous Learning
Authors:
Pranav Kulkarni,
Adway Kanhere,
Paul H. Yi,
Vishwa S. Parekh
Abstract:
The release of numerous chest x-ray datasets has spearheaded the development of deep learning models with expert-level performance. However, they have limited interoperability due to class-heterogeneity -- a result of inconsistent labeling schemes and partial annotations. Therefore, it is challenging to leverage these datasets in aggregate to train models with a complete representation of abnormal…
▽ More
The release of numerous chest x-ray datasets has spearheaded the development of deep learning models with expert-level performance. However, they have limited interoperability due to class-heterogeneity -- a result of inconsistent labeling schemes and partial annotations. Therefore, it is challenging to leverage these datasets in aggregate to train models with a complete representation of abnormalities that may occur within the thorax. In this work, we propose surgical aggregation, a federated learning framework for aggregating knowledge from class-heterogeneous datasets and learn a model that can simultaneously predict the presence of all disease labels present across the datasets. We evaluate our method using simulated and real-world class-heterogeneous datasets across both independent and identically distributed (iid) and non-iid settings. Our results show that surgical aggregation outperforms current methods, has better generalizability, and is a crucial first step towards tackling class-heterogeneity in federated learning to facilitate the development of clinically-useful models using previously non-interoperable chest x-ray datasets.
△ Less
Submitted 5 January, 2024; v1 submitted 16 January, 2023;
originally announced January 2023.
-
From Competition to Collaboration: Making Toy Datasets on Kaggle Clinically Useful for Chest X-Ray Diagnosis Using Federated Learning
Authors:
Pranav Kulkarni,
Adway Kanhere,
Paul H. Yi,
Vishwa S. Parekh
Abstract:
Chest X-ray (CXR) datasets hosted on Kaggle, though useful from a data science competition standpoint, have limited utility in clinical use because of their narrow focus on diagnosing one specific disease. In real-world clinical use, multiple diseases need to be considered since they can co-exist in the same patient. In this work, we demonstrate how federated learning (FL) can be used to make thes…
▽ More
Chest X-ray (CXR) datasets hosted on Kaggle, though useful from a data science competition standpoint, have limited utility in clinical use because of their narrow focus on diagnosing one specific disease. In real-world clinical use, multiple diseases need to be considered since they can co-exist in the same patient. In this work, we demonstrate how federated learning (FL) can be used to make these toy CXR datasets from Kaggle clinically useful. Specifically, we train a single FL classification model (`global`) using two separate CXR datasets -- one annotated for presence of pneumonia and the other for presence of pneumothorax (two common and life-threatening conditions) -- capable of diagnosing both. We compare the performance of the global FL model with models trained separately on both datasets (`baseline`) for two different model architectures. On a standard, naive 3-layer CNN architecture, the global FL model achieved AUROC of 0.84 and 0.81 for pneumonia and pneumothorax, respectively, compared to 0.85 and 0.82, respectively, for both baseline models (p>0.05). Similarly, on a pretrained DenseNet121 architecture, the global FL model achieved AUROC of 0.88 and 0.91 for pneumonia and pneumothorax, respectively, compared to 0.89 and 0.91, respectively, for both baseline models (p>0.05). Our results suggest that FL can be used to create global `meta` models to make toy datasets from Kaggle clinically useful, a step forward towards bridging the gap from bench to bedside.
△ Less
Submitted 11 November, 2022;
originally announced November 2022.
-
Improving GNSS Positioning using Neural Network-based Corrections
Authors:
Ashwin V. Kanhere,
Shubh Gupta,
Akshay Shetty,
Grace Gao
Abstract:
Deep Neural Networks (DNNs) are a promising tool for Global Navigation Satellite System (GNSS) positioning in the presence of multipath and non-line-of-sight errors, owing to their ability to model complex errors using data. However, develo** a DNN for GNSS positioning presents various challenges, such as 1) poor numerical conditioning caused by large variations in measurements and position valu…
▽ More
Deep Neural Networks (DNNs) are a promising tool for Global Navigation Satellite System (GNSS) positioning in the presence of multipath and non-line-of-sight errors, owing to their ability to model complex errors using data. However, develo** a DNN for GNSS positioning presents various challenges, such as 1) poor numerical conditioning caused by large variations in measurements and position values across the globe, 2) varying number and order within the set of measurements due to changing satellite visibility, and 3) overfitting to available data. In this work, we address the aforementioned challenges and propose an approach for GNSS positioning by applying DNN-based corrections to an initial position guess. Our DNN learns to output the position correction using the set of pseudorange residuals and satellite line-of-sight vectors as inputs. The limited variation in these input and output values improves the numerical conditioning for our DNN. We design our DNN architecture to combine information from the available GNSS measurements, which vary both in number and order, by leveraging recent advancements in set-based deep learning methods. Furthermore, we present a data augmentation strategy for reducing overfitting in the DNN by randomizing the initial position guesses. We first perform simulations and show an improvement in the initial positioning error when our DNN-based corrections are applied. After this, we demonstrate that our approach outperforms a WLS baseline on real-world data. Our implementation is available at github.com/Stanford-NavLab/deep_gnss.
△ Less
Submitted 21 June, 2022; v1 submitted 18 October, 2021;
originally announced October 2021.
-
Blockchain for the Internet of Things: Present and Future
Authors:
Francesco Restuccia,
Salvatore D'Oro andSalil S. Kanhere,
Tommaso Melodia,
Sajal K. Das
Abstract:
One of the key challenges to the IoT's success is how to secure and anonymize billions of IoT transactions and devices per day, an issue that still lingers despite significant research efforts over the last few years. On the other hand, technologies based on blockchain algorithms are disrupting today's cryptocurrency markets and showing tremendous potential, since they provide a distributed transa…
▽ More
One of the key challenges to the IoT's success is how to secure and anonymize billions of IoT transactions and devices per day, an issue that still lingers despite significant research efforts over the last few years. On the other hand, technologies based on blockchain algorithms are disrupting today's cryptocurrency markets and showing tremendous potential, since they provide a distributed transaction ledger that cannot be tampered with or controlled by a single entity. Although the blockchain may present itself as a cure-all for the IoT's security and privacy challenges, significant research efforts still need to be put forth to adapt the computation-intensive blockchain algorithms to the stringent energy and processing constraints of today's IoT devices. In this paper, we provide an overview of existing literature on the topic of blockchain for IoT, and present a roadmap of research challenges that will need to be addressed to enable the usage of blockchain technologies in the IoT.
△ Less
Submitted 18 March, 2019;
originally announced March 2019.