-
NUMSnet: Nested-U Multi-class Segmentation network for 3D Medical Image Stacks
Authors:
Sohini Roychowdhury
Abstract:
Semantic segmentation for medical 3D image stacks enables accurate volumetric reconstructions, computer-aided diagnostics and follow up treatment planning. In this work, we present a novel variant of the Unet model called the NUMSnet that transmits pixel neighborhood features across scans through nested layers to achieve accurate multi-class semantic segmentations with minimal training data. We an…
▽ More
Semantic segmentation for medical 3D image stacks enables accurate volumetric reconstructions, computer-aided diagnostics and follow up treatment planning. In this work, we present a novel variant of the Unet model called the NUMSnet that transmits pixel neighborhood features across scans through nested layers to achieve accurate multi-class semantic segmentations with minimal training data. We analyze the semantic segmentation performance of the NUMSnet model in comparison with several Unet model variants to segment 3-7 regions of interest using only 10% of images for training per Lung-CT and Heart-CT volumetric image stacks. The proposed NUMSnet model achieves up to 20% improvement in segmentation recall with 4-9% improvement in Dice scores for Lung-CT stacks and 2.5-10% improvement in Dice scores for Heart-CT stacks when compared to the Unet++ model. The NUMSnet model needs to be trained by ordered images around the central scan of each volumetric stack. Propagation of image feature information from the 6 nested layers of the Unet++ model are found to have better computation and segmentation performances than propagation of all up-sampling layers in a Unet++ model. The NUMSnet model achieves comparable segmentation performances to existing works, while being trained on as low as 5\% of the training images. Also, transfer learning allows faster convergence of the NUMSnet model for multi-class semantic segmentation from pathology in Lung-CT images to cardiac segmentations in Heart-CT stacks. Thus, the proposed model can standardize multi-class semantic segmentation on a variety of volumetric image stacks with minimal training dataset. This can significantly reduce the cost, time and inter-observer variabilities associated with computer-aided detections and treatment.
△ Less
Submitted 5 April, 2023;
originally announced April 2023.
-
In situ process quality monitoring and defect detection for direct metal laser melting
Authors:
Sarah Felix,
Saikat Ray Majumder,
H. Kirk Mathews,
Michael Lexa,
Gabriel Lipsa,
Xiaohu **,
Subhrajit Roychowdhury,
Thomas Spears
Abstract:
Quality control and quality assurance are challenges in Direct Metal Laser Melting (DMLM). Intermittent machine diagnostics and downstream part inspections catch problems after undue cost has been incurred processing defective parts. In this paper we demonstrate two methodologies for in-process fault detection and part quality prediction that can be readily deployed on existing commercial DMLM sys…
▽ More
Quality control and quality assurance are challenges in Direct Metal Laser Melting (DMLM). Intermittent machine diagnostics and downstream part inspections catch problems after undue cost has been incurred processing defective parts. In this paper we demonstrate two methodologies for in-process fault detection and part quality prediction that can be readily deployed on existing commercial DMLM systems with minimal hardware modification. Novel features were derived from the time series of common photodiode sensors along with standard machine control signals. A Bayesian approach attributes measurements to one of multiple process states and a least squares regression model predicts severity of certain material defects.
△ Less
Submitted 3 December, 2021;
originally announced December 2021.
-
QU-net++: Image Quality Detection Framework for Segmentation of Medical 3D Image Stacks
Authors:
Sohini Roychowdhury
Abstract:
Automated segmentation of pathological regions of interest aids medical image diagnostics and follow-up care. However, accurate pathological segmentations require high quality of annotated data that can be both cost and time intensive to generate. In this work, we propose an automated two-step method that detects a minimal image subset required to train segmentation models by evaluating the qualit…
▽ More
Automated segmentation of pathological regions of interest aids medical image diagnostics and follow-up care. However, accurate pathological segmentations require high quality of annotated data that can be both cost and time intensive to generate. In this work, we propose an automated two-step method that detects a minimal image subset required to train segmentation models by evaluating the quality of medical images from 3D image stacks using a U-net++ model. These images that represent a lack of quality training can then be annotated and used to fully train a U-net-based segmentation model. The proposed QU-net++ model detects this lack of quality training based on the disagreement in segmentations produced from the final two output layers. The proposed model isolates around 10% of the slices per 3D image stack and can scale across imaging modalities to segment cysts in OCT images and ground glass opacity (GGO) in lung CT images with Dice scores in the range 0.56-0.72. Thus, the proposed method can be applied for cost effective multi-modal pathology segmentation tasks.
△ Less
Submitted 12 April, 2022; v1 submitted 27 October, 2021;
originally announced October 2021.
-
Video-Data Pipelines for Machine Learning Applications
Authors:
Sohini Roychowdhury,
James Y. Sato
Abstract:
Data pipelines are an essential component for end-to-end solutions that take machine learning algorithms to production. Engineering data pipelines for video-sequences poses several challenges including isolation of key-frames from video sequences that are high quality and represent significant variations in the scene. Manual isolation of such quality key-frames can take hours of sifting through ho…
▽ More
Data pipelines are an essential component for end-to-end solutions that take machine learning algorithms to production. Engineering data pipelines for video-sequences poses several challenges including isolation of key-frames from video sequences that are high quality and represent significant variations in the scene. Manual isolation of such quality key-frames can take hours of sifting through hours worth of video data. In this work, we present a data pipeline framework that can automate this process of manual frame sifting in video sequences by controlling the fraction of frames that can be removed based on image quality and content type. Additionally, the frames that are retained can be automatically tagged per sequence, thereby simplifying the process of automated data retrieval for future ML model deployments. We analyze the performance of the proposed video-data pipeline for versioned deployment and monitoring for object detection algorithms that are trained on outdoor autonomous driving video sequences. The proposed video-data pipeline can retain anywhere between 0.1-20% of the all input frames that are representative of high image quality and high variations in content. This frame selection, automated scene tagging followed by model verification can be completed in under 30 seconds for 22 video-sequences under analysis in this work. Thus, the proposed framework can be scaled to additional video-sequence data sets for automating ML versioned deployments.
△ Less
Submitted 15 October, 2021;
originally announced October 2021.
-
FuSSI-Net: Fusion of Spatio-temporal Skeletons for Intention Prediction Network
Authors:
Francesco Piccoli,
Rajarathnam Balakrishnan,
Maria Jesus Perez,
Moraldeepsingh Sachdeo,
Carlos Nunez,
Matthew Tang,
Kajsa Andreasson,
Kalle Bjurek,
Ria Dass Raj,
Ebba Davidsson,
Colin Eriksson,
Victor Hagman,
Jonas Sjoberg,
Ying Li,
L. Srikar Muppirisetty,
Sohini Roychowdhury
Abstract:
Pedestrian intention recognition is very important to develop robust and safe autonomous driving (AD) and advanced driver assistance systems (ADAS) functionalities for urban driving. In this work, we develop an end-to-end pedestrian intention framework that performs well on day- and night- time scenarios. Our framework relies on objection detection bounding boxes combined with skeletal features of…
▽ More
Pedestrian intention recognition is very important to develop robust and safe autonomous driving (AD) and advanced driver assistance systems (ADAS) functionalities for urban driving. In this work, we develop an end-to-end pedestrian intention framework that performs well on day- and night- time scenarios. Our framework relies on objection detection bounding boxes combined with skeletal features of human pose. We study early, late, and combined (early and late) fusion mechanisms to exploit the skeletal features and reduce false positives as well to improve the intention prediction performance. The early fusion mechanism results in AP of 0.89 and precision/recall of 0.79/0.89 for pedestrian intention classification. Furthermore, we propose three new metrics to properly evaluate the pedestrian intention systems. Under these new evaluation metrics for the intention prediction, the proposed end-to-end network offers accurate pedestrian intention up to half a second ahead of the actual risky maneuver.
△ Less
Submitted 15 May, 2020;
originally announced May 2020.