Search | arXiv e-print repository

arXiv:2403.18826 [pdf]

SAM-dPCR: Real-Time and High-throughput Absolute Quantification of Biological Samples Using Zero-Shot Segment Anything Model

Authors: Yuanyuan Wei, Shanhang Luo, Changran Xu, Yingqi Fu, Qingyue Dong, Yi Zhang, Fuyang Qu, Guangyao Cheng, Yi-** Ho, Ho-Pui Ho, Wu Yuan

Abstract: Digital PCR (dPCR) has revolutionized nucleic acid diagnostics by enabling absolute quantification of rare mutations and target sequences. However, current detection methodologies face challenges, as flow cytometers are costly and complex, while fluorescence imaging methods, relying on software or manual counting, are time-consuming and prone to errors. To address these limitations, we present SAM-dPCR, a novel self-supervised learning-based pipeline that enables real-time and high-throughput absolute quantification of biological samples. Leveraging the zero-shot SAM model, SAM-dPCR efficiently analyzes diverse microreactors with over 97.7% accuracy within a rapid processing time of 3.16 seconds. By utilizing commonly available lab fluorescence microscopes, SAM-dPCR facilitates the quantification of sample concentrations. The accuracy of SAM-dPCR is validated by the strong linear relationship observed between known and inferred sample concentrations. Additionally, SAM-dPCR demonstrates versatility through comprehensive verification using various samples and reactor morphologies. This accessible, cost-effective tool transcends the limitations of traditional detection methods or fully supervised AI models, marking the first application of SAM in nucleic acid detection or molecular diagnostics. By eliminating the need for annotated training data, SAM-dPCR holds great application potential for nucleic acid quantification in resource-limited settings.

Submitted 22 January, 2024; originally announced March 2024.

Comments: 23 pages, 6 figures

arXiv:2401.00173 [pdf, other]

Variability of morphology in beat-to-beat photoplethysmographic waveform quantified with unsupervised wave-shape manifold learning for clinical assessment

Authors: Yu-Chieh Ho, Te-Sheng Lin, She-Chih Wang, Chen-Shi Chang, Yu-Ting Lin

Abstract: We investigated the beat-to-beat fluctuation of the photoplethysmography (PPG) waveform. The motivation is that morphology variability extracted from the arterial blood pressure (ABP) has been found to correlate with the baseline condition and short-term surgical outcome of patients undergoing liver transplant surgery. Numerous interactions of physiological mechanisms regulating the cardiovascular system could underlie the variability of morphology. We used the unsupervised manifold learning algorithm, Dynamic Diffusion Map, to quantify the multivariate waveform morphological variation. Due to the physical principle of light absorption, PPG waveform signals are more susceptible to artifacts and are nominally used only for visual inspection of data quality in clinical environments. On the other hand, the noninvasive, easy-to-use nature of PPG allows a wider range of biomedical applications, which inspired us to investigate the morphology variability information in the PPG waveform signal. We developed data analysis techniques to improve the performance and validated them with a real-life clinical database.

Submitted 30 December, 2023; originally announced January 2024.

arXiv:2309.01384 [pdf]

Deep Learning Approach for Large-Scale, Real-Time Quantification of Green Fluorescent Protein-Labeled Biological Samples in Microreactors

Authors: Yuanyuan Wei, Sai Mu Dalike Abaxi, Nawaz Mehmood, Luoquan Li, Fuyang Qu, Guangyao Cheng, Dehua Hu, Yi-** Ho, Scott Wu Yuan, Ho-Pui Ho

Abstract: Absolute quantification of biological samples entails determining expression levels in precise numerical copies, offering enhanced accuracy and superior performance for rare templates. However, existing methodologies suffer from significant limitations: flow cytometers are both costly and intricate, while fluorescence imaging relying on software tools or manual counting is time-consuming and prone to inaccuracies. In this study, we have devised a comprehensive deep-learning-enabled pipeline that enables the automated segmentation and classification of GFP (green fluorescent protein)-labeled microreactors, facilitating real-time absolute quantification. Our findings demonstrate the efficacy of this technique in accurately predicting the sizes and occupancy status of microreactors using standard laboratory fluorescence microscopes, thereby providing precise measurements of template concentrations. Notably, our approach quantifies over 2,000 microreactors (across 10 images) within a remarkable 2.5 seconds, with a dynamic range spanning from 56.52 to 1569.43 copies per microliter. Furthermore, our Deep-dGFP algorithm showcases remarkable generalization capabilities, as it can be directly applied to various GFP-labeling scenarios, including droplet-based, microwell-based, and agarose-based biological applications. To the best of our knowledge, this represents the first successful implementation of an all-in-one image analysis algorithm in droplet digital PCR (polymerase chain reaction), microwell digital PCR, droplet single-cell sequencing, agarose digital PCR, and bacterial quantification, without necessitating any transfer learning steps, modifications, or retraining procedures. We firmly believe that our Deep-dGFP technique will be readily embraced by biomedical laboratories and holds potential for further development in related clinical applications.

Submitted 4 September, 2023; originally announced September 2023.

Comments: 23 pages, 6 figures, 1 table

arXiv:2302.09445 [pdf, other]

Partial differential equation-based inference of migration and proliferation mechanisms in cancer cell populations

Authors: Patrick C. Kinnunen, Siddhartha Srivastava, Zhenlin Wang, Kenneth K. Y. Ho, Brock A. Humphries, Siyi Chen, Jennifer J. Linderman, Gary D. Luker, Kathryn E. Luker, Krishna Garikipati

Abstract: Targeting signaling pathways that drive cancer cell migration or proliferation is a common therapeutic approach. A popular experimental technique, the scratch assay, measures the migration and proliferation-driven cell monolayer formation. Scratch assay analyses do not differentiate between migration and proliferation effects and do not attempt to measure dynamic effects. To improve upon these methods, we combine high-throughput scratch assays, continuous video microscopy, and variational system identification (VSI) to infer partial differential equation (PDE) models of cell migration and proliferation. We capture the evolution of cell density fields over time using live cell microscopy and automated image processing. We employ VSI techniques to identify cell density dynamics modeled with first-order kinetics of advection-diffusion-reaction systems. We present a comparison of our methods to results obtained using traditional inference approaches on previously analyzed 1-dimensional scratch assay data. We demonstrate the application of this pipeline on high throughput 2-dimensional scratch assays and find that decreasing serum levels can decrease random cell migration by approximately 20%. Our integrated experimental and computational pipeline can be adapted for automatically quantifying the effect of biological perturbations on cell migration and proliferation in various cell lines.

Submitted 18 February, 2023; originally announced February 2023.

arXiv:1608.01031 [pdf]

Meraculous2: fast accurate short-read assembly of large polymorphic genomes

Authors: Jarrod A. Chapman, Isaac Y. Ho, Eugene Goltsman, Daniel S. Rokhsar

Abstract: We present Meraculous2, an update to the Meraculous short-read assembler that includes (1) handling of allelic variation using "bubble" structures within the de Bruijn graph, (2) improved gap closing, and (3) an improved scaffolding algorithm that produces more complete assemblies without compromising scaffolding accuracy. The speed and bandwidth efficiency of the new parallel implementation have also been substantially improved, allowing the assembly of a human genome to be accomplished in 24 hours on the JGI/NERSC Genepool system. To highlight the features of Meraculous2 we present here the assembly of the diploid human genome NA12878, and compare it with previously published assemblies of the same data using other algorithms. The Meraculous2 assemblies are shown to have better completeness, contiguity, and accuracy than other published assemblies for these data. Practical considerations including pre-assembly analyses of polymorphism and repetitiveness are described.

Submitted 7 November, 2017; v1 submitted 2 August, 2016; originally announced August 2016.

Comments: Supplementary notes included with the manuscript

arXiv:1301.5406 [pdf]

doi 10.1186/2047-217X-2-10

Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species

Authors: Keith R. Bradnam, Joseph N. Fass, Anton Alexandrov, Paul Baranay, Michael Bechner, İnanç Birol, Sébastien Boisvert, Jarrod A. Chapman, Guillaume Chapuis, Rayan Chikhi, Hamidreza Chitsaz, Wen-Chi Chou, Jacques Corbeil, Cristian Del Fabbro, T. Roderick Docking, Richard Durbin, Dent Earl, Scott Emrich, Pavel Fedotov, Nuno A. Fonseca, Ganeshkumar Ganapathy, Richard A. Gibbs, Sante Gnerre, Élénie Godzaridis, Steve Goldstein , et al. (66 additional authors not shown)

Abstract: Background - The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly. Results - In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies. Conclusions - Many current genome assemblers produced useful assemblies, containing a significant representation of their genes, regulatory sequences, and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another.

Submitted 27 June, 2013; v1 submitted 23 January, 2013; originally announced January 2013.

Comments: Additional files available at http://korflab.ucdavis.edu/Datasets/Assemblathon/Assemblathon2/Additional_files/ Major changes: 1. Accessions for the 3 read data sets have now been included; 2. New file: spreadsheet containing details of all Study, Sample, Run, & Experiment identifiers; 3. Made miscellaneous changes to address reviewers' comments. DOIs added to GigaDB datasets.

Journal ref: GigaScience 2:10 (2013)

Showing 1–6 of 6 results for author: Ho, Y