Gene-Level Representation Learning via Interventional Style Transfer in Optical Pooled Screening

Authors: Mahtab Bigverdi, Burkhard Hockendorf, Heming Yao, Phil Hanslovsky, Romain Lopez, David Richmond

Abstract: Optical pooled screening (OPS) combines automated microscopy and genetic perturbations to systematically study gene function in a scalable and cost-effective way. Leveraging the resulting data requires extracting biologically informative representations of cellular perturbation phenotypes from images. We employ a style-transfer approach to learn gene-level feature representations from images of genetically perturbed cells obtained via OPS. Our method outperforms widely used engineered features in clustering gene representations according to gene function, demonstrating its utility for uncovering latent biological relationships. This approach offers a promising alternative to investigate the role of genes in health and disease.

Submitted 11 June, 2024; originally announced June 2024.

Comments: 11 pages, 5 figures, CVPR workshop paper

arXiv:2403.04945 [pdf, other]

MEIT: Multi-Modal Electrocardiogram Instruction Tuning on Large Language Models for Report Generation

Authors: Zhongwei Wan, Che Liu, Xin Wang, Chaofan Tao, Hui Shen, Zhenwu Peng, Jie Fu, Rossella Arcucci, Huaxiu Yao, Mi Zhang

Abstract: Electrocardiogram (ECG) is the primary non-invasive diagnostic tool for monitoring cardiac conditions and is crucial in assisting clinicians. Recent studies have concentrated on classifying cardiac conditions using ECG data but have overlooked ECG report generation, which is time-consuming and requires clinical expertise. To automate ECG report generation and ensure its versatility, we propose the Multimodal ECG Instruction Tuning (MEIT) framework, the first attempt to tackle ECG report generation with LLMs and multimodal instructions. To facilitate future research, we establish a benchmark to evaluate MEIT with various LLM backbones across two large-scale ECG datasets. Our approach uniquely aligns the representations of the ECG signal and the report, and we conduct extensive experiments to benchmark MEIT with nine open-source LLMs using more than 800,000 ECG reports. MEIT's results underscore the superior performance of instruction-tuned LLMs, showcasing their proficiency in quality report generation, zero-shot capabilities, and resilience to signal perturbation. These findings emphasize the efficacy of our MEIT framework and its potential for real-world clinical application.

Submitted 18 June, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

Comments: Under review

arXiv:2402.06841 [pdf]

Point cloud-based registration and image fusion between cardiac SPECT MPI and CTA

Authors: Shaojie Tang, Penpen Miao, Xingyu Gao, Yu Zhong, Dantong Zhu, Haixing Wen, Zhihui Xu, Qiuyue Wei, Hongping Yao, Xin Huang, Rui Gao, Chen Zhao, Weihua Zhou

Abstract: A method was proposed for the point cloud-based registration and image fusion between cardiac single photon emission computed tomography (SPECT) myocardial perfusion images (MPI) and cardiac computed tomography angiograms (CTA). Firstly, the left ventricle (LV) epicardial regions (LVERs) in SPECT and CTA images were segmented by using different U-Net neural networks trained to generate the point clouds of the LV epicardial contours (LVECs). Secondly, according to the characteristics of cardiac anatomy, the special points of anterior and posterior interventricular grooves (APIGs) were manually marked in both SPECT and CTA image volumes. Thirdly, we developed an in-house program for coarsely registering the special points of APIGs to ensure a correct cardiac orientation alignment between SPECT and CTA images. Fourthly, we employed the ICP, SICP, or CPD algorithm to achieve a fine registration for the point clouds (together with the special points of APIGs) of the LV epicardial surfaces (LVERs) in SPECT and CTA images. Finally, the image fusion between SPECT and CTA was realized after the fine registration. The experimental results showed that the cardiac orientation was aligned well and the mean distance error of the optimal registration method (CPD with affine transform) was consistently less than 3 mm. The proposed method could effectively fuse the structures from cardiac CTA and SPECT functional images, and demonstrated potential for assisting in the accurate diagnosis of cardiac diseases by combining the complementary advantages of the two imaging modalities.
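The coarse landmark alignment and the mean distance error named in this abstract can be illustrated with a minimal sketch. It assumes the landmark points are already in one-to-one correspondence and uses a least-squares rigid fit (the Kabsch method) as a stand-in; it is not the authors' in-house program, and the names `rigid_fit` and `mean_distance_error` are hypothetical. The data are synthetic.

```python
import numpy as np

def rigid_fit(src, dst):
    """Least-squares rigid transform (R, t) with R @ src[i] + t ~= dst[i].

    Assumes src and dst are (N, 3) arrays of corresponding points (Kabsch method).
    """
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)      # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def mean_distance_error(a, b):
    """Mean Euclidean distance between corresponding points."""
    return float(np.linalg.norm(a - b, axis=1).mean())

# Synthetic check: recover a known rotation about z plus a translation.
rng = np.random.default_rng(0)
src = rng.normal(size=(20, 3))
angle = np.deg2rad(25.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([4.0, -2.0, 1.5])
dst = src @ R_true.T + t_true

R, t = rigid_fit(src, dst)
aligned = src @ R.T + t
error = mean_distance_error(aligned, dst)
```

A fine registration step such as ICP or CPD would start from this kind of coarse alignment and no longer require known correspondences.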

Submitted 9 February, 2024; originally announced February 2024.

arXiv:2312.12599 [pdf, other]

Unsupervised Segmentation of Colonoscopy Images

Authors: Heming Yao, Jérôme Lüscher, Benjamin Gutierrez Becker, Josep Arús-Pous, Tommaso Biancalani, Amelie Bigorgne, David Richmond

Abstract: Colonoscopy plays a crucial role in the diagnosis and prognosis of various gastrointestinal diseases. Due to the challenges of collecting large-scale high-quality ground truth annotations for colonoscopy images, and more generally medical images, we explore using self-supervised features from vision transformers in three challenging tasks for colonoscopy images. Our results indicate that image-level features learned from DINO models achieve image classification performance comparable to fully supervised models, and patch-level features contain rich semantic information for object detection. Furthermore, we demonstrate that self-supervised features combined with unsupervised segmentation can be used to discover multiple clinically relevant structures in a fully unsupervised manner, demonstrating the tremendous potential of applying these methods in medical image analysis.

Submitted 19 December, 2023; originally announced December 2023.

arXiv:2309.07185 [pdf]

A Health Monitoring System Based on Flexible Triboelectric Sensors for Intelligence Medical Internet of Things and its Applications in Virtual Reality

Authors: Junqi Mao, Puen Zhou, Xiaoyao Wang, Hongbo Yao, Liuyang Liang, Yiqiao Zhao, Jiawei Zhang, Dayan Ban, Haiwu Zheng

Abstract: The Internet of Medical Things (IoMT) is a platform that combines Internet of Things (IoT) technology with medical applications, enabling the realization of precision medicine, intelligent healthcare, and telemedicine in the era of digitalization and intelligence. However, the IoMT faces various challenges, including sustainable power supply, human adaptability of sensors, and the intelligence of sensors. In this study, we designed a robust and intelligent IoMT system through the synergistic integration of flexible wearable triboelectric sensors and deep learning-assisted data analytics. We embedded four triboelectric sensors into a wristband to detect and analyze limb movements in patients suffering from Parkinson's Disease (PD). By further integrating deep learning-assisted data analytics, we actualized an intelligent healthcare monitoring system for the surveillance and interaction of PD patients, which includes location/trajectory tracking, heart monitoring, and identity recognition. This innovative approach enabled us to accurately capture and scrutinize the subtle movements and fine motor activity of PD patients, thus providing insightful feedback and a comprehensive assessment of the patients' conditions. This monitoring system is cost-effective, easily fabricated, highly sensitive, and intelligent, and consequently underscores the immense potential of human body sensing technology in a Health 4.0 society.

Submitted 12 September, 2023; originally announced September 2023.

arXiv:2307.02148 [pdf]

Compound Attention and Neighbor Matching Network for Multi-contrast MRI Super-resolution

Authors: Wenxuan Chen, Sirui Wu, Shuai Wang, Zhongsen Li, Jia Yang, Huifeng Yao, Xiaolei Song

Abstract: Multi-contrast magnetic resonance imaging (MRI) reflects information about human tissue from different perspectives and has many clinical applications. By utilizing the complementary information among different modalities, multi-contrast super-resolution (SR) of MRI can achieve better results than single-image super-resolution. However, existing methods of multi-contrast MRI SR have the following shortcomings that may limit their performance: First, existing methods either simply concatenate the reference and degraded features or exploit global feature-matching between them, which are unsuitable for multi-contrast MRI SR. Second, although many recent methods employ transformers to capture long-range dependencies in the spatial dimension, they neglect that self-attention in the channel dimension is also important for low-level vision tasks. To address these shortcomings, we propose a novel network architecture with compound attention and neighbor matching (CANM-Net) for multi-contrast MRI SR: The compound self-attention mechanism effectively captures the dependencies in both spatial and channel dimensions; the neighborhood-based feature-matching modules are exploited to match degraded features and adjacent reference features and then fuse them to obtain high-quality images. We conduct experiments of SR tasks on the IXI, fastMRI, and real-world scanning datasets. The CANM-Net outperforms state-of-the-art approaches in both retrospective and prospective experiments. Moreover, the robustness study in our work shows that the CANM-Net still achieves good performance when the reference and degraded images are imperfectly registered, indicating good potential in clinical applications.

Submitted 16 September, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2209.15451 [pdf, other]

doi 10.1007/978-3-031-23443-9_35

Semi-Supervised Domain Generalization for Cardiac Magnetic Resonance Image Segmentation with High Quality Pseudo Labels

Authors: Wanqin Ma, Huifeng Yao, Yiqun Lin, Jiarong Guo, Xiaomeng Li

Abstract: Developing a deep learning method for medical segmentation tasks heavily relies on a large amount of labeled data. However, the annotations require professional knowledge and are limited in number. Recently, semi-supervised learning has demonstrated great potential in medical segmentation tasks. Most existing methods related to cardiac magnetic resonance images only focus on regular images with similar domains and high image quality. A semi-supervised domain generalization method was developed in [2], which enhances the quality of pseudo labels on varied datasets. In this paper, we follow the strategy in [2] and present a domain generalization method for semi-supervised medical segmentation. Our main goal is to improve the quality of pseudo labels under extreme MRI Analysis with various domains. We perform Fourier transformation on input images to learn low-level statistics and cross-domain information. Then we feed the augmented images as input to the double cross pseudo supervision networks to calculate the variance among pseudo labels. We evaluate our method on the CMRxMotion dataset [1]. With only partially labeled data and without domain labels, our approach consistently generates accurate segmentation results of cardiac magnetic resonance images with different respiratory motions. Code is available at: https://github.com/MAWanqin2002/STACOM2022Ma
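The Fourier-based augmentation step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common amplitude-mixing variant, in which the amplitude spectrum of an input image is blended with that of a reference image while the input's phase is kept; the abstract does not specify the exact scheme, so this is an assumption, and the function name `fourier_amplitude_mix` and the blend weight `lam` are hypothetical. The images are random arrays, not MRI data.

```python
import numpy as np

def fourier_amplitude_mix(image, reference, lam=0.5):
    """Blend the amplitude spectrum of `image` with that of `reference`.

    The phase of `image` is kept, so spatial structure is largely preserved
    while low-level intensity statistics shift toward the reference.
    """
    f_img = np.fft.fft2(image)
    f_ref = np.fft.fft2(reference)
    amplitude = (1.0 - lam) * np.abs(f_img) + lam * np.abs(f_ref)
    phase = np.angle(f_img)
    mixed = amplitude * np.exp(1j * phase)
    # The mixed spectrum of two real images stays Hermitian-symmetric,
    # so the imaginary part after inversion is numerical noise only.
    return np.real(np.fft.ifft2(mixed))

rng = np.random.default_rng(0)
image = rng.random((32, 32))       # stand-in for an input slice
reference = rng.random((32, 32))   # stand-in for a slice from another domain

augmented = fourier_amplitude_mix(image, reference, lam=0.3)
unchanged = fourier_amplitude_mix(image, reference, lam=0.0)  # lam=0 returns the input
```

With `lam=0.0` the function reproduces the input, which gives a quick sanity check before feeding augmented images to a segmentation network.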

Submitted 1 December, 2023; v1 submitted 30 September, 2022; originally announced September 2022.

Comments: Accepted by International Workshop on Statistical Atlases and Computational Models of the Heart (STACOM2022) of MICCAI2022

Journal ref: STACOM2022.13593(2022)383-391

arXiv:2209.14102 [pdf]

Segmentation method of U-net sheet metal engineering drawing based on CBAM attention mechanism

Authors: Zhiwei Song, Hui Yao

Abstract: In the manufacturing process of heavy industrial equipment, the specific unit in the welding diagram is first manually redrawn and then the corresponding sheet metal parts are cut, which is inefficient. To this end, this paper proposes a U-net-based method for the segmentation and extraction of specific units in welding engineering drawings. This method enables the cutting device to automatically segment specific graphic units according to visual information and automatically cut out sheet metal parts of corresponding shapes according to the segmentation results. This process is more efficient than traditional human-assisted cutting. Two weaknesses in the U-net network will lead to a decrease in segmentation performance: first, the focus on global semantic feature information is weak, and second, there is a large dimensional difference between shallow encoder features and deep decoder features. Based on the CBAM (Convolutional Block Attention Module) attention mechanism, this paper proposes a U-net jump structure model with an attention mechanism to improve the network's global semantic feature extraction ability. In addition, a U-net attention mechanism model with dual pooling convolution fusion is designed: the deep encoder's maximum pooling + convolution features and the shallow encoder's average pooling + convolution features are fused vertically to reduce the dimension difference between the shallow encoder and deep decoder. The dual-pool convolutional attention jump structure replaces the traditional U-net jump structure, which can effectively improve the specific unit segmentation performance of the welding engineering drawing. Using vgg16 as the backbone network, experiments have verified that the IoU, mAP, and Accu of our model in the welding engineering drawing dataset segmentation task are 84.72%, 86.84%, and 99.42%, respectively.

Submitted 27 April, 2023; v1 submitted 28 September, 2022; originally announced September 2022.

arXiv:2011.13645 [pdf]

Numerical and experimental study of tonal noise sources at the outlet of an isolated centrifugal fan

Authors: Martin Ottersten, Hua-Dong Yao, Lars Davidson

Abstract: In this study, tonal noise produced by an isolated centrifugal fan is investigated using unsteady Reynolds-averaged Navier-Stokes (URANS) equations. This type of fan is used in ventilation systems. As the fan propagates tonal noise in the system, it can severely affect the quality of life of people who reside in the buildings. Our simulation shows that turbulence kinetic energy (TKE) is unevenly distributed around the rotation axis. Large TKE exists near the shroud at the pressure sides of the blades. It is caused by the recirculating flow. Moreover, the position of the largest TKE periodically varies among the blades. The period corresponds to approximately 4 times the fan rotation period; it was also found in acoustic measurements. The magnitude of the tonal noise at the blade passing frequencies agrees well with experimental data. By analyzing the wall-pressure fluctuations, it is found that the recirculating flow regions with large TKE are dominant sources of the tonal noise.

Submitted 27 November, 2020; originally announced November 2020.

arXiv:2010.16211 [pdf, other]

Statistical Analysis of Signal-Dependent Noise: Application in Blind Localization of Image Splicing Forgery

Authors: Mian Zou, Heng Yao, Chuan Qin, Xinpeng Zhang

Abstract: Visual noise is often regarded as a disturbance in image quality, whereas it can also provide a crucial clue for image-based forensic tasks. Conventionally, noise is assumed to comprise an additive Gaussian model to be estimated and then used to reveal anomalies. However, for real sensor noise, it should be modeled as signal-dependent noise (SDN). In this work, we apply SDN to splicing forgery localization tasks. Through statistical analysis of the SDN model, we assume that noise can be modeled as a Gaussian approximation for a certain brightness and propose a likelihood model for a noise level function. By building a maximum a posteriori Markov random field (MAP-MRF) framework, we exploit the likelihood of noise to reveal the alien region of spliced objects, with a probability combination refinement strategy. To ensure a completely blind detection, an iterative alternating method is adopted to estimate the MRF parameters. Experimental results demonstrate that our method is effective and provides a comparative localization performance.

Submitted 2 November, 2020; v1 submitted 30 October, 2020; originally announced October 2020.

arXiv:2006.03761 [pdf, other]

GRNet: Gridding Residual Network for Dense Point Cloud Completion

Authors: Haozhe Xie, Hongxun Yao, Shangchen Zhou, Jiageng Mao, Shengping Zhang, Wenxiu Sun

Abstract: Estimating the complete 3D point cloud from an incomplete one is a key problem in many vision and robotics applications. Mainstream methods (e.g., PCN and TopNet) use Multi-layer Perceptrons (MLPs) to directly process point clouds, which may cause the loss of details because the structure and context of point clouds are not fully considered. To solve this problem, we introduce 3D grids as intermediate representations to regularize unordered point clouds. We therefore propose a novel Gridding Residual Network (GRNet) for point cloud completion. In particular, we devise two novel differentiable layers, named Gridding and Gridding Reverse, to convert between point clouds and 3D grids without losing structural information. We also present the differentiable Cubic Feature Sampling layer to extract features of neighboring points, which preserves context information. In addition, we design a new loss function, namely Gridding Loss, to calculate the L1 distance between the 3D grids of the predicted and ground truth point clouds, which is helpful to recover details. Experimental results indicate that the proposed GRNet performs favorably against state-of-the-art methods on the ShapeNet, Completion3D, and KITTI benchmarks.
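The idea of comparing point clouds through 3D grids can be illustrated with a minimal NumPy sketch. It assumes points lie in the unit cube and that each point spreads a unit weight over the eight vertices of its enclosing cell by trilinear interpolation; the loss is then the mean absolute difference between the two grids. This is an illustration under those assumptions rather than GRNet's differentiable layers, and the names `gridding` and `gridding_l1` and the resolution of 8 are hypothetical.

```python
import numpy as np

def gridding(points, resolution):
    """Scatter points in [0, 1)^3 onto a resolution^3 grid with trilinear weights."""
    grid = np.zeros((resolution, resolution, resolution))
    coords = points * (resolution - 1)                  # continuous grid coordinates
    base = np.clip(np.floor(coords).astype(int), 0, resolution - 2)
    frac = coords - base                                # offset inside the cell
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((frac[:, 0] if dx else 1.0 - frac[:, 0])
                     * (frac[:, 1] if dy else 1.0 - frac[:, 1])
                     * (frac[:, 2] if dz else 1.0 - frac[:, 2]))
                # np.add.at accumulates correctly when several points share a vertex.
                np.add.at(grid, (base[:, 0] + dx, base[:, 1] + dy, base[:, 2] + dz), w)
    return grid

def gridding_l1(pred, target, resolution=8):
    """Mean absolute difference between the gridded point clouds."""
    return float(np.abs(gridding(pred, resolution) - gridding(target, resolution)).mean())

rng = np.random.default_rng(0)
target = rng.random((100, 3))   # synthetic "ground truth" cloud
pred = rng.random((100, 3))     # synthetic "prediction"

grid = gridding(target, 8)
loss_same = gridding_l1(target, target)
loss_diff = gridding_l1(pred, target)
```

Because the eight weights of each point sum to one, the grid total equals the number of points, and the loss does not depend on the order in which points are listed.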

Submitted 20 July, 2020; v1 submitted 5 June, 2020; originally announced June 2020.

Comments: ECCV 2020

arXiv:1908.11056 [pdf, other]

Targeted Source Detection for Environmental Data

Authors: Guanjie Zheng, Mengqi Liu, Tao Wen, Hongjian Wang, Huaxiu Yao, Susan L. Brantley, Zhenhui Li

Abstract: In the face of growing needs for water and energy, a fundamental understanding of the environmental impacts of human activities becomes critical for managing water and energy resources, remedying water pollution, and making regulatory policy wisely. Activities that impact the environment include oil and gas production, wastewater transport, and urbanization. In addition to the occurrence of anthropogenic contamination, the presence of some contaminants (e.g., methane, salt, and sulfate) of natural origin is not uncommon. Therefore, scientists sometimes find it difficult to identify the sources of contaminants in the coupled natural and human systems. In this paper, we propose a technique to simultaneously conduct source detection and prediction, which outperforms other approaches in the interdisciplinary case study of the identification of potential groundwater contamination within a region of high-density shale gas development.

Submitted 29 August, 2019; originally announced August 2019.

Comments: 8 pages, 4 figures, 1 table

arXiv:1902.02829 [pdf, other]

Low-cost Measurement of Industrial Shock Signals via Deep Learning Calibration

Authors: Houpu Yao, Jingjing Wen, Yi Ren, Bin Wu, Ze Ji

Abstract: Special high-end sensors with expensive hardware are usually needed to measure shock signals with high accuracy. In this paper, we show that cheap low-end sensors calibrated by deep neural networks are also capable of measuring high-g shocks accurately. First, we perform drop shock tests to collect a dataset of shock signals measured by sensors of different fidelity. Second, we propose a novel network to effectively learn both the signal peak and overall shape. The results show that the proposed network is capable of mapping low-end shock signals to their high-end counterparts with satisfactory accuracy. To the best of our knowledge, this is the first work to apply deep learning techniques to calibrate shock sensors.

Submitted 7 February, 2019; originally announced February 2019.

Showing 1–13 of 13 results for author: Yao, H