Search | arXiv e-print repository

arXiv:2405.10993 [pdf]

No winners: Performance of lung cancer prediction models depends on screening-detected, incidental, and biopsied pulmonary nodule use cases

Authors: Thomas Z. Li, Kaiwen Xu, Aravind Krishnan, Riqiang Gao, Michael N. Kammer, Sanja Antic, David Xiao, Michael Knight, Yency Martinez, Rafael Paez, Robert J. Lentz, Stephen Deppen, Eric L. Grogan, Thomas A. Lasko, Kim L. Sandler, Fabien Maldonado, Bennett A. Landman

Abstract: Statistical models for predicting lung cancer have the potential to facilitate earlier diagnosis of malignancy and avoid invasive workup of benign disease. Many models have been published, but comparative studies of their utility in different clinical settings in which patients would arguably most benefit are scarce. This study retrospectively evaluated promising predictive models for lung cancer prediction in three clinical settings: lung cancer screening with low-dose computed tomography, incidentally detected pulmonary nodules, and nodules deemed suspicious enough to warrant a biopsy. We leveraged 9 cohorts (n=898, 896, 882, 219, 364, 117, 131, 115, 373) from multiple institutions to assess the area under the receiver operating characteristic curve (AUC) of validated models including logistic regressions on clinical variables and radiologist nodule characterizations, artificial intelligence on chest CTs, longitudinal imaging AI, and multi-modal approaches. We implemented each model from their published literature, re-training the models if necessary, and curated each cohort from primary data sources. We observed that model performance varied greatly across clinical use cases. No single predictive model emerged as a clear winner across all cohorts, but certain models excelled in specific clinical contexts. Single timepoint chest CT AI performed well in lung screening, but struggled to generalize to other clinical settings. Longitudinal imaging and multimodal models demonstrated comparatively promising performance on incidentally-detected nodules. However, when applied to nodules that underwent biopsy, all models underperformed. These results underscore the strengths and limitations of 8 validated predictive models and highlight promising directions towards personalized, noninvasive lung cancer diagnosis.

Submitted 16 May, 2024; originally announced May 2024.

Comments: Submitted to Radiology: AI

arXiv:2404.17454 [pdf, other]

Domain Adaptive and Fine-grained Anomaly Detection for Single-cell Sequencing Data and Beyond

Authors: Kaichen Xu, Yueyang Ding, Suyang Hou, Weiqiang Zhan, Nisang Chen, Jun Wang, Xiaobo Sun

Abstract: Fine-grained anomalous cell detection from affected tissues is critical for clinical diagnosis and pathological research. Single-cell sequencing data provide unprecedented opportunities for this task. However, current anomaly detection methods struggle to handle domain shifts prevalent in multi-sample and multi-domain single-cell sequencing data, leading to suboptimal performance. Moreover, these methods fall short of distinguishing anomalous cells into pathologically distinct subtypes. In response, we propose ACSleuth, a novel, reconstruction deviation-guided generative framework that integrates the detection, domain adaptation, and fine-grained annotating of anomalous cells into a methodologically cohesive workflow. Notably, we present the first theoretical analysis of using reconstruction deviations output by generative models for anomaly detection in lieu of domain shifts. This analysis informs us to develop a novel and superior maximum mean discrepancy-based anomaly scorer in ACSleuth. Extensive benchmarks over various single-cell data and other types of tabular data demonstrate ACSleuth's superiority over the state-of-the-art methods in identifying and subtyping anomalies in multi-sample and multi-domain contexts. Our code is available at https://github.com/Catchxu/ACsleuth.

Submitted 29 April, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

Comments: 17 pages, 2 figures. Accepted by IJCAI 2024

arXiv:2211.02234 [pdf, other]

A Latent Space Model for HLA Compatibility Networks in Kidney Transplantation

Authors: Zhipeng Huang, Kevin S. Xu

Abstract: Kidney transplantation is the preferred treatment for people suffering from end-stage renal disease. Successful kidney transplants still fail over time, known as graft failure; however, the time to graft failure, or graft survival time, can vary significantly between different recipients. A significant biological factor affecting graft survival times is the compatibility between the human leukocyte antigens (HLAs) of the donor and recipient. We propose to model HLA compatibility using a network, where the nodes denote different HLAs of the donor and recipient, and edge weights denote compatibilities of the HLAs, which can be positive or negative. The network is indirectly observed, as the edge weights are estimated from transplant outcomes rather than directly observed. We propose a latent space model for such indirectly-observed weighted and signed networks. We demonstrate that our latent space model can not only result in more accurate estimates of HLA compatibilities, but can also be incorporated into survival analysis models to improve accuracy for the downstream task of predicting graft survival times.

Submitted 3 November, 2022; originally announced November 2022.

Comments: This work has been accepted to BIBM 2022

arXiv:2209.01676 [pdf]

Time-distance vision transformers in lung cancer diagnosis from longitudinal computed tomography

Authors: Thomas Z. Li, Kaiwen Xu, Riqiang Gao, Yucheng Tang, Thomas A. Lasko, Fabien Maldonado, Kim Sandler, Bennett A. Landman

Abstract: Features learned from single radiologic images are unable to provide information about whether and how much a lesion may be changing over time. Time-dependent features computed from repeated images can capture those changes and help identify malignant lesions by their temporal behavior. However, longitudinal medical imaging presents the unique challenge of sparse, irregular time intervals in data acquisition. While self-attention has been shown to be a versatile and efficient learning mechanism for time series and natural images, its potential for interpreting temporal distance between sparse, irregularly sampled spatial features has not been explored. In this work, we propose two interpretations of a time-distance vision transformer (ViT) by using (1) vector embeddings of continuous time and (2) a temporal emphasis model to scale self-attention weights. The two algorithms are evaluated based on benign versus malignant lung cancer discrimination of synthetic pulmonary nodules and lung screening computed tomography studies from the National Lung Screening Trial (NLST). Experiments evaluating the time-distance ViTs on synthetic nodules show a fundamental improvement in classifying irregularly sampled longitudinal images when compared to standard ViTs. In cross-validation on screening chest CTs from the NLST, our methods (0.785 and 0.786 AUC respectively) significantly outperform a cross-sectional approach (0.734 AUC) and match the discriminative performance of the leading longitudinal medical imaging algorithm (0.779 AUC) on benign versus malignant classification. This work represents the first self-attention-based framework for classifying longitudinal medical images. Our code is available at https://github.com/tom1193/time-distance-transformer.

Submitted 4 September, 2022; originally announced September 2022.

Comments: Submitted to SPIE 2023 - Medical Imaging. 10 pages

arXiv:2204.11840 [pdf, other]

Dynamic Ensemble Bayesian Filter for Robust Control of a Human Brain-machine Interface

Authors: Yu Qi, Xinyun Zhu, Kedi Xu, Feixiao Ren, Hongjie Jiang, Junming Zhu, Jianmin Zhang, Gang Pan, Yueming Wang

Abstract: Objective: Brain-machine interfaces (BMIs) aim to provide direct brain control of devices such as prostheses and computer cursors, which have demonstrated great potential for mobility restoration. One major limitation of current BMIs lies in the unstable performance in online control due to the variability of neural signals, which seriously hinders the clinical availability of BMIs. Method: To deal with the neural variability in online BMI control, we propose a dynamic ensemble Bayesian filter (DyEnsemble). DyEnsemble extends Bayesian filters with a dynamic measurement model, which adjusts its parameters in time adaptively with neural changes. This is achieved by learning a pool of candidate functions and dynamically weighting and assembling them according to neural signals. In this way, DyEnsemble copes with variability in signals and improves the robustness of online control. Results: Online BMI experiments with a human participant demonstrate that, compared with the velocity Kalman filter, DyEnsemble significantly improves the control accuracy (increases the success rate by 13.9% and reduces the reach time by 13.5% in the random target pursuit task) and robustness (performs more stably over different experiment days). Conclusion: Our results demonstrate the superiority of DyEnsemble in online BMI control. Significance: DyEnsemble frames a novel and flexible framework for robust neural decoding, which is beneficial to different neural decoding applications.

Submitted 22 April, 2022; originally announced April 2022.

arXiv:2111.04738 [pdf]

doi 10.3390/jimaging8080213

HEROHE Challenge: assessing HER2 status in breast cancer without immunohistochemistry or in situ hybridization

Authors: Eduardo Conde-Sousa, João Vale, Ming Feng, Kele Xu, Yin Wang, Vincenzo Della Mea, David La Barbera, Ehsan Montahaei, Mahdieh Soleymani Baghshah, Andreas Turzynski, Jacob Gildenblat, Eldad Klaiman, Yiyu Hong, Guilherme Aresta, Teresa Araújo, Paulo Aguiar, Catarina Eloy, António Polónia

Abstract: Breast cancer is the most common malignancy in women, being responsible for more than half a million deaths every year. As such, early and accurate diagnosis is of paramount importance. Human expertise is required to diagnose and correctly classify breast cancer and define appropriate therapy, which depends on the evaluation of the expression of different biomarkers such as the transmembrane protein receptor HER2. This evaluation requires several steps, including special techniques such as immunohistochemistry or in situ hybridization to assess HER2 status. With the goal of reducing the number of steps and human bias in diagnosis, the HEROHE Challenge was organized, as a parallel event of the 16th European Congress on Digital Pathology, aiming to automate the assessment of the HER2 status based only on hematoxylin and eosin stained tissue sample of invasive breast cancer. Methods to assess HER2 status were presented by 21 teams worldwide and the results achieved by some of the proposed methods open potential perspectives to advance the state-of-the-art.

Submitted 8 November, 2021; originally announced November 2021.

arXiv:2110.14853 [pdf, other]

Targeted Neural Dynamical Modeling

Authors: Cole Hurwitz, Akash Srivastava, Kai Xu, Justin Jude, Matthew G. Perich, Lee E. Miller, Matthias H. Hennig

Abstract: Latent dynamics models have emerged as powerful tools for modeling and interpreting neural population activity. Recently, there has been a focus on incorporating simultaneously measured behaviour into these models to further disentangle sources of neural variability in their latent space. These approaches, however, are limited in their ability to capture the underlying neural dynamics (e.g. linear) and in their ability to relate the learned dynamics back to the observed behaviour (e.g. no time lag). To this end, we introduce Targeted Neural Dynamical Modeling (TNDM), a nonlinear state-space model that jointly models the neural activity and external behavioural variables. TNDM decomposes neural dynamics into behaviourally relevant and behaviourally irrelevant dynamics; the relevant dynamics are used to reconstruct the behaviour through a flexible linear decoder and both sets of dynamics are used to reconstruct the neural activity through a linear decoder with no time lag. We implement TNDM as a sequential variational autoencoder and validate it on simulated recordings and recordings taken from the premotor and motor cortex of a monkey performing a center-out reaching task. We show that TNDM is able to learn low-dimensional latent dynamics that are highly predictive of behaviour without sacrificing its fit to the neural data.

Submitted 27 October, 2021; originally announced October 2021.

arXiv:2011.02791 [pdf]

doi 10.1080/14786435.2021.1925770

A geometry-based relaxation algorithm for equilibrating a trivalent polygonal network in two dimensions and its implications

Authors: Kai Xu

Abstract: The equilibration of a trivalent polygonal network in two dimensions (2D) is a universal phenomenon in nature, but the underlying mathematical mechanism remains unclear. In this study, a relaxation algorithm based on a simple geometrical rule was developed to simulate the equilibration. The proposed algorithm was implemented in Python. The simulated relaxation changed the polygonal cell of the Voronoi network from an ellipse's inscribed polygon toward the ellipse's maximal inscribed polygon. Meanwhile, the Aboav-Weaire's law, which describes the neighboring relationship between cells, still holds statistically. The success of the simulation strongly supports the ellipse packing hypothesis that was proposed to explain the dynamic behaviors of a trivalent 2D structure. The simulation results also showed that the edge of large cells tends to be shorter than edges of small cells, and vice versa. In addition, the relaxation increases the area and edge length of large cells, and it decreases the area and edge length of small cells. The pattern of changes in the area of different-edged cells due to relaxation is almost the same as the growth pattern described by the von-Neumann-Mullins law. The results presented in this work can help to understand the mathematical mechanisms of the dynamic behaviors of trivalent 2D structures.

Submitted 4 January, 2021; v1 submitted 5 November, 2020; originally announced November 2020.

Journal ref: Philosophical Magazine, 2021, 101(14): 1632-1653

arXiv:2005.04073 [pdf, other]

doi 10.1109/EMBC44109.2020.9175293

Multi-Instance Multi-Label Learning for Gene Mutation Prediction in Hepatocellular Carcinoma

Authors: Kaixin Xu, Ziyuan Zhao, Jiapan Gu, Zeng Zeng, Chan Wan Ying, Lim Kheng Choon, Thng Choon Hua, Pierce KH Chow

Abstract: Gene mutation prediction in hepatocellular carcinoma (HCC) is of great diagnostic and prognostic value for personalized treatments and precision medicine. In this paper, we tackle this problem with multi-instance multi-label learning to address the difficulties on label correlations, label representations, etc. Furthermore, an effective oversampling strategy is applied for data imbalance. Experimental results have shown the superiority of the proposed approach.

Submitted 8 May, 2020; originally announced May 2020.

Comments: Accepted version to be published in the 42nd IEEE Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2020, Montreal, Canada

Journal ref: 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC)

arXiv:1911.02301 [pdf, other]

doi 10.1007/s11071-021-06704-9

Diversity of neuronal activity is provided by hybrid synapses

Authors: Kesheng Xu, Jean Paul Maidana, Patricio Orio

Abstract: Many experiments have evidenced that electrical and chemical synapses -- hybrid synapses -- coexist in most organisms and brain structures. The role of electrical and chemical synapse connection in diversity of neural activity generation has been investigated separately in networks of varying complexities. Nevertheless, theoretical understanding of hybrid synapses in diverse dynamical states of neural networks for self-organization and robustness still has not been fully studied. Here, we present a model of neural network built with hybrid synapses to investigate the emergence of global and collective dynamics states. This neural network consists of excitatory and inhibitory populations interacting together. The excitatory population is connected by excitatory synapses in small world topology and its adjacent neurons are also connected by gap junctions. The inhibitory population is only connected by chemical inhibitory synapses with all-to-all interaction. Our numerical simulations show that in the balanced networks with absence of electrical coupling, the synchrony states generated by this architecture are mainly controlled by heterogeneity among neurons and the balance of its excitatory and inhibitory inputs. In balanced networks with strong electrical coupling, several dynamical states arise from different combinations of excitatory and inhibitory weights. More importantly, we find that these states, such as synchronous firing, cluster synchrony, and various ripples events, emerge by slight modification of chemical coupling weights. For large enough electrical synapse coupling, the whole neural network becomes synchronized. Our results pave a way in the study of the dynamical mechanisms and computational significance of the contribution of mixed synapse in the neural functions.

Submitted 6 November, 2019; originally announced November 2019.

Journal ref: Nonlinear Dynamics, 2021

arXiv:1905.12375 [pdf, other]

Scalable Spike Source Localization in Extracellular Recordings using Amortized Variational Inference

Authors: Cole L. Hurwitz, Kai Xu, Akash Srivastava, Alessio P. Buccino, Matthias H. Hennig

Abstract: Determining the positions of neurons in an extracellular recording is useful for investigating functional properties of the underlying neural circuitry. In this work, we present a Bayesian modelling approach for localizing the source of individual spikes on high-density, microelectrode arrays. To allow for scalable inference, we implement our model as a variational autoencoder and perform amortized variational inference. We evaluate our method on both biophysically realistic simulated and real extracellular datasets, demonstrating that it is more accurate than and can improve spike sorting performance over heuristic localization methods such as center of mass.

Submitted 26 January, 2022; v1 submitted 29 May, 2019; originally announced May 2019.

arXiv:1901.00785 [pdf, other]

A^2-Net: Molecular Structure Estimation from Cryo-EM Density Volumes

Authors: Kui Xu, Zhe Wang, Jiang** Shi, Hongsheng Li, Qiangfeng Cliff Zhang

Abstract: Constructing molecular structural models from Cryo-Electron Microscopy (Cryo-EM) density volumes is the critical last step of structure determination by Cryo-EM technologies. Methods have evolved from manual construction by structural biologists to performing 6D translation-rotation searching, which is extremely compute-intensive. In this paper, we propose a learning-based method and formulate this problem as a vision-inspired 3D detection and pose estimation task. We develop a deep learning framework for amino acid determination in a 3D Cryo-EM density volume. We also design a sequence-guided Monte Carlo Tree Search (MCTS) to thread over the candidate amino acids to form the molecular structure. This framework achieves 91% coverage on our newly proposed dataset and takes only a few minutes for a typical structure with a thousand amino acids. Our method is hundreds of times faster and several times more accurate than existing automated solutions without any human intervention.

Submitted 12 February, 2019; v1 submitted 3 January, 2019; originally announced January 2019.

Comments: 8 pages, 5 figures, 4 tables

Journal ref: published on AAAI2019

arXiv:1006.4397 [pdf, other]

Statistical analysis on detecting recombination sites in DNA-beta satellites associated with the old world geminiviruses

Authors: Kai Xu, Ruriko Yoshida

Abstract: Although an exchange of genetic information by recombination plays an important role in the evolution of viruses, it is not clear how it generates diversity. Geminiviruses are plant viruses which have ambisense single-stranded circular DNA genomes and are among the most economically important plant viruses in agricultural production. Small circular single-stranded DNA satellites, termed DNA-β, have recently been found associated with some geminivirus infections. In this paper we analyze a satellite molecule DNA-β of geminiviruses for recombination events using phylogenetic and statistical analysis and we find that one strain from ToLCMaB has a recombination pattern and is possibly a recombinant molecule between two strains from two species, PaLCuB-[IN:Chi:05] (major parent) and ToLCB-[IN:CP:04] (minor parent).

Submitted 12 September, 2010; v1 submitted 22 June, 2010; originally announced June 2010.

Comments: 8 figures and 2 tables. To appear in Frontiers in Systems Biology

Showing 1–13 of 13 results for author: Xu, K