Search | arXiv e-print repository

Gene-associated Disease Discovery Powered by Large Language Models

Authors: Jiayu Chang, Shiyu Wang, Chen Ling, Zhaohui Qin, Liang Zhao

Abstract: The intricate relationship between genetic variation and human diseases has been a focal point of medical research, evidenced by the identification of risk genes regarding specific diseases. The advent of advanced genome sequencing techniques has significantly improved the efficiency and cost-effectiveness of detecting these genetic markers, playing a crucial role in disease diagnosis and forming the basis for clinical decision-making and early risk assessment. To overcome the limitations of existing databases that record disease-gene associations from existing literature, which often lack real-time updates, we propose a novel framework employing Large Language Models (LLMs) for the discovery of diseases associated with specific genes. This framework aims to automate the labor-intensive process of sifting through medical literature for evidence linking genetic variations to diseases, thereby enhancing the efficiency of disease identification. Our approach involves using LLMs to conduct literature searches, summarize relevant findings, and pinpoint diseases related to specific genes. This paper details the development and application of our LLM-powered framework, demonstrating its potential in streamlining the complex process of literature retrieval and summarization to identify diseases associated with specific genetic variations.

Submitted 16 January, 2024; originally announced January 2024.

Comments: This is the official paper accepted by AAAI 2024 Workshop on Large Language Models for Biological Discoveries

arXiv:2006.05509 [pdf]

Can artificial intelligence (AI) be used to accurately detect tuberculosis (TB) from chest X-rays? An evaluation of five AI products for TB screening and triaging in a high TB burden setting

Authors: Zhi Zhen Qin, Shahriar Ahmed, Mohammad Shahnewaz Sarker, Kishor Paul, Ahammad Shafiq Sikder Adel, Tasneem Naheyan, Rachael Barrett, Sayera Banu, Jacob Creswell

Abstract: Artificial intelligence (AI) products can be trained to recognize tuberculosis (TB)-related abnormalities on chest radiographs. Various AI products are available commercially, yet there is a lack of evidence on how their performance compares with each other and with radiologists. We evaluated five AI software products for screening and triaging TB using a large dataset that had not been used to train any commercial AI products. Individuals (>=15 years old) presenting to three TB screening centers in Dhaka, Bangladesh, were recruited consecutively. All chest X-rays (CXR) were read independently by a group of three Bangladeshi registered radiologists and five commercial AI products: CAD4TB (v7), InferReadDR (v2), Lunit INSIGHT CXR (v4.9.0), JF CXR-1 (v2), and qXR (v3). All five AI products significantly outperformed the Bangladeshi radiologists. The areas under the receiver operating characteristic curve are qXR: 90.81% (95% CI: 90.33-91.29%), CAD4TB: 90.34% (95% CI: 89.81-90.87%), Lunit INSIGHT CXR: 88.61% (95% CI: 88.03-89.20%), InferReadDR: 84.90% (95% CI: 84.27-85.54%) and JF CXR-1: 84.89% (95% CI: 84.26-85.53%). Only qXR met the target product profile (TPP), with 74.3% specificity at 90% sensitivity. Five AI algorithms can reduce the number of Xpert tests required by 50%, while maintaining a sensitivity above 90%. All AI algorithms performed worse among older people and people with a prior TB history. AI products can be highly accurate and useful screening and triage tools for TB detection in high burden regions and outperform human readers.

Submitted 28 May, 2021; v1 submitted 9 June, 2020; originally announced June 2020.

Comments: 43 pages, 3 tables, 3 figures

MSC Class: 92B20 ACM Class: I.2.1

arXiv:1902.08357 [pdf]

Cognitive computation of brain disorders based primarily on ocular responses

Authors: Xiaotao Li, Xuejing Chen, Fangfang Fan, Li Ning, Kangguang Lin, Zan Chen, Zhenyun Qin, Albert S. Yeung, Liping Wang, Xiaojian Li, Kwok-Fai So

Abstract: The present review presents multiple techniques in which ocular assessments may serve as a noninvasive approach for the early diagnosis of various cognitive and psychiatric disorders, such as Alzheimer's disease (AD), autism spectrum disorder (ASD), schizophrenia (SZ), and major depressive disorder (MDD). Real-time ocular responses are tightly associated with emotional and cognitive processing within the central nervous system. Patterns seen in saccades, pupillary responses, and blinking, as well as retinal microvasculature and morphology visualized via office-based ophthalmic imaging, are potential biomarkers for the screening and evaluation of cognitive and psychiatric disorders. Additionally, rapid advances in artificial intelligence (AI) present a growing opportunity to use machine-learning-based AI, especially deep-learning neural networks, to shed new light on the field of cognitive neuroscience, which may lead to novel evaluations and interventions via ocular approaches for cognitive and psychiatric disorders.

Submitted 3 April, 2020; v1 submitted 21 February, 2019; originally announced February 2019.

arXiv:1504.06463 [pdf]

The dichotomy structure of Y chromosome Haplogroup N

Authors: Kang Hu, Shi Yan, Kai Liu, Chao Ning, Lan-Hai Wei, Shi-Lin Li, Bing Song, Ge Yu, Feng Chen, Li-Jun Liu, Zhi-Peng Zhao, Chuan-Chao Wang, Ya-Jun Yang, Zhen-Dong Qin, Jing-Ze Tan, Fu-Zhong Xue, Hui Li, Long-Li Kang, Li Jin

Abstract: Haplogroup N-M231 of the human Y chromosome is a common clade from Eastern Asia to Northern Europe and is one of the most frequent haplogroups in Altaic and Uralic-speaking populations. Using newly discovered bi-allelic markers from high-throughput DNA sequencing, we substantially improved the phylogeny of Haplogroup N, in which 16 subclades could be identified by 33 SNPs. More than 400 males belonging to Haplogroup N in 34 populations in China were successfully genotyped, and populations in Northern Asia and Eastern Europe were also included for comparison. We found that all the N samples fell within either clade N1-F1206 (including the former N1a-M128, N1b-P43 and N1c-M46 clades), most of which were found in Altaic, Uralic, Russian and Chinese-speaking populations, or clade N2-F2930, common in Tibeto-Burman and Chinese-speaking populations. Our detailed results suggest that Haplogroup N has developed in the region of China since the final stage of the late Paleolithic Era.

Submitted 24 April, 2015; originally announced April 2015.

Comments: main text 14 pages, 3 figures, 1 table, 3 SI tables

arXiv:1310.3897 [pdf]

doi 10.1371/journal.pone.0105691

Y Chromosomes of 40% Chinese Are Descendants of Three Neolithic Super-grandfathers

Authors: Shi Yan, Chuan-Chao Wang, Hong-Xiang Zheng, Wei Wang, Zhen-Dong Qin, Lan-Hai Wei, Yi Wang, Xue-Dong Pan, Wen-Qing Fu, Yun-Gang He, Li-Jun Xiong, Wen-Fei Jin, Shi-Lin Li, Yu An, Hui Li, Li Jin

Abstract: Demographic change of human populations is one of the central questions for delving into the past of human beings. To identify major population expansions related to male lineages, we sequenced 78 East Asian Y chromosomes at 3.9 Mbp of the non-recombining region (NRY), discovered >4,000 new SNPs, and identified many new clades. The relative divergence dates can be estimated much more precisely using a molecular clock. We found that all the Paleolithic divergences were binary; however, three strong star-like Neolithic expansions at ~6 kya (thousand years ago) (assuming a constant substitution rate of 1e-9/bp/year) indicate that ~40% of modern Chinese are patrilineal descendants of only three super-grandfathers at that time. This observation suggests that the main patrilineal expansion in China occurred in the Neolithic Era and might be related to the development of agriculture.

Submitted 14 October, 2013; originally announced October 2013.

Comments: 29 pages of article text including 1 article figure, 9 pages of SI text, and 2 SI figures. 5 SI tables are in a separate ancillary file

Journal ref: PLoS ONE 9(8): e105691 (2014)

Showing 1–5 of 5 results for author: Qin, Z