Search | arXiv e-print repository

Self-Attention in Transformer Networks Explains Monkeys' Gaze Pattern in Pac-Man Game

Authors: Zhongqiao Lin, Yunwei Li, Tianming Yang

Abstract: We proactively direct our eyes and attention to collect information during problem solving and decision making. Understanding gaze patterns is crucial for gaining insights into the computation underlying the problem-solving process. However, there is a lack of interpretable models that can account for how the brain directs the eyes to collect information and utilize it, especially in the context of complex problem solving. In the current study, we analyzed the gaze patterns of two monkeys playing the Pac-Man game. We trained a transformer network to mimic the monkeys' gameplay and found its attention pattern captures the monkeys' eye movements. In addition, the prediction based on the transformer network's attention outperforms the human subjects' predictions. Importantly, we dissected the computation underlying the attention mechanism of the transformer network, revealing its layered structures reflecting a value-based attention component and a component that captures the interactions between Pac-Man and other game objects. Based on these findings, we built a condensed attention model that is not only as accurate as the transformer network but also fully interpretable. Our results highlight the potential of using transformer neural networks to model and understand the cognitive processes underlying complex problem solving in the brain, opening new avenues for investigating the neural basis of cognition.

Submitted 20 June, 2024; originally announced June 2024.

arXiv:2401.09641 [pdf, ps, other]

Functional Linear Non-Gaussian Acyclic Model for Causal Discovery

Authors: Tian-Le Yang, Kuang-Yao Lee, Kun Zhang, Joe Suzuki

Abstract: In causal discovery, non-Gaussianity has been used to characterize the complete configuration of a Linear Non-Gaussian Acyclic Model (LiNGAM), encompassing both the causal ordering of variables and their respective connection strengths. However, LiNGAM can only deal with the finite-dimensional case. To expand this concept, we extend the notion of variables to encompass vectors and even functions, leading to the Functional Linear Non-Gaussian Acyclic Model (Func-LiNGAM). Our motivation stems from the desire to identify causal relationships in brain-effective connectivity tasks involving, for example, fMRI and EEG datasets. We demonstrate why the original LiNGAM fails to handle these inherently infinite-dimensional datasets and explain the availability of functional data analysis from both empirical and theoretical perspectives. We establish theoretical guarantees of the identifiability of the causal relationship among non-Gaussian random vectors and even random functions in infinite-dimensional Hilbert spaces. To address the issue of sparsity in discrete time points within intrinsic infinite-dimensional functional data, we propose optimizing the coordinates of the vectors using functional principal component analysis. Experimental results on synthetic data verify the ability of the proposed framework to identify causal relationships among multivariate functions using the observed samples. For real data, we focus on analyzing the brain connectivity patterns derived from fMRI data.

Submitted 17 January, 2024; originally announced January 2024.

arXiv:2311.11046 [pdf]

DenseNet and Support Vector Machine classifications of major depressive disorder using vertex-wise cortical features

Authors: Vladimir Belov, Tracy Erwin-Grabner, Ling-Li Zeng, Christopher R. K. Ching, Andre Aleman, Alyssa R. Amod, Zeynep Basgoze, Francesco Benedetti, Bianca Besteher, Katharina Brosch, Robin Bülow, Romain Colle, Colm G. Connolly, Emmanuelle Corruble, Baptiste Couvy-Duchesne, Kathryn Cullen, Udo Dannlowski, Christopher G. Davey, Annemiek Dols, Jan Ernsting, Jennifer W. Evans, Lukas Fisch, Paola Fuentes-Claramonte, Ali Saffet Gonul, Ian H. Gotlib , et al. (63 additional authors not shown)

Abstract: Major depressive disorder (MDD) is a complex psychiatric disorder that affects the lives of hundreds of millions of individuals around the globe. Even today, researchers debate if morphological alterations in the brain are linked to MDD, likely due to the heterogeneity of this disorder. The application of deep learning tools to neuroimaging data, capable of capturing complex non-linear patterns, has the potential to provide diagnostic and predictive biomarkers for MDD. However, previous attempts to demarcate MDD patients and healthy controls (HC) based on segmented cortical features via linear machine learning approaches have reported low accuracies. In this study, we used globally representative data from the ENIGMA-MDD working group containing an extensive sample of people with MDD (N=2,772) and HC (N=4,240), which allows a comprehensive analysis with generalizable results. Based on the hypothesis that integration of vertex-wise cortical features can improve classification performance, we evaluated the classification of a DenseNet and a Support Vector Machine (SVM), with the expectation that the former would outperform the latter. As we analyzed a multi-site sample, we additionally applied the ComBat harmonization tool to remove potential nuisance effects of site. We found that both classifiers exhibited close to chance performance (balanced accuracy DenseNet: 51%; SVM: 53%), when estimated on unseen sites. Slightly higher classification performance (balanced accuracy DenseNet: 58%; SVM: 55%) was found when the cross-validation folds contained subjects from all sites, indicating site effect. In conclusion, the integration of vertex-wise morphometric features and the use of the non-linear classifier did not lead to the differentiability between MDD and HC. Our results support the notion that MDD classification on this combination of features and classifiers is unfeasible.

Submitted 18 November, 2023; originally announced November 2023.

arXiv:2206.08122 [pdf]

Multi-site benchmark classification of major depressive disorder using machine learning on cortical and subcortical measures

Authors: Vladimir Belov, Tracy Erwin-Grabner, Ali Saffet Gonul, Alyssa R. Amod, Amar Ojha, Andre Aleman, Annemiek Dols, Anouk Scharntee, Aslihan Uyar-Demir, Ben J Harrison, Benson M. Irungu, Bianca Besteher, Bonnie Klimes-Dougan, Brenda W. J. H. Penninx, Bryon A. Mueller, Carlos Zarate, Christopher G. Davey, Christopher R. K. Ching, Colm G. Connolly, Cynthia H. Y. Fu, Dan J. Stein, Danai Dima, David E. J. Linden, David M. A. Mehler, Edith Pomarol-Clotet , et al. (41 additional authors not shown)

Abstract: Machine learning (ML) techniques have gained popularity in the neuroimaging field due to their potential for classifying neuropsychiatric disorders. However, the diagnostic predictive power of the existing algorithms has been limited by small sample sizes, lack of representativeness, data leakage, and/or overfitting. Here, we overcome these limitations with the largest multi-site sample size to date (n=5,356) to provide a generalizable ML classification benchmark of major depressive disorder (MDD). Using brain measures from standardized ENIGMA analysis pipelines in FreeSurfer, we were able to classify MDD vs healthy controls (HC) with around 62% balanced accuracy, but when harmonizing the data using ComBat balanced accuracy dropped to approximately 52%. Similar results were observed in stratified groups according to age of onset, antidepressant use, number of episodes and sex. Future studies incorporating higher dimensional brain imaging/phenotype features, and/or using more advanced machine and deep learning methods may achieve more encouraging prospects.

Submitted 25 October, 2022; v1 submitted 16 June, 2022; originally announced June 2022.

Comments: main document 37 pages; supplementary material 24 pages

arXiv:2012.01981 [pdf, other]

Advanced Graph and Sequence Neural Networks for Molecular Property Prediction and Drug Discovery

Authors: Zhengyang Wang, Meng Liu, Youzhi Luo, Zhao Xu, Yaochen Xie, Limei Wang, Lei Cai, Qi Qi, Zhuoning Yuan, Tianbao Yang, Shuiwang Ji

Abstract: Properties of molecules are indicative of their functions and thus are useful in many applications. With the advances of deep learning methods, computational approaches for predicting molecular properties are gaining increasing momentum. However, customized and advanced methods and comprehensive tools for this task are currently lacking. Here we develop a suite of comprehensive machine learning methods and tools spanning different computational models, molecular representations, and loss functions for molecular property prediction and drug discovery. Specifically, we represent molecules as both graphs and sequences. Built on these representations, we develop novel deep models for learning from molecular graphs and sequences. In order to learn effectively from highly imbalanced datasets, we develop advanced loss functions that optimize areas under precision-recall curves. Altogether, our work not only serves as a comprehensive tool, but also contributes towards developing novel and advanced graph and sequence learning methodologies. Results on both online and offline antibiotics discovery and molecular property prediction tasks show that our methods achieve consistent improvements over prior methods. In particular, our methods achieve #1 ranking in terms of both ROC-AUC and PRC-AUC on the AI Cures Open Challenge for drug discovery related to COVID-19. Our software is released as part of the MoleculeX library under AdvProp.

Submitted 6 July, 2021; v1 submitted 1 December, 2020; originally announced December 2020.

Comments: Supplementary Material: https://github.com/divelab/MoleculeX/blob/master/AdvProp/AdvProp_supp.pdf

arXiv:2008.05644 [pdf, other]

A Deep Learning Approach for COVID-19 Trend Prediction

Authors: Tong Yang, Long Sha, Justin Li, Pengyu Hong

Abstract: In this work, we developed a deep learning model-based approach to forecast the spreading trend of SARS-CoV-2 in the United States. We implemented the designed model using United States confirmed cases and state demographic data and achieved promising trend prediction results. The model incorporates demographic information and epidemic time-series data through a Gated Recurrent Unit structure. The identification of dominating demographic factors is delivered in the end.

Submitted 9 August, 2020; originally announced August 2020.

Comments: 7 pages, 11 figures, accepted by KDD 2020 epiDAMIK workshop

arXiv:2005.10948 [pdf, other]

CovidNet: To Bring Data Transparency in the Era of COVID-19

Authors: Tong Yang, Kai Shen, Sixuan He, Enyu Li, Peter Sun, **ying Chen, Lin Zuo, Jiayue Hu, Yiwen Mo, Weiwei Zhang, Haonan Zhang, **gxue Chen, Yu Guo

Abstract: Timely, creditable, and fine-granular case information is vital for local communities and individual citizens to make rational and data-driven responses to the COVID-19 pandemic. This paper presents CovidNet, a COVID-19 tracking project associated with a large scale epidemic dataset, which was initiated by 1Point3Acres. To the best of our knowledge, the project is the only platform providing real-time global case information of more than 4,124 sub-divisions from over 27 countries worldwide with multi-language supports. The platform also offers interactive visualization tools to analyze the full historical case curves in each region. Initially launched as a voluntary project to bridge the data transparency gap in North America in January 2020, this project by far has become one of the major independent sources worldwide and has been consumed by many other tracking platforms. The accuracy and freshness of the dataset is a result of the painstaking efforts from our voluntary teamwork, crowd-sourcing channels, and automated data pipelines. As of May 18, 2020, the project website has been visited more than 200 million times and the CovidNet dataset has empowered over 522 institutions and organizations worldwide in policy-making and academic researches. All datasets are openly accessible for non-commercial purposes at https://coronavirus.1point3acres.com via a formal request through our APIs.

Submitted 20 July, 2020; v1 submitted 21 May, 2020; originally announced May 2020.

Comments: 10 pages, 5 figures, 2 tables

arXiv:2003.00110 [pdf]

doi 10.1186/s13059-021-02443-7

Technology dictates algorithms: Recent developments in read alignment

Authors: Mohammed Alser, Jeremy Rotman, Kodi Taraszka, Huwenbo Shi, Pelin Icer Baykal, Harry Taegyun Yang, Victor Xue, Sergey Knyazev, Benjamin D. Singer, Brunilda Balliu, David Koslicki, Pavel Skums, Alex Zelikovsky, Can Alkan, Onur Mutlu, Serghei Mangul

Abstract: Massively parallel sequencing techniques have revolutionized biological and medical sciences by providing unprecedented insight into the genomes of humans, animals, and microbes. Modern sequencing platforms generate enormous amounts of genomic data in the form of nucleotide sequences or reads. Aligning reads onto reference genomes enables the identification of individual-specific genetic variants and is an essential step of the majority of genomic analysis pipelines. Aligned reads are essential for answering important biological questions, such as detecting mutations driving various human diseases and complex traits as well as identifying species present in metagenomic samples. The read alignment problem is extremely challenging due to the large size of analyzed datasets and numerous technological limitations of sequencing platforms, and researchers have developed novel bioinformatics algorithms to tackle these difficulties. Importantly, computational algorithms have evolved and diversified in accordance with technological advances, leading to today's diverse array of bioinformatics tools. Our review provides a survey of algorithmic foundations and methodologies across 107 alignment methods published between 1988 and 2020, for both short and long reads. We provide rigorous experimental evaluation of 11 read aligners to demonstrate the effect of these underlying algorithms on speed and efficiency of read aligners. We separately discuss how longer read lengths produce unique advantages and limitations to read alignment techniques. We also discuss how general alignment algorithms have been tailored to the specific needs of various domains in biology, including whole transcriptome, adaptive immune repertoire, and human microbiome studies.

Submitted 9 July, 2020; v1 submitted 28 February, 2020; originally announced March 2020.

Journal ref: Genome Biol. Aug 26;22(1):249, 2021

arXiv:1709.03645 [pdf, other]

Identifying Genetic Risk Factors via Sparse Group Lasso with Group Graph Structure

Authors: Tao Yang, Paul Thompson, Sihai Zhao, Jieping Ye

Abstract: Genome-wide association studies (GWA studies or GWAS) investigate the relationships between genetic variants such as single-nucleotide polymorphisms (SNPs) and individual traits. Recently, incorporating biological priors together with machine learning methods in GWA studies has attracted increasing attention. However, in the real world, nucleotide-level bio-priors have not been well-studied to date. Alternatively, studies at gene-level, for example, protein-protein interactions and pathways, are more rigorous and legitimate, and it is potentially beneficial to utilize such gene-level priors in GWAS. In this paper, we proposed a novel two-level structured sparse model, called Sparse Group Lasso with Group-level Graph structure (SGLGG), for GWAS. It can be considered as a sparse group Lasso along with a group-level graph Lasso. Essentially, SGLGG penalizes the nucleotide-level sparsity as well as takes advantage of gene-level priors (both gene groups and networks), to identify phenotype-associated risk SNPs. We employ the alternating direction method of multipliers algorithm to optimize the proposed model. Our experiments on the Alzheimer's Disease Neuroimaging Initiative whole genome sequence data and neuroimage data demonstrate the effectiveness of SGLGG. As a regression model, it is competitive with state-of-the-art sparse models; as a variable selection method, SGLGG is promising for identifying Alzheimer's disease-related risk SNPs.

Submitted 11 September, 2017; originally announced September 2017.

arXiv:1403.1914

doi 10.1142/S021797921450235X

Desynchronization in an ensemble of globally coupled chaotic bursting neuronal oscillators by dynamic delayed feedback control

Authors: Yanqiu Che, Tingting Yang, Ruixue Li, Huiyan Li, Chunxiao Han, Jiang Wang, Xile Wei

Abstract: In this paper, we propose a dynamic delayed feedback control approach for desynchronization of chaotic-bursting synchronous activities in an ensemble of globally coupled neuronal oscillators. We demonstrate that the difference signal between an ensemble's mean field and its time delayed state, filtered and fed back to the ensemble, can suppress the self-synchronization in the ensemble. These individual units are decoupled and stabilized at the desired desynchronized states while the stimulation signal reduces to the noise level. The effectiveness of the method is illustrated by examples of two different populations of globally coupled chaotic-bursting neurons. The proposed method has potential for mild, effective and demand-controlled therapy of neurological diseases characterized by pathological synchronization.

Submitted 11 March, 2014; v1 submitted 7 March, 2014; originally announced March 2014.

Comments: This paper has been withdrawn by the author due to a crucial sign error in equation 3

arXiv:1306.2584 [pdf, other]

Multi-cancer molecular signatures and their interrelationships

Authors: Wei-Yi Cheng, Tai-Hsien Ou Yang, Hui Shen, Peter W. Laird, Dimitris Anastassiou, the Cancer Genome Atlas Research Network

Abstract: Although cancer is known to be characterized by several unifying biological hallmarks, systems biology has had limited success in identifying molecular signatures present in all types of cancer. The current availability of rich data sets from many different cancer types provides an opportunity for thorough computational data mining in search of such common patterns. Here we report the identification of 18 "pan-cancer" molecular signatures resulting from analysis of data sets containing values from mRNA expression, microRNA expression, DNA methylation, and protein activity, from twelve different cancer types. The membership of many of these signatures points to particular biological mechanisms related to cancer progression, suggesting that they represent important attributes of cancer in need of being elucidated for potential applications in diagnostic, prognostic and therapeutic products applicable to multiple cancer types.

Submitted 11 July, 2013; v1 submitted 11 June, 2013; originally announced June 2013.

Comments: [07.11.2013 v2] Additional authors and acknowledgements for people who contributed to the interpretation of attractor signatures. Summarized table for all 18 signatures. Comments on possible functions

Showing 1–11 of 11 results for author: Yang, T