Search | arXiv e-print repository

arXiv:2403.00093 [pdf]

Synthesizing study-specific controls using generative models on open access datasets for harmonized multi-study analyses

Authors: Shruti P. Gadewar, Alyssa H. Zhu, Iyad Ba Gari, Sunanda Somu, Sophia I. Thomopoulos, Paul M. Thompson, Talia M. Nir, Neda Jahanshad

Abstract: Neuroimaging consortia can enhance reliability and generalizability of findings by pooling data across studies to achieve larger sample sizes. To adjust for site and MRI protocol effects, imaging datasets are often harmonized based on healthy controls. When data from a control group were not collected, statistical harmonization options are limited as patient characteristics and acquisition-related variables may be confounded. Here, in a multi-study neuroimaging analysis of Alzheimer's patients and controls, we tested whether it is possible to generate synthetic control MRIs. For one case-control study, we used a generative adversarial model for style-based harmonization to generate site-specific controls. Downstream feature extraction, statistical harmonization and group-level multi-study case-control and case-only analyses were performed twice, using either true or synthetic controls. All effect sizes using synthetic controls overlapped with those based on true study controls. This line of work may facilitate wider inclusion of case-only studies in multi-study consortia.

Submitted 29 February, 2024; originally announced March 2024.

arXiv:2311.11046 [pdf]

DenseNet and Support Vector Machine classifications of major depressive disorder using vertex-wise cortical features

Authors: Vladimir Belov, Tracy Erwin-Grabner, Ling-Li Zeng, Christopher R. K. Ching, Andre Aleman, Alyssa R. Amod, Zeynep Basgoze, Francesco Benedetti, Bianca Besteher, Katharina Brosch, Robin Bülow, Romain Colle, Colm G. Connolly, Emmanuelle Corruble, Baptiste Couvy-Duchesne, Kathryn Cullen, Udo Dannlowski, Christopher G. Davey, Annemiek Dols, Jan Ernsting, Jennifer W. Evans, Lukas Fisch, Paola Fuentes-Claramonte, Ali Saffet Gonul, Ian H. Gotlib, et al. (63 additional authors not shown)

Abstract: Major depressive disorder (MDD) is a complex psychiatric disorder that affects the lives of hundreds of millions of individuals around the globe. Even today, researchers debate whether morphological alterations in the brain are linked to MDD, likely due to the heterogeneity of this disorder. The application of deep learning tools to neuroimaging data, capable of capturing complex non-linear patterns, has the potential to provide diagnostic and predictive biomarkers for MDD. However, previous attempts to demarcate MDD patients and healthy controls (HC) based on segmented cortical features via linear machine learning approaches have reported low accuracies. In this study, we used globally representative data from the ENIGMA-MDD working group containing an extensive sample of people with MDD (N=2,772) and HC (N=4,240), which allows a comprehensive analysis with generalizable results. Based on the hypothesis that integration of vertex-wise cortical features can improve classification performance, we evaluated the classification of a DenseNet and a Support Vector Machine (SVM), with the expectation that the former would outperform the latter. As we analyzed a multi-site sample, we additionally applied the ComBat harmonization tool to remove potential nuisance effects of site. We found that both classifiers exhibited close to chance performance (balanced accuracy DenseNet: 51%; SVM: 53%) when estimated on unseen sites. Slightly higher classification performance (balanced accuracy DenseNet: 58%; SVM: 55%) was found when the cross-validation folds contained subjects from all sites, indicating a site effect. In conclusion, the integration of vertex-wise morphometric features and the use of the non-linear classifier did not make MDD and HC differentiable. Our results support the notion that MDD classification on this combination of features and classifiers is unfeasible.

Submitted 18 November, 2023; originally announced November 2023.
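The abstract above contrasts balanced accuracy on unseen sites with folds that mix all sites. A minimal sketch of those two ingredients follows, in plain Python. The function names `balanced_accuracy` and `leave_site_out_splits` are hypothetical and chosen for illustration; this is not the authors' evaluation code, and it omits the classifiers and ComBat harmonization entirely.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; 0.5 is chance level for two classes."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def leave_site_out_splits(sites):
    """Yield (held_out_site, train_idx, test_idx), holding out each site once.

    Assumption: `sites` is a list giving the acquisition site of each subject.
    """
    for held_out in sorted(set(sites)):
        train = [i for i, s in enumerate(sites) if s != held_out]
        test = [i for i, s in enumerate(sites) if s == held_out]
        yield held_out, train, test
```

A classifier that always predicts the majority class scores 0.5 under this metric regardless of class imbalance, which is why values near 50% are read as chance performance.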

arXiv:2309.07352 [pdf]

Tackling the dimensions in imaging genetics with CLUB-PLS

Authors: Andre Altmann, Ana C Lawry Aguila, Neda Jahanshad, Paul M Thompson, Marco Lorenzi

Abstract: A major challenge in imaging genetics and similar fields is to link high-dimensional data in one domain, e.g., genetic data, to high-dimensional data in a second domain, e.g., brain imaging data. The standard approach in the area is mass univariate analysis across genetic factors and imaging phenotypes. That entails executing one genome-wide association study (GWAS) for each pre-defined imaging measure. Although this approach has been tremendously successful, one shortcoming is that phenotypes must be pre-defined. Consequently, effects that are not confined to pre-selected regions of interest or that reflect larger brain-wide patterns can easily be missed. In this work we introduce a Partial Least Squares (PLS)-based framework, which we term Cluster-Bootstrap PLS (CLUB-PLS), that can work with large input dimensions in both domains as well as with large sample sizes. One key factor of the framework is to use cluster bootstrap to provide robust statistics for single input features in both domains. We applied CLUB-PLS to investigate the genetic basis of surface area and cortical thickness in a sample of 33,000 subjects from the UK Biobank. We found 107 genome-wide significant locus-phenotype pairs that are linked to 386 different genes. We found that a vast majority of these loci could be technically validated at a high rate: using classic GWAS or Genome-Wide Inferred Statistics (GWIS) we found that 85 locus-phenotype pairs exceeded the genome-wide suggestive (P<1e-05) threshold.

Submitted 19 September, 2023; v1 submitted 13 September, 2023; originally announced September 2023.

Comments: 12 pages, 4 Figures, 2 Tables

arXiv:2309.04607 [pdf]

Linking Symptom Inventories using Semantic Textual Similarity

Authors: Eamonn Kennedy, Shashank Vadlamani, Hannah M Lindsey, Kelly S Peterson, Kristen Dams OConnor, Kenton Murray, Ronak Agarwal, Houshang H Amiri, Raeda K Andersen, Talin Babikian, David A Baron, Erin D Bigler, Karen Caeyenberghs, Lisa Delano-Wood, Seth G Disner, Ekaterina Dobryakova, Blessen C Eapen, Rachel M Edelstein, Carrie Esopenko, Helen M Genova, Elbert Geuze, Naomi J Goodrich-Hunsaker, Jordan Grafman, Asta K Haberg, Cooper B Hodges, et al. (57 additional authors not shown)

Abstract: An extensive library of symptom inventories has been developed over time to measure clinical symptoms, but this variety has led to several long-standing issues. Most notably, results drawn from different settings and studies are not comparable, which limits reproducibility. Here, we present an artificial intelligence (AI) approach using semantic textual similarity (STS) to link symptoms and scores across previously incongruous symptom inventories. We tested the ability of four pre-trained STS models to screen thousands of symptom description pairs for related content, a challenging task typically requiring expert panels. Models were tasked to predict symptom severity across four different inventories for 6,607 participants drawn from 16 international data sources. The STS approach achieved 74.8% accuracy across five tasks, outperforming other models tested. This work suggests that incorporating contextual, semantic information can assist expert decision-making processes, yielding gains for both general and disease-specific clinical assessment.

Submitted 8 September, 2023; originally announced September 2023.

arXiv:2305.01107 [pdf]

A Comprehensive Corpus Callosum Segmentation Tool for Detecting Callosal Abnormalities and Genetic Associations from Multi Contrast MRIs

Authors: Shruti P. Gadewar, Elnaz Nourollahimoghadam, Ravi R. Bhatt, Abhinaav Ramesh, Shayan Javid, Iyad Ba Gari, Alyssa H. Zhu, Sophia Thomopoulos, Paul M. Thompson, Neda Jahanshad

Abstract: Structural alterations of the midsagittal corpus callosum (midCC) have been associated with a wide range of brain disorders. The midCC is visible on most MRI contrasts and in many acquisitions with a limited field-of-view. Here, we present an automated tool for segmenting and assessing the shape of the midCC from T1w, T2w, and FLAIR images. We train a UNet on images from multiple public datasets to obtain midCC segmentations. A quality control algorithm is also built-in, trained on the midCC shape features. We calculate intraclass correlations (ICC) and average Dice scores in a test-retest dataset to assess segmentation reliability. We test our segmentation on poor quality and partial brain scans. We highlight the biological significance of our extracted features using data from over 40,000 individuals from the UK Biobank; we classify clinically defined shape abnormalities and perform genetic analyses.

Submitted 1 May, 2023; originally announced May 2023.
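The abstract above reports Dice scores between test and retest segmentations. As a hedged illustration of that overlap measure only, the sketch below computes the Dice coefficient for two binary masks given as flat sequences. The function name `dice_score` and the convention of returning 1.0 for two empty masks are assumptions made here, not details taken from the tool.

```python
def dice_score(mask_a, mask_b):
    """Dice coefficient 2*|A and B| / (|A| + |B|) for two flat binary masks."""
    a = [bool(v) for v in mask_a]
    b = [bool(v) for v in mask_b]
    if len(a) != len(b):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    size = sum(a) + sum(b)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if size == 0 else 2.0 * intersection / size
```

A value of 1.0 means the two segmentations label exactly the same voxels, and 0.0 means they share none.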

arXiv:2206.08122 [pdf]

Multi-site benchmark classification of major depressive disorder using machine learning on cortical and subcortical measures

Authors: Vladimir Belov, Tracy Erwin-Grabner, Ali Saffet Gonul, Alyssa R. Amod, Amar Ojha, Andre Aleman, Annemiek Dols, Anouk Scharntee, Aslihan Uyar-Demir, Ben J Harrison, Benson M. Irungu, Bianca Besteher, Bonnie Klimes-Dougan, Brenda W. J. H. Penninx, Bryon A. Mueller, Carlos Zarate, Christopher G. Davey, Christopher R. K. Ching, Colm G. Connolly, Cynthia H. Y. Fu, Dan J. Stein, Danai Dima, David E. J. Linden, David M. A. Mehler, Edith Pomarol-Clotet, et al. (41 additional authors not shown)

Abstract: Machine learning (ML) techniques have gained popularity in the neuroimaging field due to their potential for classifying neuropsychiatric disorders. However, the diagnostic predictive power of the existing algorithms has been limited by small sample sizes, lack of representativeness, data leakage, and/or overfitting. Here, we overcome these limitations with the largest multi-site sample size to date (n=5,356) to provide a generalizable ML classification benchmark of major depressive disorder (MDD). Using brain measures from standardized ENIGMA analysis pipelines in FreeSurfer, we were able to classify MDD vs healthy controls (HC) with around 62% balanced accuracy, but when harmonizing the data using ComBat balanced accuracy dropped to approximately 52%. Similar results were observed in stratified groups according to age of onset, antidepressant use, number of episodes and sex. Future studies incorporating higher dimensional brain imaging/phenotype features, and/or using more advanced machine and deep learning methods may achieve more encouraging prospects.

Submitted 25 October, 2022; v1 submitted 16 June, 2022; originally announced June 2022.

Comments: main document 37 pages; supplementary material 24 pages

arXiv:2204.11206 [pdf, other]

Partial Identification of Dose Responses with Hidden Confounders

Authors: Myrl G. Marmarelis, Elizabeth Haddad, Andrew Jesson, Neda Jahanshad, Aram Galstyan, Greg Ver Steeg

Abstract: Inferring causal effects of continuous-valued treatments from observational data is a crucial task promising to better inform policy- and decision-makers. A critical assumption needed to identify these effects is that all confounding variables -- causal parents of both the treatment and the outcome -- are included as covariates. Unfortunately, given observational data alone, we cannot know with certainty that this criterion is satisfied. Sensitivity analyses provide principled ways to give bounds on causal estimates when confounding variables are hidden. While much attention is focused on sensitivity analyses for discrete-valued treatments, much less is paid to continuous-valued treatments. We present novel methodology to bound both average and conditional average continuous-valued treatment-effect estimates when they cannot be point identified due to hidden confounding. A semi-synthetic benchmark on multiple datasets shows our method giving tighter coverage of the true dose-response curve than a recently proposed continuous sensitivity model and baselines. Finally, we apply our method to a real-world observational case study to demonstrate the value of identifying dose-dependent causal effects.

Submitted 12 June, 2023; v1 submitted 24 April, 2022; originally announced April 2022.

arXiv:2011.09115 [pdf, other]

3D Grid-Attention Networks for Interpretable Age and Alzheimer's Disease Prediction from Structural MRI

Authors: Pradeep Lam, Alyssa H. Zhu, Iyad Ba Gari, Neda Jahanshad, Paul M. Thompson

Abstract: We propose an interpretable 3D Grid-Attention deep neural network that can accurately predict a person's age and whether they have Alzheimer's disease (AD) from a structural brain MRI scan. Building on a 3D convolutional neural network, we added two attention modules at different layers of abstraction, so that features learned are spatially related to the global features for the task. The attention layers allow the network to focus on brain regions relevant to the task, while masking out irrelevant or noisy regions. In evaluations based on 4,561 3-Tesla T1-weighted MRI scans from 4 phases of the Alzheimer's Disease Neuroimaging Initiative (ADNI), salience maps for age and AD prediction partially overlapped, but lower-level features overlapped more than higher-level features. The brain age prediction network also distinguished AD and healthy control groups better than another state-of-the-art method. The resulting visual analyses can distinguish interpretable feature patterns that are important for predicting clinical diagnosis. Future work is needed to test performance across scanners and populations.

Submitted 18 November, 2020; originally announced November 2020.

arXiv:1710.10641 [pdf]

A Fast, Accurate Two-Step Linear Mixed Model for Genetic Analysis Applied to Repeat MRI Measurements

Authors: Qifan Yang, Gennady V. Roshchupkin, Wiro J. Niessen, Sarah E. Medland, Alyssa H. Zhu, Paul M. Thompson, Neda Jahanshad

Abstract: Large-scale biobanks are being collected around the world in efforts to better understand human health and risk factors for disease. They often survey hundreds of thousands of individuals, combining questionnaires with clinical, genetic, demographic, and imaging assessments; some of this data may be collected longitudinally. Genetic association analysis of such datasets requires methods to properly handle relatedness, population structure and other types of biases introduced by confounders. The most popular and accurate approaches rely on linear mixed model (LMM) algorithms, which are iterative, and the computational complexity of each iteration scales with the square of the sample size. This slows the pace of discoveries (up to several days for single trait analysis) and, furthermore, limits the use of repeat phenotypic measurements. Here, we describe our new, non-iterative, much faster and accurate Two-Step Linear Mixed Model (Two-Step LMM) approach, which has a computational complexity that scales linearly with sample size. We show that the first step retains accurate estimates of the heritability (the proportion of the trait variance explained by additive genetic factors), even when increasingly complex genetic relationships between individuals are modeled. The second step provides a faster framework to obtain the effect sizes of covariates in the regression model. We applied Two-Step LMM to real data from the UK Biobank, which recently released genotyping information and processed MRI data from 9,725 individuals. We used the left and right hippocampus volume (HV) as repeated measures, and observed increased and more accurate heritability estimation, consistent with simulations.

Submitted 15 March, 2019; v1 submitted 29 October, 2017; originally announced October 2017.

Comments: 2017 Neural Information Processing Systems (NeurIPS) BigNeuro Workshop

arXiv:1710.05213 [pdf, ps, other]

doi 10.1007/978-3-319-72150-7_102

Simultaneous Matrix Diagonalization for Structural Brain Networks Classification

Authors: Nikita Mokrov, Maxim Panov, Boris A. Gutman, Joshua I. Faskowitz, Neda Jahanshad, Paul M. Thompson

Abstract: This paper considers the problem of brain disease classification based on connectome data. A connectome is a network representation of a human brain. The typical connectome classification problem is very challenging because of the small sample size and high dimensionality of the data. We propose to use simultaneous approximate diagonalization of adjacency matrices in order to compute their eigenstructures in a more stable way. The obtained approximate eigenvalues are further used as features for classification. The proposed approach is demonstrated to be efficient for detection of Alzheimer's disease, outperforming simple baselines and competing with state-of-the-art approaches to brain disease classification.

Submitted 14 October, 2017; originally announced October 2017.

Journal ref: Complex Networks & Their Applications VI. COMPLEX NETWORKS 2017. Studies in Computational Intelligence, vol 689

arXiv:1709.08578 [pdf]

Heritability estimates on resting state fMRI data using the ENIGMA analysis pipeline

Authors: Bhim M. Adhikari, Neda Jahanshad, Dinesh Shukla, Richard C. Reynolds, Robert W. Cox, Els Fieremans, Jelle Veraart, Dmitry S. Novikov, L. Elliot Hong, Paul M. Thompson, Peter Kochunov

Abstract: Big data initiatives such as the Enhancing NeuroImaging Genetics through Meta-Analysis consortium (ENIGMA) combine data collected by independent studies worldwide to achieve more accurate estimates of effect sizes and more reliable and reproducible outcomes. Such efforts require harmonized analysis protocols to consistently extract phenotypes. Even so, challenges include wide variability of fMRI protocols and scanner platforms; this leads to site-to-site variance in quality, resolution and temporal signal-to-noise ratio (tSNR). An effective harmonization should provide optimal measures for data of different qualities. We developed a multi-site rsfMRI analysis pipeline to allow research groups around the world to process rsfMRI scans in a harmonized way, to extract consistent and quantitative measurements of connectivity and to perform coordinated statistical tests. We used the single-modality ENIGMA rsfMRI pipeline based on model-free Marchenko-Pastur PCA based denoising to verify and replicate findings of significant heritability of measures from resting state networks. We analyzed two independent cohorts, GOBS (Genetics of Brain Structure) and HCP (the Human Connectome Project), which collected data using conventional and connectomics oriented fMRI protocols. We used seed-based connectivity and dual-regression approaches to show that rsfMRI signal is consistently heritable across twenty major functional network measures. Heritability values of 20-40% were observed across both cohorts.

Submitted 13 September, 2017; originally announced September 2017.

Comments: 12 pages, 3 figures, PSB 2018

arXiv:1706.06031 [pdf, other]

Evaluating 35 Methods to Generate Structural Connectomes Using Pairwise Classification

Authors: Dmitry Petrov, Alexander Ivanov, Joshua Faskowitz, Boris Gutman, Daniel Moyer, Julio Villalon, Neda Jahanshad, Paul Thompson

Abstract: There is no consensus on how to construct structural brain networks from diffusion MRI. How variations in pre-processing steps affect network reliability and its ability to distinguish subjects remains opaque. In this work, we address this issue by comparing 35 structural connectome-building pipelines. We vary diffusion reconstruction models, tractography algorithms and parcellations. Next, we classify structural connectome pairs as either belonging to the same individual or not. Connectome weights and eight topological derivative measures form our feature set. For experiments, we use three test-retest datasets from the Consortium for Reliability and Reproducibility (CoRR) comprised of a total of 105 individuals. We also compare pairwise classification results to a commonly used parametric test-retest measure, Intraclass Correlation Coefficient (ICC).

Submitted 19 June, 2017; originally announced June 2017.

Comments: Accepted for MICCAI 2017, 8 pages, 3 figures
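The abstract above describes classifying connectome pairs as belonging to the same individual or not. The sketch below shows one way such a labeled pair set could be assembled from repeated scans. The function name `make_pairs` and the choice of the element-wise absolute difference as the pair feature are assumptions for illustration; the abstract does not state how pair features were built.

```python
from itertools import combinations


def make_pairs(scans):
    """Build (pair_feature, label) tuples from (subject_id, feature_vector) scans.

    label is 1 when both scans come from the same subject, else 0.
    Assumption: the pair feature is the element-wise absolute difference.
    """
    pairs = []
    for (sid_a, feat_a), (sid_b, feat_b) in combinations(scans, 2):
        diff = [abs(x - y) for x, y in zip(feat_a, feat_b)]
        pairs.append((diff, int(sid_a == sid_b)))
    return pairs
```

Any binary classifier can then be trained on the resulting pairs; a pipeline whose same-subject pairs are easier to separate from different-subject pairs yields connectomes that are more specific to the individual.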

arXiv:1705.10312 [pdf]

Classification of Major Depressive Disorder via Multi-Site Weighted LASSO Model

Authors: Dajiang Zhu, Brandalyn C. Riedel, Neda Jahanshad, Nynke A. Groenewold, Dan J. Stein, Ian H. Gotlib, Matthew D. Sacchet, Danai Dima, James H. Cole, Cynthia H. Y. Fu, Henrik Walter, Ilya M. Veer, Thomas Frodl, Lianne Schmaal, Dick J. Veltman, Paul M. Thompson

Abstract: Large-scale collaborative analysis of brain imaging data, in psychiatry and neurology, offers a new source of statistical power to discover features that boost accuracy in disease classification, differential diagnosis, and outcome prediction. However, due to data privacy regulations or limited accessibility to large datasets across the world, it is challenging to efficiently integrate distributed information. Here we propose a novel classification framework through multi-site weighted LASSO: each site performs an iterative weighted LASSO for feature selection separately. Within each iteration, the classification result and the selected features are collected to update the weighting parameters for each feature. This new weight is used to guide the LASSO process at the next iteration. Only the features that help to improve the classification accuracy are preserved. In tests on data from five sites (299 patients with major depressive disorder (MDD) and 258 normal controls), our method boosted classification accuracy for MDD by 4.9% on average. This result shows the potential of the proposed new strategy as an effective and practical collaborative platform for machine learning on large scale distributed imaging and biobank data.

Submitted 3 June, 2017; v1 submitted 26 May, 2017; originally announced May 2017.

Comments: Accepted by MICCAI 2017

arXiv:1704.08383 [pdf, other]

Large-scale Feature Selection of Risk Genetic Factors for Alzheimer's Disease via Distributed Group Lasso Regression

Authors: Qingyang Li, Dajiang Zhu, Jie Zhang, Derrek Paul Hibar, Neda Jahanshad, Yalin Wang, Jieping Ye, Paul M. Thompson, Jie Wang

Abstract: Genome-wide association studies (GWAS) have achieved great success in the genetic study of Alzheimer's disease (AD). Collaborative imaging genetics studies across different research institutions show the effectiveness of detecting genetic risk factors. However, the high dimensionality of GWAS data poses significant challenges in detecting risk SNPs for AD. Selecting relevant features is crucial in predicting the response variable. In this study, we propose a novel Distributed Feature Selection Framework (DFSF) to conduct the large-scale imaging genetics studies across multiple institutions. To speed up the learning process, we propose a family of distributed group Lasso screening rules to identify irrelevant features and remove them from the optimization. Then we select the relevant group features by performing the group Lasso feature selection process in a sequence of parameters. Finally, we employ the stability selection to rank the top risk SNPs that might help detect the early stage of AD. To the best of our knowledge, this is the first distributed feature selection model integrated with group Lasso feature selection as well as detecting the risk genetic factors across multiple research institutions system. Empirical studies are conducted on 809 subjects with 5.9 million SNPs which are distributed across several individual institutions, demonstrating the efficiency and effectiveness of the proposed method.

Submitted 26 April, 2017; originally announced April 2017.

arXiv:1703.00981 [pdf, other]

A Restaurant Process Mixture Model for Connectivity Based Parcellation of the Cortex

Authors: Daniel Moyer, Boris A Gutman, Neda Jahanshad, Paul M. Thompson

Abstract: One of the primary objectives of human brain mapping is the division of the cortical surface into functionally distinct regions, i.e. parcellation. While it is generally agreed that at macro-scale different regions of the cortex have different functions, the exact number and configuration of these regions is not known. Methods for the discovery of these regions are thus important, particularly as the volume of available information grows. Towards this end, we present a parcellation method based on a Bayesian non-parametric mixture model of cortical connectivity.

Submitted 2 March, 2017; originally announced March 2017.

Comments: In the Proceedings of Information Processing in Medical Imaging 2017

arXiv:1701.07847 [pdf, other]

Structural Connectome Validation Using Pairwise Classification

Authors: Dmitry Petrov, Boris Gutman, Alexander Ivanov, Joshua Faskowitz, Neda Jahanshad, Mikhail Belyaev, Paul Thompson

Abstract: In this work, we study the extent to which structural connectomes and topological derivative measures are unique to individual changes within human brains. To do so, we classify structural connectome pairs from two large longitudinal datasets as either belonging to the same individual or not. Our data is comprised of 227 individuals from the Alzheimer's Disease Neuroimaging Initiative (ADNI) and 226 from the Parkinson's Progression Markers Initiative (PPMI). We achieve 0.99 area under the ROC curve score for features which represent either weights or network structure of the connectomes (node degrees, PageRank and local efficiency). Our approach may be useful for eliminating noisy features as a preprocessing step in brain aging studies and early diagnosis classification problems.

Submitted 30 January, 2017; v1 submitted 26 January, 2017; originally announced January 2017.

Comments: Accepted for IEEE International Symposium on Biomedical Imaging 2017
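
The pairwise setup described in the abstract above (label each pair of scans as same-individual or not, then score with area under the ROC curve) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy data, the helper names `make_toy_scans` and `pairwise_auc`, and the use of a plain Euclidean distance as the pair score in place of whatever classifier and connectome features the authors actually used.

```python
import random


def pairwise_auc(features, subject_ids):
    """AUC for separating same-subject pairs from different-subject pairs.

    Each pair of scans is scored by the Euclidean distance between its
    feature vectors (an illustrative stand-in for a trained classifier).
    """
    same, diff = [], []
    n = len(features)
    for i in range(n):
        for j in range(i + 1, n):
            d = sum((a - b) ** 2 for a, b in zip(features[i], features[j])) ** 0.5
            (same if subject_ids[i] == subject_ids[j] else diff).append(d)
    # AUC = P(same-pair distance < different-pair distance); ties count half.
    wins = sum((s < d) + 0.5 * (s == d) for s in same for d in diff)
    return wins / (len(same) * len(diff))


def make_toy_scans(n_subjects=20, n_features=10, noise=0.1, seed=0):
    """Hypothetical data: two noisy 'scans' per subject around a subject-specific vector."""
    rng = random.Random(seed)
    features, ids = [], []
    for subject in range(n_subjects):
        base = [rng.gauss(0, 1) for _ in range(n_features)]
        for _ in range(2):
            features.append([b + rng.gauss(0, noise) for b in base])
            ids.append(subject)
    return features, ids


features, ids = make_toy_scans()
auc = pairwise_auc(features, ids)
print(f"toy pairwise AUC: {auc:.3f}")
```

On real connectomes the feature vectors would be edge weights or node-level measures such as degree; the toy noise level here is chosen only so that the sketch runs quickly and separates the two pair types.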

arXiv:1611.06197 [pdf, other]

An Empirical Study of Continuous Connectivity Degree Sequence Equivalents

Authors: Daniel Moyer, Boris A. Gutman, Joshua Faskowitz, Neda Jahanshad, Paul M. Thompson

Abstract: In the present work we demonstrate the use of a parcellation free connectivity model based on Poisson point processes. This model produces for each subject a continuous bivariate intensity function that represents for every possible pair of points the relative rate at which we observe tracts terminating at those points. We fit this model to explore degree sequence equivalents for spatial continuum graphs, and to investigate the local differences between estimated intensity functions for two different tractography methods. This is a companion paper to Moyer et al. (2016), where the model was originally defined.

Submitted 18 November, 2016; originally announced November 2016.

Comments: Presented at The MICCAI-BACON 16 Workshop (https://arxiv.longhoe.net/abs/1611.03363)

Report number: BACON/2016/04

arXiv:1610.03809 [pdf, other]

A Continuous Model of Cortical Connectivity

Authors: Daniel Moyer, Boris A. Gutman, Joshua Faskowitz, Neda Jahanshad, Paul M. Thompson

Abstract: We present a continuous model for structural brain connectivity based on the Poisson point process. The model treats each streamline curve in a tractography as an observed event in connectome space, here a product space of cortical white matter boundaries. We approximate the model parameter via kernel density estimation. To deal with the heavy computational burden, we develop a fast parameter estimation method by pre-computing associated Legendre products of the data, leveraging properties of the spherical heat kernel. We show how our approach can be used to assess the quality of cortical parcellations with respect to connectivity. We further present empirical results that suggest the discrete connectomes derived from our model have substantially higher test-retest reliability compared to standard methods.

Submitted 5 November, 2018; v1 submitted 12 October, 2016; originally announced October 2016.

Comments: Accepted at MICCAI 2016
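
The idea of estimating a Poisson point process intensity by kernel density estimation, as mentioned in the abstract above, can be shown on a toy problem. This is only a sketch under strong simplifying assumptions: events live on a one-dimensional line and are smoothed with a Gaussian kernel, whereas the paper works on a product of cortical surfaces with the spherical heat kernel. The function names, the bandwidth, and the event locations are all hypothetical.

```python
import math


def gaussian_kernel(u, bandwidth):
    """Gaussian density with standard deviation `bandwidth`, evaluated at u."""
    return math.exp(-0.5 * (u / bandwidth) ** 2) / (bandwidth * math.sqrt(2 * math.pi))


def intensity_estimate(x, events, bandwidth=0.1):
    """Kernel estimate of the intensity at x.

    The kernels are summed without dividing by the number of events, so the
    estimate integrates to the event count rather than to one.
    """
    return sum(gaussian_kernel(x - e, bandwidth) for e in events)


# Hypothetical event locations (e.g. streamline endpoints projected to a line).
events = [0.20, 0.25, 0.30, 0.70]

# Riemann sum over a grid with step 0.001 to check the total mass.
grid = [i / 1000 for i in range(-1000, 2001)]
total = sum(intensity_estimate(x, events) for x in grid) / 1000
print(f"integrated intensity: {total:.3f} (events observed: {len(events)})")
```

Leaving the sum unnormalized is the one design choice that distinguishes an intensity estimate from an ordinary density estimate: regions with more observed events get a proportionally higher expected count.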

arXiv:1608.07251 [pdf, other]

Large-scale Collaborative Imaging Genetics Studies of Risk Genetic Factors for Alzheimer's Disease Across Multiple Institutions

Authors: Qingyang Li, Tao Yang, Liang Zhan, Derrek Paul Hibar, Neda Jahanshad, Yalin Wang, Jieping Ye, Paul M. Thompson, Jie Wang

Abstract: Genome-wide association studies (GWAS) offer new opportunities to identify genetic risk factors for Alzheimer's disease (AD). Recently, collaborative efforts across different institutions emerged that enhance the power of many existing techniques on individual institution data. However, a major barrier to collaborative studies of GWAS is that many institutions need to preserve individual data privacy. To address this challenge, we propose a novel distributed framework, termed Local Query Model (LQM) to detect risk SNPs for AD across multiple research institutions. To accelerate the learning process, we propose a Distributed Enhanced Dual Polytope Projection (D-EDPP) screening rule to identify irrelevant features and remove them from the optimization. To the best of our knowledge, this is the first successful run of the computationally intensive model selection procedure to learn a consistent model across different institutions without compromising their privacy while ranking the SNPs that may collectively affect AD. Empirical studies are conducted on 809 subjects with 5.9 million SNP features which are distributed across three individual institutions. D-EDPP achieved a 66-fold speed-up by effectively identifying irrelevant features.

Submitted 19 August, 2016; originally announced August 2016.

Comments: Published on the 19th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI). 2016

arXiv:1510.05391 [pdf, other]

Unifying inference on brain network variations in neurological diseases: The Alzheimer's case

Authors: Daniele Durante, Madelaine Daianu, Neda Jahanshad, Paul M. Thompson, David B. Dunson

Abstract: There is growing interest in understanding how the structural interconnections among brain regions change with the occurrence of neurological diseases. Diffusion weighted MRI imaging has allowed researchers to non-invasively estimate a network of structural cortical connections made by white matter tracts, but current statistical methods for relating such networks to the presence or absence of a disease cannot exploit this rich network information. Standard practice considers each edge independently or summarizes the network with a few simple features. We enable dramatic gains in biological insight via a novel unifying methodology for inference on brain network variations associated to the occurrence of neurological diseases. The key of this approach is to define a probabilistic generative mechanism directly on the space of network configurations via dependent mixtures of low-rank factorizations, which efficiently exploit network information and allow the probability mass function for the brain network-valued random variable to vary flexibly across the group of patients characterized by a specific neurological disease and the one comprising age-matched cognitively healthy individuals.

Submitted 19 October, 2015; originally announced October 2015.

Showing 1–20 of 20 results for author: Jahanshad, N