Search | arXiv e-print repository

arXiv:2301.10772 [pdf]

Gene-SGAN: a method for discovering disease subtypes with imaging and genetic signatures via multi-view weakly-supervised deep clustering

Authors: Zhijian Yang, Junhao Wen, Ahmed Abdulkadir, Yuhan Cui, Guray Erus, Elizabeth Mamourian, Randa Melhem, Dhivya Srinivasan, Sindhuja T. Govindarajan, Jiong Chen, Mohamad Habes, Colin L. Masters, Paul Maruff, Jurgen Fripp, Luigi Ferrucci, Marilyn S. Albert, Sterling C. Johnson, John C. Morris, Pamela LaMontagne, Daniel S. Marcus, Tammie L. S. Benzinger, David A. Wolk, Li Shen, Jingxuan Bao, Susan M. Resnick, et al. (3 additional authors not shown)

Abstract: Disease heterogeneity has been a critical challenge for precision diagnosis and treatment, especially in neurologic and neuropsychiatric diseases. Many diseases can display multiple distinct brain phenotypes across individuals, potentially reflecting disease subtypes that can be captured using MRI and machine learning methods. However, biological interpretability and treatment relevance are limited if the derived subtypes are not associated with genetic drivers or susceptibility factors. Herein, we describe Gene-SGAN - a multi-view, weakly-supervised deep clustering method - which dissects disease heterogeneity by jointly considering phenotypic and genetic data, thereby conferring genetic correlations to the disease subtypes and associated endophenotypic signatures. We first validate the generalizability, interpretability, and robustness of Gene-SGAN in semi-synthetic experiments. We then demonstrate its application to real multi-site datasets from 28,858 individuals, deriving subtypes of Alzheimer's disease and brain endophenotypes associated with hypertension, from MRI and SNP data. Derived brain phenotypes displayed significant differences in neuroanatomical patterns, genetic determinants, biological and clinical biomarkers, indicating potentially distinct underlying neuropathologic processes, genetic drivers, and susceptibility factors. Overall, Gene-SGAN is broadly applicable to disease subtyping and endophenotype discovery, and is herein tested on disease-related, genetically-driven neuroimaging phenotypes.

Submitted 25 January, 2023; originally announced January 2023.

arXiv:2208.13939 [pdf, other]

Mediation analysis with densities as mediators with an application to iCOMPARE trial

Authors: Jingru Zhang, Mathias Basner, Christopher W. Jones, David F. Dinges, Haochang Shou, Hongzhe Li

Abstract: Physical activity has long been shown to be associated with biological and physiological performance and risk of diseases. It is of great interest to assess whether the effect of an exposure or intervention on an outcome is mediated through physical activity measured by modern wearable devices such as actigraphy. However, existing methods for mediation analysis focus almost exclusively on mediation variables in Euclidean space, and so cannot be applied directly to the actigraphy data of physical activity. Such data are best summarized in the form of a histogram or density. In this paper, we extend structural equation models (SEMs) to settings where a density is treated as the mediator, in order to study the indirect mediation effect of physical activity on an outcome. We provide sufficient conditions for identifying the average causal effects of the density mediator and present methods for estimating the direct and mediating effects of density on an outcome. We apply our method to the data set from the iCOMPARE trial, which compares flexible duty-hour policies and standard duty-hour policies on interns' sleep-related outcomes, to explore the mediation effect of physical activity on the causal path between flexible duty-hour policies and sleep-related outcomes.

Submitted 29 August, 2022; originally announced August 2022.

arXiv:2208.13936 [pdf, other]

Empirical Likelihood Inference of Variance Components in Linear Mixed-Effects Models

Authors: J. Zhang, W. Guo, J. S. Carpenter, Andrew Leroux, K. R. Merikangas, N. G. Martin, I. B. Hickie, H. Shou, H. Li

Abstract: Linear mixed-effects models are widely used in analyzing repeated measures data, including clustered and longitudinal data, where inferences of both fixed effects and variance components are of importance. Unlike fixed effect inference, which has been well studied, inference on the variance components is more challenging because the null value lies on the boundary and because of the nuisance parameters of the fixed effects. Existing methods often require strong distributional assumptions on the random effects and random errors. In this paper, we develop empirical likelihood-based methods for the inference of the variance components in the presence of fixed effects. A nonparametric version of Wilks' theorem for the proposed empirical likelihood ratio statistics for variance components is derived. We also develop an empirical likelihood test for multiple variance components related to a sequence of correlated outcomes. Simulation studies demonstrate that the proposed methods exhibit better type I error control than the commonly used likelihood ratio tests when the Gaussian distributional assumptions of the random effects are violated. We apply the methods to investigate the heritability of physical activity as measured by a wearable device in the Australian Twin study and observe that such activity is heritable only in the quantile range from 0.375 to 0.514.

Submitted 29 August, 2022; originally announced August 2022.

arXiv:2110.11347 [pdf]

Multidimensional representations in late-life depression: convergence in neuroimaging, cognition, clinical symptomatology and genetics

Authors: Junhao Wen, Cynthia H. Y. Fu, Duygu Tosun, Yogasudha Veturi, Zhijian Yang, Ahmed Abdulkadir, Elizabeth Mamourian, Dhivya Srinivasan, Jingxuan Bao, Guray Erus, Haochang Shou, Mohamad Habes, Jimit Doshi, Erdem Varol, Scott R Mackin, Aristeidis Sotiras, Yong Fan, Andrew J. Saykin, Yvette I. Sheline, Li Shen, Marylyn D. Ritchie, David A. Wolk, Marilyn Albert, Susan M. Resnick, Christos Davatzikos

Abstract: Late-life depression (LLD) is characterized by considerable heterogeneity in clinical manifestation. Unraveling such heterogeneity would aid in elucidating etiological mechanisms and pave the road to precision and individualized medicine. We sought to delineate, cross-sectionally and longitudinally, disease-related heterogeneity in LLD linked to neuroanatomy, cognitive functioning, clinical symptomatology, and genetic profiles. Multimodal data from a multicentre sample (N=996) were analyzed. A semi-supervised clustering method (HYDRA) was applied to regional grey matter (GM) brain volumes to derive dimensional representations. Two dimensions were identified, which accounted for the LLD-related heterogeneity in voxel-wise GM maps, white matter (WM) fractional anisotropy (FA), neurocognitive functioning, clinical phenotype, and genetics. Dimension one (Dim1) demonstrated relatively preserved brain anatomy without WM disruptions relative to healthy controls. In contrast, dimension two (Dim2) showed widespread brain atrophy and WM integrity disruptions, along with cognitive impairment and higher depression severity. Moreover, one de novo independent genetic variant (rs13120336) was significantly associated with Dim1 but not with Dim2. Notably, the two dimensions demonstrated significant SNP-based heritability of 18-27% within the general population (N=12,518 in UKBB).
Lastly, in a subset of individuals having longitudinal measurements, Dim2 demonstrated a more rapid longitudinal decrease in GM and brain age, and was more likely to progress to Alzheimer's disease, compared to Dim1 (N=1,413 participants and 7,225 scans from ADNI, BLSA, and BIOCARD datasets).

Submitted 25 October, 2021; v1 submitted 20 October, 2021; originally announced October 2021.

arXiv:2109.03723 [pdf]

Disentangling Alzheimer's disease neurodegeneration from typical brain aging using machine learning

Authors: Gyujoon Hwang, Ahmed Abdulkadir, Guray Erus, Mohamad Habes, Raymond Pomponio, Haochang Shou, Jimit Doshi, Elizabeth Mamourian, Tanweer Rashid, Murat Bilgel, Yong Fan, Aristeidis Sotiras, Dhivya Srinivasan, John C. Morris, Daniel Marcus, Marilyn S. Albert, Nick R. Bryan, Susan M. Resnick, Ilya M. Nasrallah, Christos Davatzikos, David A. Wolk

Abstract: Neuroimaging biomarkers that distinguish between typical brain aging and Alzheimer's disease (AD) are valuable for determining how much each contributes to cognitive decline. Machine learning models can derive multi-variate brain change patterns related to the two processes, including the SPARE-AD (Spatial Patterns of Atrophy for Recognition of Alzheimer's Disease) and SPARE-BA (of Brain Aging) investigated herein. However, substantial overlap between brain regions affected in the two processes confounds measuring them independently. We present a methodology toward disentangling the two. T1-weighted MRI images of 4,054 participants (48-95 years) with AD, mild cognitive impairment (MCI), or cognitively normal (CN) diagnoses from the iSTAGING (Imaging-based coordinate SysTem for AGIng and NeurodeGenerative diseases) consortium were analyzed. First, a subset of AD patients and CN adults were selected based purely on clinical diagnoses to train SPARE-BA1 (regression of age using CN individuals) and SPARE-AD1 (classification of CN versus AD). Second, analogous groups were selected based on clinical and molecular markers to train SPARE-BA2 and SPARE-AD2: amyloid-positive (A+) AD continuum group (consisting of A+AD, A+MCI, and A+ and tau-positive CN individuals) and amyloid-negative (A-) CN group. Finally, the combined group of the AD continuum and A-/CN individuals was used to train SPARE-BA3, with the intention to estimate brain age regardless of AD-related brain changes.
Disentangled SPARE models derived brain patterns that were more specific to the two types of brain changes. Correlation between the SPARE-BA and SPARE-AD was significantly reduced. Correlation of disentangled SPARE-AD with the molecular measurements and with the number of APOE4 alleles was non-inferior, but its correlation with AD-related psychometric test scores was lower, suggesting a contribution of advanced brain aging to these scores.

Submitted 8 September, 2021; originally announced September 2021.

Comments: 4 figures, 3 tables

arXiv:2106.12768 [pdf, other]

Two-sample tests for repeated measurements of histogram objects with applications to wearable device data

Authors: Jingru Zhang, Kathleen R. Merikangas, Hongzhe Li, Haochang Shou

Abstract: Repeated observations have become increasingly common in biomedical research and longitudinal studies. For instance, wearable sensor devices are deployed to continuously track physiological and biological signals from each individual over multiple days. It remains of great interest to appropriately evaluate how the daily distribution of biosignals might differ across disease groups and demographics. Hence these data could be formulated as multivariate complex object data such as probability densities, histograms, and observations on a tree. Traditional statistical methods often fail to apply because such data are sampled from an arbitrary non-Euclidean metric space. In this paper, we propose novel non-parametric graph-based two-sample tests for object data with repeated measures. A set of test statistics is proposed to capture various possible alternatives. We derive their asymptotic null distributions under the permutation null. These tests exhibit substantial power improvements over the existing methods while controlling the type I errors under finite samples as shown through simulation studies. The proposed tests are demonstrated to provide additional insights on the location, inter- and intra-individual variability of the daily physical activity distributions in a sample of studies for mood disorders.

Submitted 24 June, 2021; originally announced June 2021.

arXiv:2102.12582 [pdf]

Disentangling brain heterogeneity via semi-supervised deep-learning and MRI: dimensional representations of Alzheimer's Disease

Authors: Zhijian Yang, Ilya M. Nasrallah, Haochang Shou, Junhao Wen, Jimit Doshi, Mohamad Habes, Guray Erus, Ahmed Abdulkadir, Susan M. Resnick, David Wolk, Christos Davatzikos

Abstract: Heterogeneity of brain diseases is a challenge for precision diagnosis/prognosis. We describe and validate Smile-GAN (SeMI-supervised cLustEring-Generative Adversarial Network), a novel semi-supervised deep-clustering method, which dissects neuroanatomical heterogeneity, enabling identification of disease subtypes via their imaging signatures relative to controls. When applied to MRIs (2 studies; 2,832 participants; 8,146 scans) including cognitively normal individuals and those with cognitive impairment and dementia, Smile-GAN identified 4 neurodegenerative patterns/axes: P1, normal anatomy and highest cognitive performance; P2, mild/diffuse atrophy and more prominent executive dysfunction; P3, focal medial temporal atrophy and relatively greater memory impairment; P4, advanced neurodegeneration. Further application to longitudinal data revealed two distinct progression pathways: P1$\rightarrow$P2$\rightarrow$P4 and P1$\rightarrow$P3$\rightarrow$P4. Baseline expression of these patterns predicted the pathway and rate of future neurodegeneration. Pattern expression offered better yet complementary performance in predicting clinical progression, compared to amyloid/tau. These deep-learning derived biomarkers offer promise for precision diagnostics and targeted clinical trial recruitment.

Submitted 24 February, 2021; originally announced February 2021.

Comments: 37 pages, 11 figures

arXiv:2010.05355 [pdf]

Medical Image Harmonization Using Deep Learning Based Canonical Mapping: Toward Robust and Generalizable Learning in Imaging

Authors: Vishnu M. Bashyam, Jimit Doshi, Guray Erus, Dhivya Srinivasan, Ahmed Abdulkadir, Mohamad Habes, Yong Fan, Colin L. Masters, Paul Maruff, Chuanjun Zhuo, Henry Völzke, Sterling C. Johnson, Jurgen Fripp, Nikolaos Koutsouleris, Theodore D. Satterthwaite, Daniel H. Wolf, Raquel E. Gur, Ruben C. Gur, John C. Morris, Marilyn S. Albert, Hans J. Grabe, Susan M. Resnick, R. Nick Bryan, David A. Wolk, Haochang Shou, et al. (2 additional authors not shown)

Abstract: Conventional and deep learning-based methods have shown great potential in the medical imaging domain, as means for deriving diagnostic, prognostic, and predictive biomarkers, and by contributing to precision medicine. However, these methods have yet to see widespread clinical adoption, in part due to limited generalization performance across various imaging devices, acquisition protocols, and patient populations. In this work, we propose a new paradigm in which data from a diverse range of acquisition conditions are "harmonized" to a common reference domain, where accurate model learning and prediction can take place. By learning an unsupervised image-to-image canonical mapping from diverse datasets to a reference domain using generative deep learning models, we aim to reduce confounding data variation while preserving semantic information, thereby rendering the learning task easier in the reference domain. We test this approach on two example problems, namely MRI-based brain age prediction and classification of schizophrenia, leveraging pooled cohorts of neuroimaging MRI data spanning 9 sites and 9701 subjects. Our results indicate a substantial improvement in these tasks in out-of-sample data, even when training is restricted to a single site.

Submitted 11 October, 2020; originally announced October 2020.

arXiv:2001.04968 [pdf, other]

Graph-Fused Multivariate Regression via Total Variation Regularization

Authors: Ying Liu, Bowei Yan, Kathleen Merikangas, Haochang Shou

Abstract: In this paper, we propose the Graph-Fused Multivariate Regression (GFMR) via Total Variation regularization, a novel method for estimating the association between a one-dimensional or multidimensional array outcome and scalar predictors. While we were motivated by data from neuroimaging and physical activity tracking, the methodology is designed and presented in a generalizable format and is applicable to many other areas of scientific research. The estimator is the solution of a penalized regression problem where the objective is the sum of squared errors plus a total variation (TV) regularization on the predicted mean across all subjects. We propose an algorithm for parameter estimation, which is efficient and scalable on a distributed computing platform. Proof of the algorithm convergence is provided, and the statistical consistency of the estimator is presented via an oracle inequality. We present 1D and 2D simulation results and demonstrate that GFMR outperforms existing methods in most cases. We also demonstrate the general applicability of the method by two real data examples, including the analysis of the 1D accelerometry subsample of a large community-based study for mood disorders and the analysis of the 3D MRI data from the attention-deficit/hyperactivity disorder (ADHD) 200 consortium.

Submitted 14 January, 2020; originally announced January 2020.

arXiv:1703.05264 [pdf, other]

Total Variation Regularized Tensor-on-scalar Regression

Authors: Ying Liu, Bowei Yan, Kathleen Merikangas, Haochang Shou

Abstract: In this paper, we propose Total Variation Regularized Tensor-on-scalar Regression (TVTR), a novel method for estimating the association between a tensor outcome (a one-dimensional or multidimensional array) and scalar predictors. While the statistical developments proposed here were motivated by brain mapping and activity tracking, the methodology is designed and presented in generality and is applicable to many other areas of scientific research. The estimator is the solution of a penalized regression problem where the objective is the sum of squared errors plus a total variation (TV) regularization on the predicted mean across all subjects. We propose an algorithm for the parameter estimation, which is efficient and scalable on a distributed computing platform. Proof of the algorithm convergence is provided and the statistical consistency of the estimator is presented via an oracle inequality. We present 1D and 2D simulation results, and demonstrate that TVTR outperforms existing methods in most cases. We also demonstrate the general applicability of the method by two real data examples including the analysis of the 1D accelerometry subsample of a large community-based study for mood disorders and the analysis of the 3D MRI data from the attention-deficit/hyperactivity disorder (ADHD) 200 consortium.

Submitted 9 December, 2018; v1 submitted 15 March, 2017; originally announced March 2017.

Comments: 43 pages, 5 figures

arXiv:1509.05279 [pdf, other]

Subcritical behavior for quasi-periodic Schrödinger cocycles with trigonometric potentials

Authors: C. A. Marx, L. H. Shou, J. L. Wellens

Abstract: We give a criterion implying subcritical behavior for quasi-periodic Schrödinger operators where the potential sampling function is given by a trigonometric polynomial. Subcritical behavior, in the sense of Avila's global theory, is known to imply purely absolutely continuous spectrum for all irrational frequencies and all phases.

Submitted 31 October, 2015; v1 submitted 17 September, 2015; originally announced September 2015.

Comments: to appear in the Journal of Spectral Theory

arXiv:1501.04420 [pdf, ps, other]

doi 10.1214/14-AOAS748

Longitudinal high-dimensional principal components analysis with application to diffusion tensor imaging of multiple sclerosis

Authors: Vadim Zipunnikov, Sonja Greven, Haochang Shou, Brian S. Caffo, Daniel S. Reich, Ciprian M. Crainiceanu

Abstract: We develop a flexible framework for modeling high-dimensional imaging data observed longitudinally. The approach decomposes the observed variability of repeatedly measured high-dimensional observations into three additive components: a subject-specific imaging random intercept that quantifies the cross-sectional variability, a subject-specific imaging slope that quantifies the dynamic irreversible deformation over multiple realizations, and a subject-visit-specific imaging deviation that quantifies exchangeable effects between visits. The proposed method is very fast, scalable to studies including ultrahigh-dimensional data, and can easily be adapted to and executed on modest computing infrastructures. The method is applied to the longitudinal analysis of diffusion tensor imaging (DTI) data of the corpus callosum of multiple sclerosis (MS) subjects. The study includes $176$ subjects observed at $466$ visits. For each subject and visit the study contains a registered DTI scan of the corpus callosum at roughly 30,000 voxels.

Submitted 19 January, 2015; originally announced January 2015.

Comments: Published at http://dx.doi.org/10.1214/14-AOAS748 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOAS-AOAS748

Journal ref: Annals of Applied Statistics 2014, Vol. 8, No. 4, 2175-2202

arXiv:1409.5450 [pdf, other]

Improving Reliability of Subject-Level Resting-State fMRI Parcellation with Shrinkage Estimators

Authors: Amanda F. Mejia, Mary Beth Nebel, Haochang Shou, Ciprian M. Crainiceanu, James J. Pekar, Stewart Mostofsky, Brian Caffo, Martin A. Lindquist

Abstract: A recent interest in resting state functional magnetic resonance imaging (rsfMRI) lies in subdividing the human brain into anatomically and functionally distinct regions of interest. For example, brain parcellation is often used for defining the network nodes in connectivity studies. While inference has traditionally been performed on group-level data, there is a growing interest in parcellating single subject data. However, this is difficult due to the low signal-to-noise ratio of rsfMRI data, combined with typically short scan lengths. A large number of brain parcellation approaches employ clustering, which begins with a measure of similarity or distance between voxels. The goal of this work is to improve the reproducibility of single-subject parcellation using shrinkage estimators of such measures, allowing the noisy subject-specific estimator to "borrow strength" in a principled manner from a larger population of subjects. We present several empirical Bayes shrinkage estimators and outline methods for shrinkage when multiple scans are not available for each subject. We perform shrinkage on raw intervoxel correlation estimates and use both raw and shrinkage estimates to produce parcellations by performing clustering on the voxels. Our proposed method is agnostic to the choice of clustering method and can be used as a pre-processing step for any clustering algorithm.
Using two datasets (a simulated dataset where the true parcellation is known and is subject-specific, and a test-retest dataset consisting of two 7-minute rsfMRI scans from 20 subjects), we show that parcellations produced from shrinkage correlation estimates have higher reliability and validity than those produced from raw estimates. Application to test-retest data shows that using shrinkage estimators increases the reproducibility of subject-specific parcellations of the motor cortex by up to 30%.

Submitted 28 October, 2015; v1 submitted 18 September, 2014; originally announced September 2014.

Comments: body 21 pages, 11 figures

arXiv:1306.5524 [pdf, other]

Soft Null Hypotheses: A Case Study of Image Enhancement Detection in Brain Lesions

Authors: Haochang Shou, Russell T. Shinohara, Han Liu, Daniel S. Reich, Ciprian M. Crainiceanu

Abstract: This work is motivated by a study of a population of multiple sclerosis (MS) patients using dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) to identify active brain lesions. At each visit, a contrast agent is administered intravenously to a subject and a series of images is acquired to reveal the location and activity of MS lesions within the brain. Our goal is to identify and quantify lesion enhancement location at the subject level and lesion enhancement patterns at the population level. With this example, we aim to address the difficult problem of transforming a qualitative scientific null hypothesis, such as "this voxel does not enhance", to a well-defined and numerically testable null hypothesis based on existing data. We call the procedure "soft null hypothesis" testing as opposed to the standard "hard null hypothesis" testing. This problem is fundamentally different from: 1) testing when a quantitative null hypothesis is given; 2) clustering using a mixture distribution; or 3) identifying a reasonable threshold with a parametric null assumption. We analyze a total of 20 subjects scanned at 63 visits (~30Gb), the largest population of such clinical brain images.

Submitted 24 June, 2013; originally announced June 2013.

arXiv:1304.6783 [pdf, other]

Structured Functional Principal Component Analysis

Authors: Haochang Shou, Vadim Zipunnikov, Ciprian M. Crainiceanu, Sonja Greven

Abstract: Motivated by modern observational studies, we introduce a class of functional models that expands nested and crossed designs. These models account for the natural inheritance of correlation structure from sampling design in studies where the fundamental sampling unit is a function or image. Inference is based on functional quadratics and their relationship with the underlying covariance structure of the latent processes. A computationally fast and scalable estimation procedure is developed for ultra-high dimensional data. Methods are illustrated in three examples: high-frequency accelerometer data for daily activity, pitch linguistic data for phonetic analysis, and EEG data for studying electrical brain activity during sleep.

Submitted 24 April, 2013; originally announced April 2013.

MSC Class: 97K80

Showing 1–15 of 15 results for author: Shou, H