Evaluation of data imputation strategies in complex, deeply-phenotyped data sets: the case of the EU-AIMS Longitudinal European Autism Project
Authors:
A. Llera,
M. Brammer,
B. Oakley,
J. Tillmann,
M. Zabihi,
T. Mei,
T. Charman,
C. Ecker,
F. Dell Acqua,
T. Banaschewski,
C. Moessnang,
S. Baron-Cohen,
R. Holt,
S. Durston,
D. Murphy,
E. Loth,
J. K. Buitelaar,
D. L. Floris,
C. F. Beckmann
Abstract:
An increasing number of large-scale multi-modal research initiatives have been conducted in the typically developing population, as well as in psychiatric cohorts. Missing data is a common problem in such datasets due to the difficulty of assessing multiple measures on a large number of participants. The consequences of missing data accumulate when researchers aim to explore relationships between multiple measures. Here we aim to evaluate different imputation strategies to fill in missing values in clinical data from a large (total N=764) and deeply characterised (i.e. a range of clinical and cognitive instruments administered) sample of N=453 autistic individuals and N=311 control individuals recruited as part of the EU-AIMS Longitudinal European Autism Project (LEAP) consortium. In particular, we consider a total of 160 clinical measures divided into 15 overlapping subsets of participants. We use two simple but common univariate strategies, mean and median imputation, as well as a Round Robin regression approach involving four independent multivariate regression models: a linear model (Bayesian Ridge regression) and several non-linear models (Decision Trees, Extra Trees and K-Neighbours regression). We evaluate the models using the traditional mean square error on available data that were deliberately removed, and additionally consider the KL divergence between the observed and the imputed distributions. We show that all of the multivariate approaches tested provide a substantial improvement compared to typical univariate approaches. Further, our analyses reveal that across all 15 data subsets tested, an Extra Trees regression approach provided the best global results. This allows the selection of a single model to impute missing data for the LEAP project and to deliver a fixed set of imputed clinical data to be used by researchers working with the LEAP dataset in the future.
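The evaluation protocol described in this abstract (hide some values that were actually observed, impute them, then score the imputations with mean square error and KL divergence) might be sketched roughly as follows. This is a minimal illustration covering only the univariate mean and median baselines on synthetic data, not the authors' pipeline: the function names, the 20% masking fraction, the 10-bin histogram and the smoothing constant `eps` are all assumptions made for the example.

```python
import math
import random
import statistics


def mask_observed(values, fraction, rng):
    """Hide a random fraction of the observed (non-None) entries.

    Returns the masked copy and the indices that were hidden.
    """
    observed = [i for i, v in enumerate(values) if v is not None]
    hidden = rng.sample(observed, max(1, int(fraction * len(observed))))
    masked = list(values)
    for i in hidden:
        masked[i] = None
    return masked, hidden


def impute_constant(values, statistic):
    """Fill every missing entry with one statistic of the observed entries."""
    fill = statistic([v for v in values if v is not None])
    return [fill if v is None else v for v in values]


def mse(truth, imputed, indices):
    """Mean square error restricted to the deliberately hidden entries."""
    return sum((truth[i] - imputed[i]) ** 2 for i in indices) / len(indices)


def histogram(values, lo, hi, bins=10, eps=1e-9):
    """Normalised histogram on [lo, hi]; eps keeps every bin strictly positive."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    probs = [c / len(values) + eps for c in counts]
    total = sum(probs)
    return [p / total for p in probs]


def kl_divergence(p, q):
    """Discrete KL divergence KL(p || q) for two strictly positive histograms."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Synthetic, skewed column with some genuinely missing entries (assumed data).
rng = random.Random(0)
column = [rng.expovariate(1.0) for _ in range(200)]
for i in rng.sample(range(200), 20):
    column[i] = None

# Hide 20% of the observed entries so that the ground truth is known.
masked, hidden = mask_observed(column, 0.2, rng)
observed_values = [v for v in column if v is not None]
lo, hi = min(observed_values), max(observed_values)
truth_hist = histogram([column[i] for i in hidden], lo, hi)

results = {}
for name, statistic in [("mean", statistics.mean), ("median", statistics.median)]:
    imputed = impute_constant(masked, statistic)
    imputed_hist = histogram([imputed[i] for i in hidden], lo, hi)
    results[name] = (
        mse(column, imputed, hidden),
        kl_divergence(truth_hist, imputed_hist),
    )
    print(name, results[name])
```

A constant fill places all imputed values in a single histogram bin, so its KL divergence from the observed distribution tends to be large even when its mean square error looks acceptable. That is one plausible reason to report both metrics; a multivariate imputer could be scored with the same two functions.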
Submitted 20 January, 2022;
originally announced January 2022.
Disentangling causal webs in the brain using functional Magnetic Resonance Imaging: A review of current approaches
Authors:
Natalia Z. Bielczyk,
Sebo Uithol,
Tim van Mourik,
Paul Anderson,
Jeffrey C. Glennon,
Jan K. Buitelaar
Abstract:
In the past two decades, functional Magnetic Resonance Imaging has been used to relate neuronal network activity to cognitive processing and behaviour. Recently, this approach has been augmented by algorithms that allow us to infer causal links between component populations of neuronal networks. Multiple inference procedures have been proposed to approach this research question, but so far each method has limitations when it comes to establishing whole-brain connectivity patterns. In this work, we discuss eight ways to infer causality in fMRI research: Bayesian Nets, Dynamical Causal Modelling, Granger Causality, Likelihood Ratios, LiNGAM, Patel's Tau, Structural Equation Modelling, and Transfer Entropy. We conclude by formulating some recommendations for future directions in this area.
Submitted 30 May, 2019; v1 submitted 14 August, 2017;
originally announced August 2017.