-
Simple and Scalable Algorithms for Cluster-Aware Precision Medicine
Authors:
Amanda M. Buch,
Conor Liston,
Logan Grosenick
Abstract:
AI-enabled precision medicine promises a transformational improvement in healthcare outcomes by enabling data-driven personalized diagnosis, prognosis, and treatment. However, the well-known "curse of dimensionality" and the clustered structure of biomedical data together interact to present a joint challenge in the high dimensional, limited observation precision medicine regime. To overcome both issues simultaneously we propose a simple and scalable approach to joint clustering and embedding that combines standard embedding methods with a convex clustering penalty in a modular way. This novel, cluster-aware embedding approach overcomes the complexity and limitations of current joint embedding and clustering methods, which we show with straightforward implementations of hierarchically clustered principal component analysis (PCA), locally linear embedding (LLE), and canonical correlation analysis (CCA). Through both numerical experiments and real-world examples, we demonstrate that our approach outperforms traditional and contemporary clustering methods on highly underdetermined problems (e.g., with just tens of observations) as well as on large sample datasets. Importantly, our approach does not require the user to choose the desired number of clusters, but instead yields interpretable dendrograms of hierarchically clustered embeddings. Thus our approach improves significantly on existing methods for identifying patient subgroups in multiomics and neuroimaging data, enabling scalable and interpretable biomarkers for precision medicine.
Submitted 17 May, 2023; v1 submitted 29 November, 2022;
originally announced November 2022.
-
Self-tracking Energy Transfer for Neural Stimulation in Untethered Mice
Authors:
John S. Ho,
Yuji Tanabe,
Shrivats Mohan Iyer,
Amelia J. Christensen,
Logan Grosenick,
Karl Deisseroth,
Scott L. Delp,
Ada S. Y. Poon
Abstract:
Optical or electrical stimulation of neural circuits in mice during natural behavior is an important paradigm for studying brain function. Conventional systems for optogenetics and electrical microstimulation require tethers or large head-mounted devices that disrupt animal behavior. We report a method for wireless powering of small-scale implanted devices based on the strong localization of energy that occurs during resonant interaction between a radio-frequency cavity and intrinsic modes in mice. The system features self-tracking over a wide (16 cm diameter) operational area, and is used to demonstrate wireless activation of cortical neurons with miniaturized stimulators (10 mm$^{3}$, 20 mg) fully implanted under the skin.
Submitted 4 March, 2015;
originally announced March 2015.
-
The Open Connectome Project Data Cluster: Scalable Analysis and Vision for High-Throughput Neuroscience
Authors:
Randal Burns,
William Gray Roncal,
Dean Kleissas,
Kunal Lillaney,
Priya Manavalan,
Eric Perlman,
Daniel R. Berger,
Davi D. Bock,
Kwanghun Chung,
Logan Grosenick,
Narayanan Kasthuri,
Nicholas C. Weiler,
Karl Deisseroth,
Michael Kazhdan,
Jeff Lichtman,
R. Clay Reid,
Stephen J. Smith,
Alexander S. Szalay,
Joshua T. Vogelstein,
R. Jacob Vogelstein
Abstract:
We describe a scalable database cluster for the spatial analysis and annotation of high-throughput brain imaging data, initially for 3-d electron microscopy image stacks, but for time-series and multi-channel data as well. The system was designed primarily for workloads that build connectomes---neural connectivity maps of the brain---using the parallel execution of computer vision algorithms on high-performance compute clusters. These services and open-science data sets are publicly available at http://openconnecto.me.
The system design inherits much from NoSQL scale-out and data-intensive computing architectures. We distribute data to cluster nodes by partitioning a spatial index. We direct I/O to different systems---reads to parallel disk arrays and writes to solid-state storage---to avoid I/O interference and maximize throughput. All programming interfaces are RESTful Web services, which are simple and stateless, improving scalability and usability. We include a performance evaluation of the production system, highlighting the effectiveness of spatial data organization.
Submitted 18 June, 2013; v1 submitted 14 June, 2013;
originally announced June 2013.
-
Whole-brain Prediction Analysis with GraphNet
Authors:
Logan Grosenick,
Brad Klingenberg,
Kiefer Katovich,
Brian Knutson,
Jonathan E. Taylor
Abstract:
Multivariate machine learning methods are increasingly used to analyze neuroimaging data, often replacing more traditional "mass univariate" techniques that fit data one voxel at a time. In the functional magnetic resonance imaging (fMRI) literature, this has led to broad application of "off-the-shelf" classification and regression methods. These generic approaches allow investigators to use ready-made algorithms to accurately decode perceptual, cognitive, or behavioral states from distributed patterns of neural activity. However, when applied to correlated whole-brain fMRI data these methods suffer from coefficient instability, are sensitive to outliers, and yield dense solutions that are hard to interpret without arbitrary thresholding. Here, we develop variants of the Graph-constrained Elastic Net (GraphNet), ..., we (1) extend GraphNet to include robust loss functions that confer insensitivity to outliers, (2) equip them with "adaptive" penalties that asymptotically guarantee correct variable selection, and (3) develop a novel sparse structured Support Vector GraphNet classifier (SVGN). When applied to previously published data, these efficient whole-brain methods significantly improved classification accuracy over previously reported VOI-based analyses on the same data while discovering task-related regions not documented in the original VOI approach. Critically, GraphNet estimates generalize well to out-of-sample data collected more than three years later on the same task but with different subjects and stimuli. By enabling robust and efficient selection of important voxels from whole-brain data taken over multiple time points (>100,000 "features"), these methods enable data-driven selection of brain areas that accurately predict single-trial behavior within and across individuals.
Submitted 26 December, 2012; v1 submitted 18 October, 2011;
originally announced October 2011.
-
A Generalized Least Squares Matrix Decomposition
Authors:
Genevera I. Allen,
Logan Grosenick,
Jonathan Taylor
Abstract:
Variables in many massive high-dimensional data sets are structured, arising for example from measurements on a regular grid as in imaging and time series or from spatial-temporal measurements as in climate studies. Classical multivariate techniques ignore these structural relationships, often resulting in poor performance. We propose a generalization of the singular value decomposition (SVD) and principal components analysis (PCA) that is appropriate for massive data sets with structured variables or known two-way dependencies. By finding the best low rank approximation of the data with respect to a transposable quadratic norm, our decomposition, entitled the Generalized least squares Matrix Decomposition (GMD), directly accounts for structural relationships. As many variables in high-dimensional settings are often irrelevant or noisy, we also regularize our matrix decomposition by adding two-way penalties to encourage sparsity or smoothness. We develop fast computational algorithms using our methods to perform generalized PCA (GPCA), sparse GPCA, and functional GPCA on massive data sets. Through simulations and a whole brain functional MRI example we demonstrate the utility of our methodology for dimension reduction, signal recovery, and feature selection with high-dimensional structured data.
Submitted 13 March, 2012; v1 submitted 15 February, 2011;
originally announced February 2011.