-
$\texttt{Davos}$: a Python "smuggler" for constructing lightweight reproducible notebooks
Authors:
Paxton C. Fitzpatrick,
Jeremy R. Manning
Abstract:
Reproducibility is a core requirement of modern scientific research. For computational research, reproducibility means that code should produce the same results, even when run on different systems. A standard approach to ensuring reproducibility entails packaging a project's dependencies along with its primary code base. Existing solutions vary in how deeply these dependencies are specified, ranging from virtual environments, to containers, to virtual machines. Each of these existing solutions requires installing or setting up a system for running the desired code, increasing the complexity and time cost of sharing or engaging with reproducible science. Here, we propose a lighter-weight solution: the $\texttt{Davos}$ package. When used in combination with a notebook-based Python project, $\texttt{Davos}$ provides a mechanism for specifying the correct versions of the project's dependencies directly within the code that requires them, and automatically installing them in an isolated environment when the code is run. The $\texttt{Davos}$ package further ensures that those packages and specific versions are used every time the notebook's code is executed. This enables researchers to share a complete reproducible copy of their code within a single Jupyter notebook file.
Submitted 1 October, 2023; v1 submitted 23 November, 2022;
originally announced November 2022.
-
Identifying stimulus-driven neural activity patterns in multi-patient intracranial recordings
Authors:
Jeremy R. Manning
Abstract:
Identifying stimulus-driven neural activity patterns is critical for studying the neural basis of cognition. This can be particularly challenging in intracranial datasets, where electrode locations typically vary across patients. This chapter first presents an overview of the major challenges to identifying stimulus-driven neural activity patterns in the general case. Next, we will review several modality-specific considerations and approaches, along with a discussion of several issues that are particular to intracranial recordings. Against this backdrop, we will consider a variety of within-subject and across-subject approaches to identifying and modeling stimulus-driven neural activity patterns in multi-patient intracranial recordings. These approaches include generalized linear models, multivariate pattern analysis, representational similarity analysis, joint stimulus-activity models, hierarchical matrix factorization models, Gaussian process models, geometric alignment models, inter-subject correlations, and inter-subject functional correlations. Examples from the recent literature serve to illustrate the major concepts and provide the conceptual intuitions for each approach.
Submitted 3 February, 2022;
originally announced February 2022.
-
Enabling Factor Analysis on Thousand-Subject Neuroimaging Datasets
Authors:
Michael J. Anderson,
Mihai Capotă,
Javier S. Turek,
Xia Zhu,
Theodore L. Willke,
Yida Wang,
Po-Hsuan Chen,
Jeremy R. Manning,
Peter J. Ramadge,
Kenneth A. Norman
Abstract:
The scale of functional magnetic resonance image data is rapidly increasing as large multi-subject datasets are becoming widely available and high-resolution scanners are adopted. The inherent low-dimensionality of the information in this data has led neuroscientists to consider factor analysis methods to extract and analyze the underlying brain activity. In this work, we consider two recent multi-subject factor analysis methods: the Shared Response Model and Hierarchical Topographic Factor Analysis. We perform analytical, algorithmic, and code optimizations to enable multi-node parallel implementations to scale. Single-node improvements result in 99x and 1812x speedups on these two methods, and enable the processing of larger datasets. Our distributed implementations show strong scaling of 3.3x and 5.5x, respectively, with 20 nodes on real datasets. We also demonstrate weak scaling on a synthetic dataset with 1024 subjects, on up to 1024 nodes and 32,768 cores.
Submitted 17 August, 2016; v1 submitted 16 August, 2016;
originally announced August 2016.