-
Estimation of Parameter Distributions for Reaction-Diffusion Equations with Competition using Aggregate Spatiotemporal Data
Authors:
Kyle Nguyen,
Erica M. Rutter,
Kevin Flores
Abstract:
Reaction-diffusion equations have been used to model a wide range of biological phenomena related to population spread and proliferation, from ecology to cancer. It is commonly assumed that individuals in a population have homogeneous diffusion and growth rates; however, this assumption can be inaccurate when the population is intrinsically divided into many distinct subpopulations that compete with each other. In previous work, the task of inferring the degree of phenotypic heterogeneity between subpopulations from total population density has been performed within a framework that combines parameter distribution estimation with reaction-diffusion models. Here, we extend this approach so that it is compatible with reaction-diffusion models that include competition between subpopulations. We use a reaction-diffusion model of glioblastoma multiforme, an aggressive type of brain cancer, to test our approach on simulated data that are similar to measurements that could be collected in practice. We use the Prokhorov metric framework and convert the reaction-diffusion model to a random differential equation model to estimate joint distributions of diffusion and growth rates among heterogeneous subpopulations. We then compare the performance of the new random differential equation model against that of other partial differential equation models. We find that the random differential equation model is better at predicting cell density than the other models while also being more time-efficient. Finally, we use $k$-means clustering to predict the number of subpopulations based on the recovered distributions.
Submitted 13 April, 2023;
originally announced April 2023.
-
Few-Shot Learning Enables Population-Scale Analysis of Leaf Traits in Populus trichocarpa
Authors:
John Lagergren,
Mirko Pavicic,
Hari B. Chhetri,
Larry M. York,
P. Doug Hyatt,
David Kainer,
Erica M. Rutter,
Kevin Flores,
Jack Bailey-Bale,
Marie Klein,
Gail Taylor,
Daniel Jacobson,
Jared Streich
Abstract:
Plant phenotyping is typically a time-consuming and expensive endeavor, requiring large groups of researchers to meticulously measure biologically relevant plant traits, and is the main bottleneck in understanding plant adaptation and the genetic architecture underlying complex traits at population scale. In this work, we address these challenges by leveraging few-shot learning with convolutional neural networks (CNNs) to segment the leaf body and visible venation of 2,906 P. trichocarpa leaf images obtained in the field. In contrast to previous methods, our approach (i) does not require experimental or image pre-processing, (ii) uses the raw RGB images at full resolution, and (iii) requires very few samples for training (e.g., just eight images for vein segmentation). Traits relating to leaf morphology and vein topology are extracted from the resulting segmentations using traditional open-source image-processing tools, validated using real-world physical measurements, and used to conduct a genome-wide association study to identify genes controlling the traits. In this way, the current work is designed to provide the plant phenotyping community with (i) methods for fast and accurate image-based feature extraction that require minimal training data, and (ii) a new population-scale data set, including 68 different leaf phenotypes, for domain scientists and machine learning researchers. All of the few-shot learning code, data, and results are made publicly available.
Submitted 18 May, 2023; v1 submitted 24 January, 2023;
originally announced January 2023.
-
Learning Equations from Biological Data with Limited Time Samples
Authors:
John T. Nardini,
John H. Lagergren,
Andrea Hawkins-Daarud,
Lee Curtin,
Bethan Morris,
Erica M. Rutter,
Kristin R. Swanson,
Kevin B. Flores
Abstract:
Equation learning methods present a promising tool to aid scientists in the modeling process for biological data. Previous equation learning studies have demonstrated that these methods can infer models from rich datasets; however, the performance of these methods in the presence of common challenges from biological data has not been thoroughly explored. We present an equation learning methodology, comprising data denoising, equation learning, model selection, and post-processing steps, that infers a dynamical systems model from noisy spatiotemporal data. The performance of this methodology is thoroughly investigated in the face of several common challenges presented by biological data, namely, sparse data sampling, large noise levels, and heterogeneity between datasets. We find that this methodology can accurately infer the correct underlying equation and predict unobserved system dynamics from a small number of time samples when the data are sampled over a time interval exhibiting both linear and nonlinear dynamics. Our findings suggest that equation learning methods can be used for model discovery and selection in many areas of biology when an informative dataset is used. We focus on glioblastoma multiforme modeling as a case study in this work to highlight how these results are informative for data-driven, modeling-based tumor invasion predictions.
Submitted 19 May, 2020;
originally announced May 2020.