Search | arXiv e-print repository

doi 10.1080/10618600.2024.2350476

Variance-Reduced Stochastic Optimization for Efficient Inference of Hidden Markov Models

Authors: Evan Sidrow, Nancy Heckman, Alexandre Bouchard-Côté, Sarah M. E. Fortune, Andrew W. Trites, Marie Auger-Méthé

Abstract: Hidden Markov models (HMMs) are popular models to identify a finite number of latent states from sequential data. However, fitting them to large data sets can be computationally demanding because most likelihood maximization techniques require iterating through the entire underlying data set for every parameter update. We propose a novel optimization algorithm that updates the parameters of an HMM without iterating through the entire data set. Namely, we combine a partial E step with variance-reduced stochastic optimization within the M step. We prove the algorithm converges under certain regularity conditions. We test our algorithm empirically using a simulation study as well as a case study of kinematic data collected using suction-cup attached biologgers from eight northern resident killer whales (Orcinus orca) off the western coast of Canada. In both, our algorithm converges in fewer epochs and to regions of higher likelihood compared to standard numerical optimization techniques. Our algorithm allows practitioners to fit complicated HMMs to large time-series data sets more efficiently than existing baselines.

Submitted 16 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

Comments: 23 pages, 7 figures. Code available at https://github.com/evsi8432/sublinear-HMM-inference

arXiv:2304.00149 [pdf]

Using online student focus groups in the development of new educational resources

Authors: Gian Carlo Diluvi, Sonja Isberg, Bruce Dunham, Nancy Heckman, Melissa Lee

Abstract: Educational resources, such as web apps and self-directed tutorials, have become popular tools for teaching and active learning. Ideally, students - the intended users of these resources - should be involved in the resource development stage. However, in practice students often only interact with fully developed resources, when it might be too late to incorporate changes. Previous work has addressed this by involving students in the development of new resources via in-person focus groups and interviews. In these, the resource developers observe students interacting with the resource. This allows developers to incorporate their observations and students' direct feedback into further development of the resource. However, as a result of the COVID-19 pandemic, carrying out in-person focus groups became infeasible due to social distancing restrictions. Instead, online meetings and classes became ubiquitous. In this work, we describe a fully-online methodology to evaluate new resources in development. Specifically, our methodology consists of carrying out student focus groups via online video conferencing software. We assessed two educational resources for introductory statistics using our methodology and found that the online setting allowed us to obtain rich, detailed information from the students. We also found online focus groups to be more efficient: students and researchers did not need to travel and scheduling was not restricted by the availability of physical space. Our findings suggest that online focus groups are an attractive alternative to in-person focus groups for student assessment of resources in development, even now that pandemic restrictions are being eased.

Submitted 31 March, 2023; originally announced April 2023.

arXiv:2101.03268 [pdf, other]

doi 10.1002/cjs.11673

Modelling multi-scale state-switching functional data with hidden Markov models

Authors: Evan Sidrow, Nancy Heckman, Sarah M. E. Fortune, Andrew W. Trites, Ian Murphy, Marie Auger-Méthé

Abstract: Data sets comprised of sequences of curves sampled at high frequencies in time are increasingly common in practice, but they can exhibit complicated dependence structures that cannot be modelled using common methods of Functional Data Analysis (FDA). We detail a hierarchical approach which treats the curves as observations from a hidden Markov model (HMM). The distribution of each curve is then defined by another fine-scale model which may involve auto-regression and require data transformations using moving-window summary statistics or Fourier analysis. This approach is broadly applicable to sequences of curves exhibiting intricate dependence structures. As a case study, we use this framework to model the fine-scale kinematic movement of a northern resident killer whale (Orcinus orca) off the coast of British Columbia, Canada. Through simulations, we show that our model produces more interpretable state estimation and more accurate parameter estimates compared to existing methods.

Submitted 8 January, 2021; originally announced January 2021.

Comments: 23 pages, 8 figures, 2 tables. Supplementary material appended to submission

Journal ref: Can J Statistics 50 (2022) 327-356

arXiv:1712.07265 [pdf, other]

Model-based curve registration via stochastic approximation EM algorithm

Authors: Eric Fu, Nancy Heckman

Abstract: Functional data often exhibit both amplitude and phase variation around a common base shape, with phase variation represented by a so-called warping function. The process of removing phase variation by curve alignment and inference of the warping functions is referred to as curve registration. When functional data are observed with substantial noise, model-based methods can be employed for simultaneous smoothing and curve registration. However, the nonlinearity of the model often renders the inference computationally challenging. In this paper, we propose an alternative method for model-based curve registration which is computationally more stable and efficient than existing approaches in the literature. We apply our method to the analysis of elephant seal dive profiles and show that more intuitive groupings can be obtained by clustering on phase variations via the predicted warping functions.

Submitted 24 June, 2018; v1 submitted 19 December, 2017; originally announced December 2017.

arXiv:1504.02813 [pdf, other]

doi 10.1002/cjs.11331

Switching nonparametric regression models for multi-curve data

Authors: Camila P. E. de Souza, Nancy E. Heckman, Helena Xu

Abstract: We develop and apply an approach for analyzing multi-curve data where each curve is driven by a latent state process. The state at any particular point determines a smooth function, forcing the individual curve to switch from one function to another. Thus each curve follows what we call a switching nonparametric regression model. We develop an EM algorithm to estimate the model parameters. We also obtain standard errors for the parameter estimates of the state process. We consider several types of state processes: independent and identically distributed, independent but depending on a covariate and Markov. Simulation studies show the frequentist properties of our estimates. We apply our methods to a data set of a building's power usage.

Submitted 13 September, 2017; v1 submitted 10 April, 2015; originally announced April 2015.

Comments: 24 pages, 4 figures

Journal ref: The Canadian Journal of Statistics 2017

arXiv:1402.1740 [pdf, other]

doi 10.1002/env.2414

Analysis of Aggregated Functional Data from Mixed Populations with Application to Energy Consumption

Authors: Amanda Lenzi, Camila P. E. de Souza, Ronaldo Dias, Nancy Garcia, Nancy E. Heckman

Abstract: Understanding the energy consumption patterns of different types of consumers is essential in any planning of energy distribution. However, obtaining consumption information for single individuals is often either not possible or too expensive. Therefore, we consider data from aggregations of energy use, that is, from sums of individuals' energy use, where each individual falls into one of C consumer classes. Unfortunately, the exact number of individuals of each class may be unknown: consumers do not always report the appropriate class, due to various factors including differential energy rates for different consumer classes. We develop a methodology to estimate the expected energy use of each class as a function of time and the true number of consumers in each class. We also provide some measure of uncertainty of the resulting estimates. To accomplish this, we assume that the expected consumption is a function of time that can be well approximated by a linear combination of B-splines. Individual consumer perturbations from this baseline are modeled as B-splines with random coefficients. We treat the reported numbers of consumers in each category as random variables with distribution depending on the true number of consumers in each class and on the probabilities of a consumer in one class reporting as another class. We obtain maximum likelihood estimates of all parameters via a maximization algorithm. We introduce a special numerical trick for calculating the maximum likelihood estimates of the true number of consumers in each class. We apply our method to a data set and study our method via simulation.

Submitted 7 February, 2014; originally announced February 2014.

Comments: 31 pages, 9 figures and 5 tables

Journal ref: Environmetrics 2016

arXiv:1312.1801 [pdf, ps, other]

doi 10.1214/12-AOAS603

Visualizing genetic constraints

Authors: Travis L. Gaydos, Nancy E. Heckman, Mark Kirkpatrick, J. R. Stinchcombe, Johanna Schmitt, Joel Kingsolver, J. S. Marron

Abstract: Principal Components Analysis (PCA) is a common way to study the sources of variation in a high-dimensional data set. Typically, the leading principal components are used to understand the variation in the data or to reduce the dimension of the data for subsequent analysis. The remaining principal components are ignored since they explain little of the variation in the data. However, evolutionary biologists gain important insights from these low variation directions. Specifically, they are interested in directions of low genetic variability that are biologically interpretable. These directions are called genetic constraints and indicate directions in which a trait cannot evolve through selection. Here, we propose studying the subspace spanned by low variance principal components by determining vectors in this subspace that are simplest. Our method and accompanying graphical displays enhance the biologist's ability to visualize the subspace and identify interpretable directions of low genetic variability that align with simple directions.

Submitted 6 December, 2013; originally announced December 2013.

Comments: Published at http://dx.doi.org/10.1214/12-AOAS603 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOAS-AOAS603

Journal ref: Annals of Applied Statistics 2013, Vol. 7, No. 2, 860-882

arXiv:1305.2227 [pdf, other]

Switching Nonparametric Regression Models and the Motorcycle Data revisited

Authors: Camila P. E. de Souza, Nancy E. Heckman

Abstract: We propose a methodology to analyze data arising from a curve that, over its domain, switches among J states. We consider a sequence of response variables, where each response y depends on a covariate x according to an unobserved state z. The states form a stochastic process and their possible values are j=1,...,J. If z equals j the expected response of y is one of J unknown smooth functions evaluated at x. We call this model a switching nonparametric regression model. We develop an EM algorithm to estimate the parameters of the latent state process and the functions corresponding to the J states. We also obtain standard errors for the parameter estimates of the state process. We conduct simulation studies to analyze the frequentist properties of our estimates. We also apply the proposed methodology to the well-known motorcycle data set treating the data as coming from more than one simulated accident run with unobserved run labels.

Submitted 22 May, 2013; v1 submitted 9 May, 2013; originally announced May 2013.

Comments: The article has one supplementary pdf file (DeSouzaHeckman-supplementA.pdf)

arXiv:1111.1915 [pdf, ps, other]

The theory and application of penalized methods or Reproducing Kernel Hilbert Spaces made easy

Authors: Nancy Heckman

Abstract: The popular cubic smoothing spline estimate of a regression function arises as the minimizer of the penalized sum of squares $\sum_j (Y_j - \mu(t_j))^2 + \lambda \int_a^b [\mu''(t)]^2 \, dt$, where the data are $t_j, Y_j$, $j=1,\ldots,n$. The minimization is taken over an infinite-dimensional function space, the space of all functions with square integrable second derivatives. But the calculations can be carried out in a finite-dimensional space. The reduction from minimizing over an infinite dimensional space to minimizing over a finite dimensional space occurs for more general objective functions: the data may be related to the function $\mu$ in another way, the sum of squares may be replaced by a more suitable expression, or the penalty, $\int_a^b [\mu''(t)]^2 \, dt$, might take a different form. This paper reviews the Reproducing Kernel Hilbert Space structure that provides a finite-dimensional solution for a general minimization problem. Particular attention is paid to penalties based on linear differential operators. In this case, one can sometimes easily calculate the minimizer explicitly, using Green's functions.

Submitted 8 November, 2011; originally announced November 2011.

MSC Class: 62G99; 46E22; 62G08

Showing 1–9 of 9 results for author: Heckman, N