-
Detecting changepoints in globally-indexed functional time series
Authors:
Drew Yarger,
J. Derek Tucker
Abstract:
In environmental and climate data, there is often an interest in determining if and when changes occur in a system. Such changes may result from localized sources in space and time like a volcanic eruption or climate geoengineering events. Detecting such events and their subsequent influence on climate has important policy implications. However, the climate system is complex, and such changes can be challenging to detect. One statistical perspective for changepoint detection is functional time series, where one observes an entire function at each time point. We will consider the context where each time point is a year, and we observe a function of temperature indexed by day of the year. Furthermore, such data is measured at many spatial locations on Earth, which motivates accommodating sets of functional time series that are spatially-indexed on a sphere. Simultaneously inferring changes that can occur at different times for different locations is challenging. We propose test statistics for detecting these changepoints, and we evaluate performance using varying levels of data complexity, including a simulation study, simplified climate model simulations, and climate reanalysis data. We evaluate changes in stratospheric temperature globally over 1984-1998. Such changes may be associated with the eruption of Mt. Pinatubo in 1991.
Submitted 10 August, 2023;
originally announced August 2023.
-
Elastic Bayesian Model Calibration
Authors:
Devin Francom,
J. Derek Tucker,
Gabriel Huerta,
Kurtis Shuler,
Daniel Ries
Abstract:
Functional data are ubiquitous in scientific modeling. For instance, quantities of interest are modeled as functions of time, space, energy, density, etc. Uncertainty quantification methods for computer models with functional response have resulted in tools for emulation, sensitivity analysis, and calibration that are widely used. However, many of these tools do not perform well when the model's parameters control both the amplitude variation of the functional output and its alignment (or phase variation). This paper introduces a framework for Bayesian model calibration when the model responses are misaligned functional data. The approach generates two types of data out of the misaligned functional responses: one that isolates the amplitude variation and one that isolates the phase variation. These two types of data are created for the computer simulation data (both of which may be emulated) and the experimental data. The calibration approach uses both types so that it seeks to match both the amplitude and phase of the experimental data. The framework is careful to respect constraints that arise especially when modeling phase variation, but also in a way that it can be done with readily available calibration software. We demonstrate the techniques on a simulated data example and on two dynamic material science problems: a strength model calibration using flyer plate experiments and an equation of state model calibration using experiments performed on the Sandia National Laboratories' Z-machine.
Submitted 15 May, 2023;
originally announced May 2023.
-
Elastic Functional Changepoint Detection of Climate Impacts from Localized Sources
Authors:
J. Derek Tucker,
Drew Yarger
Abstract:
Detecting changepoints in functional data has become an important problem as interest in monitoring climate phenomena, where the data is functional in nature, has increased. The observed data often contains both amplitude ($y$-axis) and phase ($x$-axis) variability. If not accounted for properly, true changepoints may be undetected, and the estimated underlying mean change functions will be incorrect. In this paper, an elastic functional changepoint method is developed which properly accounts for these types of variability. The method can detect amplitude and phase changepoints which current methods in the literature do not, as they focus solely on the amplitude changepoint. This method can easily be implemented using the functions directly or can be computed via functional principal component analysis to ease the computational burden. We apply the method and its non-elastic competitors to both simulated data and observed data to show its efficiency in handling data with phase variation with both amplitude and phase changepoints. We use the method to evaluate potential changes in stratospheric temperature due to the eruption of Mt. Pinatubo in the Philippines in June 1991. Using an epidemic changepoint model, we find evidence of an increase in stratospheric temperature during a period that contains the immediate aftermath of Mt. Pinatubo, with most detected changepoints occurring in the tropics as expected.
Submitted 10 August, 2023; v1 submitted 22 November, 2022;
originally announced November 2022.
-
Dimensionality Reduction using Elastic Measures
Authors:
J. Derek Tucker,
Matthew T. Martinez,
Jose M. Laborde
Abstract:
With the recent surge in big data analytics for hyper-dimensional data there is a renewed interest in dimensionality reduction techniques for machine learning applications. In order for these methods to improve performance gains and understanding of the underlying data, a proper metric needs to be identified. This step is often overlooked and metrics are typically chosen without consideration of the underlying geometry of the data. In this paper, we present a method for incorporating elastic metrics into the t-distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP). We apply our method to functional data, which is uniquely characterized by rotations, parameterization, and scale. If these properties are ignored, they can lead to incorrect analysis and poor classification performance. Through our method we demonstrate improved performance on shape identification tasks for three benchmark data sets (MPEG-7, Car data set, and Plane data set of Thankoor), where we achieve 0.77, 0.95, and 1.00 F1 score, respectively.
Submitted 19 January, 2023; v1 submitted 7 September, 2022;
originally announced September 2022.
-
Spatio-temporal extreme event modeling of terror insurgencies
Authors:
Lekha Patel,
Lyndsay Shand,
J. Derek Tucker,
Gabriel Huerta
Abstract:
Extreme events with potential deadly outcomes, such as those organized by terror groups, are highly unpredictable in nature and an imminent threat to society. In particular, quantifying the likelihood of a terror attack occurring in an arbitrary space-time region and its relative societal risk, would facilitate informed measures that would strengthen national security. This paper introduces a novel self-exciting marked spatio-temporal model for attacks whose inhomogeneous baseline intensity is written as a function of covariates. Its triggering intensity is succinctly modeled with a Gaussian Process prior distribution to flexibly capture intricate spatio-temporal dependencies between an arbitrary attack and previous terror events. By inferring the parameters of this model, we highlight specific space-time areas in which attacks are likely to occur. Furthermore, by measuring the outcome of an attack in terms of the number of casualties it produces, we introduce a novel mixture distribution for the number of casualties. This distribution flexibly handles low and high number of casualties and the discrete nature of the data through a {\it Generalized ZipF} distribution. We rely on a customized Markov chain Monte Carlo (MCMC) method to estimate the model parameters. We illustrate the methodology with data from the open source Global Terrorism Database (GTD) that correspond to attacks in Afghanistan from 2013-2018. We show that our model is able to predict the intensity of future attacks for 2019-2021 while considering various covariates of interest such as population density, number of regional languages spoken, and the density of population supporting the opposing government.
Submitted 29 October, 2021; v1 submitted 15 October, 2021;
originally announced October 2021.
-
Hidden Ancestor Graphs: Models for Detagging Property Graphs
Authors:
R. W. R. Darling,
Gregory S. Clark,
J. D. Tucker
Abstract:
Consider a graph $G$ where each vertex is visibly labelled as a member of a distinct class, but also has a hidden binary state: wild or tame. Edges with end points in the same class are called agreement edges. Premise: an edge connecting vertices in different classes -- a conflict edge -- is allowed only when at least one end point is wild. Interpret wild status as readiness to form connections with any other vertex, regardless of class -- a form of class disaffiliation. The learning goal is to classify each vertex as wild or tame using its neighborhood data. In applications such as communications metadata, bio-informatics, retailing, or bibliography, adjacency in $G$ is typically created by paths of length two in a transactional bipartite graph $B$. Class labelling, imported from a reference data source, is typically assortative, so agreement edges predominate. Conflict edges represent observed behavior (from $B$) inconsistent with prior labelling of $V(G)$. Wild vertices are those whose label is uninformative. The hidden ancestor graph constitutes a natural model for generating agreement edges and conflict edges, depending on a latent tree structure. The model is able to manifest high clustering rates and heavy-tailed degree distributions typical of social and spatial networks. It can be fitted to graph data using a few measurable graph parameters, and supplies a natural statistical classifier for wild versus tame.
Submitted 13 December, 2023; v1 submitted 18 February, 2021;
originally announced February 2021.
-
Elastic $k$-means clustering of functional data for posterior exploration, with an application to inference on acute respiratory infection dynamics
Authors:
Xiao Zang,
Sebastian Kurtek,
Oksana Chkrebtii,
J. Derek Tucker
Abstract:
We propose a new method for clustering of functional data using a $k$-means framework. We work within the elastic functional data analysis framework, which allows for decomposition of the overall variation in functional data into amplitude and phase components. We use the amplitude component to partition functions into shape clusters using an automated approach. To select an appropriate number of clusters, we additionally propose a novel Bayesian Information Criterion defined using a mixture model on principal components estimated using functional Principal Component Analysis. The proposed method is motivated by the problem of posterior exploration, wherein samples obtained from Markov chain Monte Carlo algorithms are naturally represented as functions. We evaluate our approach using a simulated dataset, and apply it to a study of acute respiratory infection dynamics in San Luis Potosí, Mexico.
Submitted 24 November, 2020;
originally announced November 2020.
-
A Deterministic Hitting-Time Moment Approach to Seed-set Expansion over a Graph
Authors:
Alexander H. Foss,
Richard B. Lehoucq,
W. Zachary Stuart,
J. Derek Tucker,
Jonathan W. Berry
Abstract:
We introduce HITMIX, a new technique for network seed-set expansion, i.e., the problem of identifying a set of graph vertices related to a given seed-set of vertices. We use the moments of the graph's hitting-time distribution to quantify the relationship of each non-seed vertex to the seed-set. This involves a deterministic calculation for the hitting-time moments that is scalable in the number of graph edges and so avoids directly sampling a Markov chain over the graph. The moments are used to fit a mixture model to estimate the probability that each non-seed vertex should be grouped with the seed set. This membership probability enables us to sort the non-seeds and threshold in a statistically-justified way. To the best of our knowledge, HITMIX is the first full statistical model for seed-set expansion that can give vertex-level membership probabilities. While HITMIX is a global method, its linear computation complexity in practice enables computations on large graphs. We have a high-performance implementation, and we present computational results on stochastic blockmodels and a small-world network from the SNAP repository. The state of the art in this problem is a collection of recently developed local methods, and we show that distinct advantages in solution quality are available if our global method can be used. In practice, we expect to be able to run HITMIX if the graph can be stored in memory.
Submitted 18 November, 2020;
originally announced November 2020.
-
Scalable Multiple Changepoint Detection for Functional Data Sequences
Authors:
Trevor Harris,
Bo Li,
James Derek Tucker
Abstract:
We propose the Multiple Changepoint Isolation (MCI) method for detecting multiple changes in the mean and covariance of a functional process. We first introduce a pair of projections to represent the variability "between" and "within" the functional observations. We then present an augmented fused lasso procedure to split the projections into multiple regions robustly. These regions act to isolate each changepoint away from the others so that the powerful univariate CUSUM statistic can be applied region-wise to identify the changepoints. Simulations show that our method accurately detects the number and locations of changepoints under many different scenarios. These include light and heavy tailed data, data with symmetric and skewed distributions, sparsely and densely sampled changepoints, and mean and covariance changes. We show that our method outperforms a recent multiple functional changepoint detector and several univariate changepoint detectors applied to our proposed projections. We also show that MCI is more robust than existing approaches and scales linearly with sample size. Finally, we demonstrate our method on a large time series of water vapor mixing ratio profiles from atmospheric emitted radiance interferometer measurements.
Submitted 20 October, 2021; v1 submitted 4 August, 2020;
originally announced August 2020.
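The abstract above applies a univariate CUSUM statistic within each isolated region. As a rough illustration only, the following sketch computes a textbook CUSUM estimate for a single change in mean of a scalar sequence. It is not the MCI method: the projections and the augmented fused lasso step are not shown, and the function name `cusum_changepoint` and the scaling by the full-sample standard deviation are assumptions made for this example.

```python
import numpy as np


def cusum_changepoint(x):
    """Textbook CUSUM estimate of a single change in mean (illustrative only).

    Returns (k_hat, stat), where k_hat is the estimated number of
    observations before the change and stat is the scaled maximum of the
    absolute centered partial sums.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    # Centered partial sums S_k for k = 1..n-1 (S_n is zero by construction).
    partial = np.cumsum(x - x.mean())[:-1]
    k_hat = int(np.argmax(np.abs(partial))) + 1
    # Simplification: the full-sample standard deviation is used for scaling,
    # which is inflated when a change is present. Assumes x is not constant.
    sigma = x.std(ddof=1)
    stat = np.abs(partial).max() / (sigma * np.sqrt(n))
    return k_hat, stat


# Hypothetical example: a mean shift after 100 observations.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(2.0, 1.0, 100)])
k_hat, stat = cusum_changepoint(x)
print(k_hat, stat)
```

In practice the statistic would be compared against a critical value or a permutation distribution before declaring a change; that step is omitted here.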
-
Multimodal Bayesian Registration of Noisy Functions using Hamiltonian Monte Carlo
Authors:
J. Derek Tucker,
Lyndsay Shand,
Kenny Chowdhary
Abstract:
Functional data registration is a necessary processing step for many applications. The observed data can be inherently noisy, often due to measurement error or natural process uncertainty, which most functional alignment methods cannot handle. A pair of functions can also have multiple optimal alignment solutions, which is not addressed in current literature. In this paper, a flexible Bayesian approach to functional alignment is presented, which appropriately accounts for noise in the data without any pre-smoothing required. Additionally, by running parallel MCMC chains, the method can account for multiple optimal alignments via the multi-modal posterior distribution of the warping functions. To most efficiently sample the warping functions, the approach relies on a modification of the standard Hamiltonian Monte Carlo to be well-defined on the infinite-dimensional Hilbert space. This flexible Bayesian alignment method is applied to both simulated data and real data sets to show its efficiency in handling noisy functions and successfully accounting for multiple optimal alignments in the posterior, characterizing the uncertainty surrounding the warping functions.
Submitted 3 March, 2021; v1 submitted 28 May, 2020;
originally announced May 2020.
-
Regression Models Using Shapes of Functions as Predictors
Authors:
Kyungmin Ahn,
J. Derek Tucker,
Wei Wu,
Anuj Srivastava
Abstract:
Functional variables are often used as predictors in regression problems. A commonly-used parametric approach, called {\it scalar-on-function regression}, uses the $\mathbf{L}^2$ inner product to map functional predictors into scalar responses. This method can perform poorly when predictor functions contain undesired phase variability, causing phases to have disproportionately large influence on the response variable. One past solution has been to perform phase-amplitude separation (as a pre-processing step) and then use only the amplitudes in the regression model. Here we propose a more integrated approach, termed elastic functional regression model (EFRM), where phase-separation is performed inside the regression model, rather than as a pre-processing step. This approach generalizes the notion of phase in functional data, and is based on the norm-preserving time warping of predictors. Due to its invariance properties, this representation provides robustness to predictor phase variability and results in improved predictions of the response variable over traditional models. We demonstrate this framework using a number of datasets involving gait signals, NMR data, and stock market prices.
Submitted 25 May, 2020; v1 submitted 5 September, 2019;
originally announced September 2019.
-
Elastic depths for detecting shape anomalies in functional data
Authors:
Trevor Harris,
James Derek Tucker,
Bo Li,
Lyndsay Shand
Abstract:
We propose a new family of depth measures called the elastic depths that can be used to greatly improve shape anomaly detection in functional data. Shape anomalies are functions that have considerably different geometric forms or features from the rest of the data. Identifying them is generally more difficult than identifying magnitude anomalies because shape anomalies are often not distinguishable from the bulk of the data with visualization methods. The proposed elastic depths use the recently developed elastic distances to directly measure the centrality of functions in the amplitude and phase spaces. Measuring shape outlyingness in these spaces provides a rigorous quantification of shape, which gives the elastic depths a strong theoretical and practical advantage over other methods in detecting shape anomalies. A simple boxplot and thresholding method is introduced to identify shape anomalies using the elastic depths. We assess the elastic depth's detection skill on simulated shape outlier scenarios and compare them against popular shape anomaly detectors. Finally, we use hurricane trajectories to demonstrate the elastic depth methodology on manifold valued functional data. Supplementary materials, including additional simulations, data examples, and an R-package are available online.
Submitted 6 August, 2020; v1 submitted 15 July, 2019;
originally announced July 2019.
-
Ab initio study of phosphorus effect on vacancy-mediated process in nickel alloys - an insight into Ni2Cr ordering
Authors:
Jia-Hong Ke,
George A. Young,
Julie D. Tucker
Abstract:
The development of long range order in nickel-chromium alloys is of great technological interest but the kinetics and mechanisms of the transformation are poorly understood. The present research utilizes a combined computational and experimental approach to elucidate the mechanism by which phosphorus accelerates the ordering rate of stoichiometric Ni_2Cr in Ni-Cr alloys. A series of Ni-33%Cr-x%P samples (in atomic percent) were fabricated with phosphorus concentrations, x = <0.005-0.1 at.% and aged between 373 and 470°C for times up to 3000 h. The first-principles modeling considers fcc Ni with dilute P as a reasonable approximation for the complex Ni-Cr-P alloy system. Calculation results show a pronounced enhancement of vacancy transport by vacancy-solute pair diffusion via consecutive exchange and rotation jumps of vacancies associated with the phosphorus atom. The energy barriers of these two migration paths are at least 0.35 eV lower than that of vacancy-atom exchange in pure Ni solvent. The analytical diffusion model predicts enhanced solvent diffusion by 2 orders of magnitude for 0.1 at.% P at 400-500°C. The model prediction is in good agreement with the evolution of micro-hardness. We characterize the micro-hardness result by a kinetic ordering model, showing a significant decrease of the activation energy of ordering transformation. These results help gauge the risk of industrial alloys developing long range order, which increases strength but degrades ductility and toughness. Specifically, minor alloying additions that bind with excess vacancies and lower the vacancy migration barrier can greatly accelerate hardening via Ni_2Cr precipitation.
Submitted 21 April, 2019;
originally announced April 2019.
-
Computer model calibration based on image warping metrics: an application for sea ice deformation
Authors:
Yawen Guan,
Christian Sampson,
J. Derek Tucker,
Won Chang,
Anirban Mondal,
Murali Haran,
Deborah Sulsky
Abstract:
Arctic sea ice plays an important role in the global climate. Sea ice models governed by physical equations have been used to simulate the state of the ice including characteristics such as ice thickness, concentration, and motion. More recent models also attempt to capture features such as fractures or leads in the ice. These simulated features can be partially misaligned or misshapen when compared to observational data, whether due to numerical approximation or incomplete physics. In order to make realistic forecasts and improve understanding of the underlying processes, it is necessary to calibrate the numerical model to field data. Traditional calibration methods based on generalized least-square metrics are flawed for linear features such as sea ice cracks. We develop a statistical emulation and calibration framework that accounts for feature misalignment and misshapenness, which involves optimally aligning model output with observed features using cutting edge image registration techniques. This work can also have application to other physical models which produce coherent structures.
Submitted 24 January, 2019; v1 submitted 15 October, 2018;
originally announced October 2018.
-
Elastic Functional Principal Component Regression
Authors:
J. Derek Tucker,
John Lewis,
Anuj Srivastava
Abstract:
We study regression using functional predictors in situations where these functions contain both phase and amplitude variability. In other words, the functions are misaligned due to errors in time measurements, and these errors can significantly degrade both model estimation and prediction performance. The current techniques either ignore the phase variability, or handle it via pre-processing, i.e., use an off-the-shelf technique for functional alignment and phase removal. We develop a functional principal component regression model which takes a comprehensive approach to handling phase and amplitude variability. The model utilizes a mathematical representation of the data known as the square-root slope function. These functions preserve the $\mathbf{L}^2$ norm under warping and are ideally suited for simultaneous estimation of regression and warping parameters. Using both simulated and real-world data sets, we demonstrate our approach and evaluate its prediction performance relative to current models. In addition, we propose an extension to functional logistic and multinomial logistic regression.
Submitted 29 May, 2018;
originally announced May 2018.
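The norm-preservation property mentioned in the abstract above can be checked numerically. The sketch below assumes the standard definition of the square-root slope function, $q(t) = \mathrm{sign}(\dot f(t))\sqrt{|\dot f(t)|}$, and uses a hypothetical test function and warping $\gamma(t) = t^2$ chosen only for illustration. The helper names `srsf` and `l2_norm` are not taken from the paper or from any particular library.

```python
import numpy as np


def srsf(f, t):
    """Square-root slope function of samples f on grid t (finite differences)."""
    df = np.gradient(f, t)
    return np.sign(df) * np.sqrt(np.abs(df))


def l2_norm(q, t):
    """L2 norm of q on grid t, using the trapezoidal rule on q**2."""
    sq = q ** 2
    return np.sqrt(np.sum(0.5 * (sq[1:] + sq[:-1]) * np.diff(t)))


# Hypothetical example on [0, 1].
t = np.linspace(0.0, 1.0, 2001)
f = np.sin(2.0 * np.pi * t)
gamma = t ** 2                          # a warping: increasing, fixes 0 and 1
f_warped = np.sin(2.0 * np.pi * gamma)  # f composed with gamma

norm_q = l2_norm(srsf(f, t), t)
norm_q_warped = l2_norm(srsf(f_warped, t), t)
print(norm_q, norm_q_warped)
```

Up to discretization error the two printed norms agree, because the SRSF of $f \circ \gamma$ equals $(q \circ \gamma)\sqrt{\dot\gamma}$ and the change of variables $s = \gamma(t)$ leaves the integral of $q^2$ unchanged.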
-
A Geometric Approach for Computing Tolerance Bounds for Elastic Functional Data
Authors:
J. Derek Tucker,
John R. Lewis,
Caleb King,
Sebastian Kurtek
Abstract:
We develop a method for constructing tolerance bounds for functional data with random warping variability. In particular, we define a generative, probabilistic model for the amplitude and phase components of such observations, which parsimoniously characterizes variability in the baseline data. Based on the proposed model, we define two different types of tolerance bounds that are able to measure both types of variability, and as a result, identify when the data has gone beyond the bounds of amplitude and/or phase. The first functional tolerance bounds are computed via a bootstrap procedure on the geometric space of amplitude and phase functions. The second functional tolerance bounds utilize functional Principal Component Analysis to construct a tolerance factor. This work is motivated by two main applications: process control and disease monitoring. The problem of statistical analysis and modeling of functional data in process control is important in determining when a production has moved beyond a baseline. Similarly, in biomedical applications, doctors use long, approximately periodic signals (such as the electrocardiogram) to diagnose and monitor diseases. In this context, it is desirable to identify abnormalities in these signals. We additionally consider a simulated example to assess our approach and compare it to two existing methods.
Submitted 25 April, 2019; v1 submitted 29 May, 2018;
originally announced May 2018.
-
Generative Models for Functional Data using Phase and Amplitude Separation
Authors:
J. Derek Tucker,
Wei Wu,
Anuj Srivastava
Abstract:
Constructing generative models for functional observations is an important task in statistical functional analysis. In general, functional data contains both phase (or x or horizontal) and amplitude (or y or vertical) variability. Traditional methods often ignore the phase variability and focus solely on the amplitude variation, using cross-sectional techniques such as fPCA for dimensional reduction and data modeling. Ignoring phase variability leads to a loss of structure in the data and inefficiency in data models. This paper presents an approach that relies on separating the phase (x-axis) and amplitude (y-axis), then modeling these components using joint distributions. This separation, in turn, is performed using a technique called elastic shape analysis of curves that involves a new mathematical representation of functional data. Then, using individual fPCAs, one each for the phase and amplitude components, and respecting the nonlinear geometry of the phase representation space, we impose joint probability models on the principal coefficients of these components. These ideas are demonstrated using random sampling for models estimated from simulated and real datasets, and the results show their superiority over models that ignore phase-amplitude separation. Furthermore, the generative models are applied to classification of functional data and achieve high performance in applications involving SONAR signals of underwater objects, handwritten signatures, and periodic body movements recorded by smart phones.
Submitted 18 December, 2012; v1 submitted 8 December, 2012;
originally announced December 2012.