-
Scalar-on-function local linear regression and beyond
Authors:
Frédéric Ferraty,
Stanislav Nagy
Abstract:
Regressing a scalar response on a random function is nowadays a common situation. In the nonparametric setting, this paper paves the way for making the local linear regression based on a projection approach a prominent method for solving this regression problem. Our asymptotic results demonstrate that the functional local linear regression outperforms its functional local constant counterpart. Beyond the estimation of the regression operator itself, the local linear regression is also a useful tool for predicting the functional derivative of the regression operator, a promising mathematical object on its own. The local linear estimator of the functional derivative is shown to be consistent. On simulated datasets we illustrate good finite sample properties of both proposed methods. On a real data example of a single-functional index model we indicate how the functional derivative of the regression operator provides an original and fast, widely applicable estimating method.
Submitted 18 July, 2019;
originally announced July 2019.
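
The abstract above describes a local linear regression built on a projection approach, which also yields an estimate of the functional derivative. A minimal sketch of that idea follows; it is not the authors' implementation. It assumes curves discretized on a common grid, a principal-component basis estimated from the sample, a quadratic kernel, and a discretized L2 distance. The function name `functional_local_linear` and its parameters `h` and `n_basis` are hypothetical.

```python
import numpy as np

def functional_local_linear(X, y, x_new, h, n_basis=3):
    """Local linear fit of y on discretized curves X (n, p) at the curve x_new.

    Returns (estimate, slope): the intercept of a kernel-weighted least-squares
    fit (the regression estimate at x_new) and the slope coefficients on the
    chosen basis (coordinates of an estimated functional derivative).
    All modelling choices here are illustrative assumptions.
    """
    # Basis: leading right singular vectors of the centered sample (PCA directions).
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    B = Vt[:n_basis]                                # (n_basis, p)

    # Kernel weights from a discretized L2 distance between curves.
    diff = X - x_new
    u = np.sqrt(np.mean(diff ** 2, axis=1)) / h
    w = np.where(u <= 1.0, 1.0 - u ** 2, 0.0)       # quadratic kernel on [0, 1]
    if np.count_nonzero(w) < n_basis + 1:
        return np.nan, np.full(n_basis, np.nan)     # too few neighbours for the fit

    # Weighted least squares of y on (1, projections of X_i - x_new).
    Z = np.column_stack([np.ones(len(y)), diff @ B.T])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(Z * sw[:, None], y * sw, rcond=None)
    return coef[0], coef[1:]
```

Setting the slope terms aside recovers a local constant (kernel-weighted average) fit, which is the counterpart the abstract compares against.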
-
Estimation of temperature-dependent growth profiles for the assessment of time of hatching in forensic entomology
Authors:
D. Pigoli,
J. A. D. Aston,
F. Ferraty,
A. Mazumder,
C. Richards,
M. J. R. Hall
Abstract:
Forensic entomology contributes important information to crime scene investigations. In this paper, we propose a method to estimate the hatching time of larvae (or maggots) based on their lengths, the temperature profile at the crime scene and experimental data on larval development. This requires the estimation of a time-dependent growth curve from experiments where larvae have been exposed to a relatively small number of constant temperature profiles. Since the temperature influences the developmental speed, a crucial step is the time alignment of the curves at different temperatures. We propose a model for time varying temperature profiles based on the local growth rate estimated from the experimental data. This allows us to estimate the most likely hatching time for a sample of larvae from the crime scene. Asymptotic properties are provided for the estimators of the growth curves and the hatching time. We explore via simulations the robustness of the method to errors in the estimated temperature profile. We also apply the methodology to data from two criminal cases from the United Kingdom.
Submitted 4 November, 2021; v1 submitted 2 September, 2017;
originally announced September 2017.
-
Stable and predictive functional domain selection with application to brain images
Authors:
Ah Yeon Park,
John A. D. Aston,
Frederic Ferraty
Abstract:
Motivated by increasing trends of relating brain images to a clinical outcome of interest, we propose a functional domain selection (FuDoS) method that effectively selects subregions of the brain associated with the outcome. Viewing each individual's brain as a 3D functional object, the statistical aim is to distinguish the region where the regression coefficient $β(t)=0$ from the region where $β(t)\neq0$, where $t$ denotes spatial location. FuDoS is composed of two stages of estimation. We first segment the brain into several small parts based on the correlation structure. Then, potential subsets are built from the obtained segments and their predictive performance is evaluated to select the best subset, augmented by a stability selection criterion. We conduct extensive simulations for both 1D and 3D functional data, and evaluate the method's effectiveness in selecting the true subregion. We also investigate the predictive ability of the selected stable regions. To find the brain regions related to cognitive ability, FuDoS is applied to the ADNI's PET data. Due to the induced sparseness, the results naturally provide more interpretable information about the relations between the regions and the outcome. Moreover, the regions selected by our analysis show high associations with the anatomical brain areas known to have memory-related functions.
Submitted 7 June, 2016;
originally announced June 2016.
-
Fast forward feature selection for the nonlinear classification of hyperspectral images
Authors:
Mathieu Fauvel,
Clement Dechesne,
Anthony Zullo,
Frédéric Ferraty
Abstract:
A fast forward feature selection algorithm is presented in this paper. It is based on a Gaussian mixture model (GMM) classifier. GMMs are used for classifying hyperspectral images. The algorithm iteratively selects the spectral features that maximize an estimate of the classification rate. The estimate is computed by k-fold cross-validation. To keep computing time low, an efficient implementation is proposed. First, the GMM can be updated when the estimate of the classification rate is computed, rather than re-estimating the full model. Second, using marginalization of the GMM, sub-models can be obtained directly from the full model learned with all the spectral features. Experimental results for two real hyperspectral data sets show that the method performs very well in terms of classification accuracy and processing time. Furthermore, the extracted model contains very few spectral channels.
Submitted 5 January, 2015;
originally announced January 2015.
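
The marginalization step mentioned in the abstract above can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes one full-covariance Gaussian per class, fits the full model once per cross-validation fold, and obtains every candidate sub-model by slicing the mean vector and covariance matrix. The fold-wise model update described in the abstract is not implemented here. The names `fit_class_gaussians`, `predict` and `forward_select` are hypothetical.

```python
import numpy as np

def fit_class_gaussians(X, y):
    """Fit one full-covariance Gaussian per class on all features (assumed model)."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        cov = np.atleast_2d(np.cov(Xc, rowvar=False)) + 1e-6 * np.eye(X.shape[1])
        params[c] = (Xc.mean(axis=0), cov, len(Xc) / len(X))
    return params

def predict(params, X, feats):
    """Classify with the features in `feats` only; the sub-model is the marginal
    of the full Gaussian, i.e. a slice of its mean and covariance."""
    classes = list(params)
    scores = []
    for c in classes:
        mean, cov, prior = params[c]
        S = cov[np.ix_(feats, feats)]
        diff = X[:, feats] - mean[feats]
        _, logdet = np.linalg.slogdet(S)
        maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(S), diff)
        scores.append(np.log(prior) - 0.5 * (logdet + maha))
    return np.array(classes)[np.argmax(np.array(scores), axis=0)]

def forward_select(X, y, n_select, k=5, seed=0):
    """Greedy forward selection maximizing the k-fold cross-validated accuracy."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    models = []
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        # Full model fitted once per fold; candidates reuse it by marginalization.
        models.append((fit_class_gaussians(X[train], y[train]), test_idx))

    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < n_select and remaining:
        def cv_rate(f):
            feats = selected + [f]
            hits = sum(np.sum(predict(m, X[idx], feats) == y[idx]) for m, idx in models)
            return hits / len(y)
        best = max(remaining, key=cv_rate)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because no model is refitted inside the selection loop, the cost of scoring a candidate is dominated by small matrix operations on the selected sub-block.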
-
An Algorithm for Nonlinear, Nonparametric Model Choice and Prediction
Authors:
Frédéric Ferraty,
Peter Hall
Abstract:
We introduce an algorithm which, in the context of nonlinear regression on vector-valued explanatory variables, chooses those combinations of vector components that provide best prediction. The algorithm devotes particular attention to components that might be of relatively little predictive value by themselves, and so might be ignored by more conventional methodology for model choice, but which, in combination with other difficult-to-find components, can be particularly beneficial for prediction. Additionally the algorithm avoids choosing vector components that become redundant once appropriate combinations of other, more relevant components are selected. It is suitable for very high dimensional problems, where it keeps computational labour in check by using a novel sequential argument, and also for more conventional prediction problems, where dimension is relatively low. We explore properties of the algorithm using both theoretical and numerical arguments.
Submitted 31 January, 2014;
originally announced January 2014.
-
Various Approaches for Predicting Land Cover in Mountain Areas
Authors:
Nathalie Villa,
Martin Paegelow,
Maria T. Camacho Olmedo,
Laurence Cornez,
Frédéric Ferraty,
Louis Ferré,
Pascal Sarda
Abstract:
Using former maps, geographers study the evolution of land cover in order to take a prospective approach to the future landscape; predictions of the future land cover, based on older maps and environmental variables, are usually made with a GIS (Geographic Information System). We propose here to compare this classical geographical approach with statistical approaches: a linear parametric model (polychotomous regression modeling) and a nonparametric one (multilayer perceptron). These methodologies have been tested on two real areas for which the land cover is known at various dates; this allows us to emphasize the benefit of these two statistical approaches compared to GIS and to discuss how GIS could be improved by the use of statistical models.
Submitted 3 May, 2007;
originally announced May 2007.
-
Modélisations prospectives de l'occupation du sol. Le cas d'une montagne méditerranéenne
Authors:
Martin Paegelow,
Nathalie Villa,
Laurence Cornez,
Frédéric Ferraty,
Louis Ferré,
Pascal Sarda
Abstract:
The authors apply three methods of prospective modelling to high-resolution georeferenced land cover data in a Mediterranean mountain area: a GIS approach, a nonlinear parametric model and a neural network. Land cover prediction at the latest known date is used to validate the models. In the context of spatio-temporal dynamics in open systems, the results are encouraging and comparable. Correct prediction scores are about 73%. The analysis of the results focuses on geographic location, land cover categories and the parametric distance to reality of the residuals. Comparing the three models shows a high degree of convergence and a relative similarity of the results obtained by the two statistical approaches compared to the GIS supervised model. Work in progress includes applying the models to other test areas and identifying their respective advantages in order to develop an integrated model.
Submitted 9 May, 2007; v1 submitted 2 May, 2007;
originally announced May 2007.
-
Advances on nonparametric regression for functional variables
Authors:
Frédéric Ferraty,
André Mas,
Philippe Vieu
Abstract:
We consider the problem of predicting a real random variable from a functional explanatory variable. The problem is attacked by means of a nonparametric kernel approach which has recently been adapted to this functional context. We derive theoretical results by giving a deep asymptotic study of the behaviour of the estimate, including mean squared convergence (with rates and precise evaluation of the constant terms) as well as asymptotic distribution. The practical use of these results relies on the ability to estimate these constants. Some perspectives in this direction are discussed, including the presentation of a functional version of bootstrapping ideas.
Submitted 3 March, 2006;
originally announced March 2006.
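
The nonparametric kernel approach referred to in the abstract above predicts a scalar response as a kernel-weighted average of observed responses, with weights driven by a distance between curves. A minimal sketch follows, assuming curves discretized on a common grid, a discretized L2 distance and a quadratic kernel; the function name `functional_nw_predict` and the bandwidth parameter `h` are hypothetical, and the paper's own choices of semi-metric and kernel may differ.

```python
import numpy as np

def functional_nw_predict(X, y, x_new, h):
    """Kernel (Nadaraya-Watson type) prediction of a scalar from a curve.

    X     : (n, p) array, n curves observed on a common grid of p points
    y     : (n,) array of scalar responses
    x_new : (p,) array, the curve at which to predict
    h     : bandwidth (> 0)
    """
    # Discretized L2 distance between x_new and each sample curve.
    d = np.sqrt(np.mean((X - x_new) ** 2, axis=1))
    u = d / h
    # Quadratic kernel supported on [0, 1]; curves farther than h get zero weight.
    w = np.where(u <= 1.0, 1.0 - u ** 2, 0.0)
    total = w.sum()
    if total == 0.0:
        return np.nan  # no sample curve within distance h of x_new
    return float(np.sum(w * y) / total)
```

The bandwidth `h` controls the trade-off between bias and variance: a small value averages over few nearby curves, a large value over many distant ones.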