-
Robust Nonparametric Regression for Compositional Data: the Simplicial--Real case
Authors:
Ana M. Bianco,
Graciela Boente,
Wenceslao González--Manteiga,
Francisco Gude Sampedro,
Ana Pérez--González
Abstract:
Statistical analysis of compositional data has gained a lot of attention due to its great potential for applications. A feature of these data is that they are multivariate vectors that lie in the simplex, that is, the components of each vector are positive and sum up to a constant value. This fact poses a challenge to the analyst due to the internal dependency of the components, which exhibit a spurious negative correlation. Since classical multivariate techniques are not appropriate in this scenario, it is necessary to endow the simplex with a suitable algebraic-geometrical structure, which is a starting point to develop adequate methodology and strategies to handle compositions. We center our attention on regression problems with real responses and compositional covariates and we adopt a nonparametric approach due to the flexibility it provides. Aware of the potential damage that outliers may produce, we introduce a robust estimator in the framework of nonparametric regression for compositional data. The performance of the estimators is investigated by means of a numerical study where different contamination schemes are simulated. Through a real data analysis the advantages of using a robust procedure are illustrated.
Submitted 21 May, 2024;
originally announced May 2024.
-
Robust variable selection for partially linear additive models
Authors:
Graciela Boente,
Alejandra Martínez
Abstract:
Among semiparametric regression models, partially linear additive models provide a useful tool to include additive nonparametric components as well as a parametric component, when explaining the relationship between the response and a set of explanatory variables. This paper concerns such models under sparsity assumptions for the covariates included in the linear component. Sparse covariates are frequent in regression problems where the task of variable selection is usually of interest. As in other settings, outliers either in the residuals or in the covariates involved in the linear component have a harmful effect. To simultaneously achieve model selection for the parametric component of the model and resistance to outliers, we combine preliminary robust estimators of the additive component, robust linear $MM-$regression estimators with a penalty such as SCAD on the coefficients in the parametric part. Under mild assumptions, consistency results and rates of convergence for the proposed estimators are derived. A Monte Carlo study is carried out to compare, under different models and contamination schemes, the performance of the robust proposal with its classical counterpart. The obtained results show the advantage of using the robust approach. Through the analysis of a real data set, we also illustrate the benefits of the proposed procedure.
Submitted 31 January, 2024; v1 submitted 19 January, 2024;
originally announced January 2024.
-
Robust estimation of heteroscedastic regression models: a brief overview and new proposals
Authors:
Conceição Amado,
Ana M. Bianco,
Graciela Boente,
Isabel M. Rodrigues
Abstract:
We collect robust proposals given in the field of regression models with heteroscedastic errors. Our motivation stems from the fact that the practitioner frequently faces the confluence of two phenomena in the context of data analysis: non--linearity and heteroscedasticity. The impact of heteroscedasticity on the precision of the estimators is well--known; however, the conjunction of these two phenomena makes handling outliers more difficult.
An iterative procedure to estimate the parameters of a heteroscedastic non--linear model is considered. The studied estimators combine weighted $MM-$regression estimators, to control the impact of high leverage points, and a robust method to estimate the parameters of the variance function.
Submitted 7 November, 2023; v1 submitted 5 November, 2023;
originally announced November 2023.
-
Threshold detection under a semiparametric regression model
Authors:
Graciela Boente,
Florencia Leonardi,
Daniela Rodriguez,
Mariela Sued
Abstract:
Linear regression models have been extensively considered in the literature. However, in some practical applications they may not be appropriate all over the range of the covariate. In this paper, a more flexible model is introduced by considering a regression model $Y=r(X)+\varepsilon$ where the regression function $r(\cdot)$ is assumed to be linear for large values in the domain of the predictor variable $X$. More precisely, we assume that $r(x)=α_0+β_0 x$ for $x> u_0$, where the value $u_0$ is identified as the smallest value satisfying such a property. A penalized procedure is introduced to estimate the threshold $u_0$. The considered proposal focusses on a semiparametric approach since no parametric model is assumed for the regression function for values smaller than $u_0$. Consistency properties of both the threshold estimator and the estimators of $(α_0,β_0)$ are derived, under mild assumptions. Through a numerical study, the small sample properties of the proposed procedure and the importance of introducing a penalization are investigated. The analysis of a real data set allows us to demonstrate the usefulness of the penalized estimators.
Submitted 17 December, 2023; v1 submitted 28 October, 2023;
originally announced October 2023.
-
Robust estimation for functional logistic regression models
Authors:
Graciela Boente,
Marina Valdora
Abstract:
This paper addresses the problem of providing robust estimators under a functional logistic regression model. Logistic regression is a popular tool in classification problems with two populations. As in functional linear regression, regularization tools are needed to compute estimators for the functional slope. The traditional methods are based on dimension reduction or penalization combined with maximum likelihood or quasi--likelihood techniques and for that reason, they may be affected by misclassified points especially if they are associated to functional covariates with atypical behaviour. The proposal given in this paper adapts some of the best practices used when the covariates are finite--dimensional to provide reliable estimations. Under regularity conditions, consistency of the resulting estimators and rates of convergence for the predictions are derived. A numerical study illustrates the finite sample performance of the proposed method and reveals its stability under different contamination scenarios. A real data example is also presented.
Submitted 15 August, 2023; v1 submitted 5 August, 2023;
originally announced August 2023.
-
Robust estimation for functional quadratic regression models
Authors:
Graciela Boente,
Daniela Parada
Abstract:
Functional quadratic regression models postulate a polynomial relationship between a scalar response and a functional covariate rather than a linear one. As in functional linear regression, vertical and especially high-leverage outliers may affect the classical estimators. For that reason, the proposal of robust procedures providing reliable estimators in such situations is an important issue. Taking into account that the functional polynomial model is equivalent to a regression model that is a polynomial of the same order in the functional principal component scores of the predictor processes, our proposal combines robust estimators of the principal directions with robust regression estimators based on a bounded loss function and a preliminary residual scale estimator. Fisher-consistency of the proposed method is derived under mild assumptions. The results of a numerical study show, for finite samples, the benefits of the robust proposal over the one based on sample principal directions and least squares. The usefulness of the proposed approach is also illustrated through the analysis of a real data set, which reveals that when the potential outliers are removed the classical and robust methods behave very similarly.
Submitted 29 May, 2023; v1 submitted 6 September, 2022;
originally announced September 2022.
-
Robust tests for equality of regression curves based on characteristic functions
Authors:
Graciela Boente,
Juan Carlos Pardo-Fernández
Abstract:
This paper focuses on the problem of testing the null hypothesis that the regression functions of several populations are equal under a general nonparametric homoscedastic regression model. It is well known that linear kernel regression estimators are sensitive to atypical responses. These distorted estimates will influence the test statistic constructed from them, so the conclusions obtained when testing equality of several regression functions may also be affected. In recent years, the use of testing procedures based on empirical characteristic functions has shown good practical properties. For that reason, to provide more reliable inferences, we construct a test statistic that combines characteristic functions and residuals obtained from a robust smoother under the null hypothesis. The asymptotic distribution of the test statistic is studied under the null hypothesis and under root-$n$ contiguous alternatives. A Monte Carlo study is performed to compare the finite sample behaviour of the proposed test with the classical one obtained using local averages. The reported numerical experiments show the advantage of the proposed methodology over the one based on Nadaraya-Watson estimators for finite samples. An illustration on a real data set is also provided, which allows us to investigate the sensitivity of the $p$-value to the bandwidth selection.
Submitted 31 August, 2023; v1 submitted 24 May, 2022;
originally announced May 2022.
-
Asymptotic behaviour of penalized robust estimators in logistic regression when dimension increases
Authors:
Ana M. Bianco,
Graciela Boente,
Gonzalo Chebi
Abstract:
Penalized $M$-estimators for logistic regression models have been previously studied for fixed dimension in order to obtain sparse statistical models and automatic variable selection. In this paper, we derive asymptotic results for penalized $M$-estimators when the dimension $p$ grows to infinity with the sample size $n$. Specifically, we obtain consistency and rates of convergence results for some choices of the penalty function. Moreover, we prove that these estimators consistently select variables with probability tending to 1 and derive their asymptotic distribution.
Submitted 4 August, 2023; v1 submitted 28 January, 2022;
originally announced January 2022.
-
Estimators for covariate-adjusted ROC curves with missing biomarkers values
Authors:
Ana M. Bianco,
Graciela Boente,
Wenceslao González-Manteiga,
Ana Pérez-González
Abstract:
In this paper, we present three estimators of the ROC curve when missing observations arise among the biomarkers. Two of the procedures assume that we have covariates that allow us to estimate the propensity, and the estimators are obtained using an inverse probability weighting method or a smoothed version of it. The other one assumes that the covariates are related to the biomarkers through a regression model which enables us to construct convolution--based estimators of the distribution and quantile functions. Consistency results are obtained under mild conditions. Through a numerical study we evaluate the finite sample performance of the different proposals. A real data set is also analysed.
Submitted 17 January, 2022;
originally announced January 2022.
-
A robust spline approach in partially linear additive models
Authors:
Graciela Boente,
Alejandra Mercedes Martinez
Abstract:
Partially linear additive models generalize linear ones since they model the relation between a response variable and covariates by assuming that some covariates have a linear relation with the response but each of the others enter through unknown univariate smooth functions. The harmful effect of outliers either in the residuals or in the covariates involved in the linear component has been described in the situation of partially linear models, that is, when only one nonparametric component is involved in the model. When dealing with additive components, the problem of providing reliable estimators when atypical data arise, is of practical importance motivating the need of robust procedures. Hence, we propose a family of robust estimators for partially linear additive models by combining $B-$splines with robust linear regression estimators. We obtain consistency results, rates of convergence and asymptotic normality for the linear components, under mild assumptions. A Monte Carlo study is carried out to compare the performance of the robust proposal with its classical counterpart under different models and contamination schemes. The numerical experiments show the advantage of the proposed methodology for finite samples. We also illustrate the usefulness of the proposed approach on a real data set.
Submitted 4 August, 2023; v1 submitted 27 July, 2021;
originally announced July 2021.
-
Robust functional principal components for sparse longitudinal data
Authors:
Graciela Boente,
Matias Salibian-Barrera
Abstract:
In this paper we review existing methods for robust functional principal component analysis (FPCA) and propose a new method for FPCA that can be applied to longitudinal data where only a few observations per trajectory are available. This method is robust against the presence of atypical observations, and can also be used to derive a new non-robust FPCA approach for sparsely observed functional data. We use local regression to estimate the values of the covariance function, taking advantage of the fact that for elliptically distributed random vectors the conditional location parameter of some of its components given others is a linear function of the conditioning set. This observation allows us to obtain robust FPCA estimators by using robust local regression methods. The finite sample performance of our proposal is explored through a simulation study that shows that, as expected, the robust method outperforms existing alternatives when the data are contaminated. Furthermore, we also see that for samples that do not contain outliers the non-robust variant of our proposal compares favourably to the existing alternative in the literature. A real data example is also presented.
Submitted 2 December, 2020;
originally announced December 2020.
-
Robust smoothed canonical correlation analysis for functional data
Authors:
Graciela Boente,
Nadia Kudraszow
Abstract:
This paper provides robust estimators for the first canonical correlation and directions of random elements on Hilbert separable spaces by using robust association and scale measures combined with basis expansion and/or penalizations as a regularization tool. Under regularity conditions, the resulting estimators are consistent.
Submitted 20 November, 2020;
originally announced November 2020.
-
A robust approach for ROC curves with covariates
Authors:
Ana M. Bianco,
Graciela Boente,
Wenceslao Gonzalez-Manteiga
Abstract:
The Receiver Operating Characteristic (ROC) curve is a useful tool that measures the discriminating power of a continuous variable or the accuracy of a pharmaceutical or medical test to distinguish between two conditions or classes. In certain situations, the practitioner may be able to measure some covariates related to the diagnostic variable which can increase the discriminating power of the ROC curve. To protect against the existence of atypical data among the observations, a procedure to obtain robust estimators for the ROC curve in presence of covariates is introduced. The considered proposal focusses on a semiparametric approach which fits a location-scale regression model to the diagnostic variable and considers empirical estimators of the regression residuals distributions. Robust parametric estimators are combined with adaptive weighted empirical distribution estimators to down-weight the influence of outliers. The uniform consistency of the proposal is derived under mild assumptions. A Monte Carlo study is carried out to compare the performance of the robust proposed estimators with the classical ones both, in clean and contaminated samples. A real data set is also analysed.
Submitted 23 July, 2022; v1 submitted 30 June, 2020;
originally announced July 2020.
-
Robust estimation for semi-functional linear regression models
Authors:
Graciela Boente,
Matias Salibian-Barrera,
Pablo Vena
Abstract:
Semi-functional linear regression models postulate a linear relationship between a scalar response and a functional covariate, and also include a non-parametric component involving a univariate explanatory variable. It is of practical importance to obtain estimators for these models that are robust against high-leverage outliers, which are generally difficult to identify and may cause serious damage to least squares and Huber-type $M$-estimators. For that reason, robust estimators for semi-functional linear regression models are constructed combining $B$-splines to approximate both the functional regression parameter and the nonparametric component with robust regression estimators based on a bounded loss function and a preliminary residual scale estimator. Consistency and rates of convergence for the proposed estimators are derived under mild regularity conditions. The reported numerical experiments show the advantage of the proposed methodology over the classical least squares and Huber-type $M$-estimators for finite samples. The analysis of real examples illustrates that the robust estimators provide better predictions for non-outlying points than the classical ones, and that when potential outliers are removed from the training and test sets both methods behave very similarly.
Submitted 5 August, 2023; v1 submitted 29 June, 2020;
originally announced June 2020.
-
Principal points and elliptical distributions from the multivariate setting to the functional case
Authors:
Juan Lucas Bali,
Graciela Boente
Abstract:
The $k$ principal points of a random vector $\mathbf{X}$ are defined as a set of points which minimize the expected squared distance between $\mathbf{X}$ and the nearest point in the set. They are thoroughly studied in Flury (1990, 1993), Tarpey (1995) and Tarpey, Li and Flury (1995). For their treatment, the examination is usually restricted to the family of elliptical distributions. In this paper, we present an extension of the previous results to the functional elliptical distribution case, i.e., when dealing with random elements over a separable Hilbert space ${\cal H}$. Principal points for gaussian processes were defined in Tarpey and Kinateder (2003). In this paper, we generalize the concepts of principal points, self-consistent points and elliptical distributions so as to fit them in this functional framework. Results linking self-consistency and the eigenvectors of the covariance operator are re-obtained in this new setting as well as an explicit formula for the $k=2$ case so as to include elliptically distributed random elements in ${\cal H}$.
Submitted 7 June, 2020;
originally announced June 2020.
-
Robust location estimators in regression models with covariates and responses missing at random
Authors:
Ana M. Bianco,
Graciela Boente,
Wenceslao González-Manteiga,
Ana Pérez-González
Abstract:
This paper deals with robust marginal estimation under a general regression model when missing data occur in the response and also in some of the covariates. The target is a marginal location parameter which is given through an $M$-functional. To obtain robust Fisher--consistent estimators, properly defined marginal distribution function estimators are considered. These estimators avoid the bias due to missing values by assuming a missing at random condition. Three methods are considered to estimate the marginal distribution function, which allows us to obtain the $M$-location of interest: the well-known inverse probability weighting, a convolution--based method that makes use of the regression model and an augmented inverse probability weighting procedure that protects against misspecification. The robust proposed estimators and the classical ones are compared through a numerical study under different missing models including clean and contaminated samples. We illustrate the estimators' behaviour under a nonlinear model. A real data set is also analysed.
Submitted 7 May, 2020;
originally announced May 2020.
-
Penalized robust estimators in logistic regression with applications to sparse models
Authors:
Ana M. Bianco,
Graciela Boente,
Gonzalo Chebi
Abstract:
Sparse covariates are frequent in classification and regression problems and in these settings the task of variable selection is usually of interest. As is well known, sparse statistical models correspond to situations where there are only a small number of non--zero parameters and for that reason, they are much easier to interpret than dense ones. In this paper, we focus on the logistic regression model and our aim is to address robust and penalized estimation for the regression parameter. We introduce a family of penalized weighted $M$-type estimators for the logistic regression parameter that are stable against atypical data. We explore different penalization functions and we introduce the so--called Sign penalization. This new penalty has the advantage that it depends only on one penalty parameter, avoiding arbitrary tuning constants. We discuss the variable selection capability of the given proposals as well as their asymptotic behaviour. Through a numerical study, we compare the finite sample performance of the proposals corresponding to different penalized estimators, either robust or classical, under different scenarios. A robust cross--validation criterion is also presented. The analysis of two real data sets allows us to investigate the stability of the penalized estimators in the presence of outliers.
Submitted 12 February, 2020; v1 submitted 1 November, 2019;
originally announced November 2019.
-
The spatial sign covariance operator: Asymptotic results and applications
Authors:
Graciela Boente,
Daniela Rodriguez,
Mariela Sued
Abstract:
Due to the increasing recording capability, functional data analysis has become an important research topic. For functional data the study of outlier detection and/or the development of robust statistical procedures has started recently. One robust alternative to the sample covariance operator is the sample spatial sign covariance operator. In this paper, we study the asymptotic behaviour of the sample spatial sign covariance operator when location is unknown. Among other possible applications of the obtained results, we derive the asymptotic distribution of the principal directions obtained from the sample spatial sign covariance operator and we develop a test to detect differences between the scatter operators of two populations. In particular, the test performance is illustrated through a Monte Carlo study for small sample sizes.
Submitted 11 April, 2018;
originally announced April 2018.
-
Robust estimators in a generalized partly linear regression model under monotony constraints
Authors:
Graciela Boente,
Daniela Rodriguez,
Pablo Vena
Abstract:
In this paper, we consider the situation in which the observations follow an isotonic generalized partly linear model. Under this model, the mean of the responses is modelled, through a link function, linearly on some covariates and nonparametrically on a univariate regressor in such a way that the nonparametric component is assumed to be a monotone function. A class of robust estimates for the monotone nonparametric component and for the regression parameter, related to the linear one, is defined. The robust estimators are based on a spline approach combined with a score function which bounds large values of the deviance. As an application, we consider the isotonic partly linear log--Gamma regression model. Through a Monte Carlo study, we investigate the performance of the proposed estimators under a partly linear log--Gamma regression model with increasing nonparametric component.
Submitted 29 November, 2018; v1 submitted 22 February, 2018;
originally announced February 2018.
-
Robust estimation in single index models when the errors have a unimodal density with unknown nuisance parameter
Authors:
Claudio Agostinelli,
Ana M. Bianco,
Graciela Boente
Abstract:
In this paper, we propose a robust profile estimation method for the parametric and nonparametric components of a single index model when the errors have a strongly unimodal density with unknown nuisance parameter. Under regularity conditions, we derive consistency results for the link function estimators as well as consistency and asymptotic distribution results for the single index parameter estimators. Under a log--Gamma model, the sensitivity to anomalous observations is studied by means of the empirical influence curve. We also discuss a robust $K$-fold procedure to select the smoothing parameters involved. A numerical study is conducted to compare the small sample performance of the robust proposal with that of its classical counterparts, both for errors following a log--Gamma model and for contaminated schemes. The numerical experiment shows the good robustness properties of the proposed estimators and the advantages of considering a robust approach instead of the classical one.
Submitted 24 January, 2018; v1 submitted 15 September, 2017;
originally announced September 2017.
-
Marginal integration $M-$estimators for additive models
Authors:
Graciela Boente,
Alejandra Martinez
Abstract:
Additive regression models have a long history in multivariate nonparametric regression. They provide a model in which each additive component depends only on a single explanatory variable, which allows estimators to attain the optimal univariate rate. Beyond backfitting, marginal integration is a common procedure to estimate each component. In this paper, we propose a robust estimator of the additive components which combines local polynomials on the component to be estimated and marginal integration. The proposed estimators are consistent and asymptotically normally distributed. A simulation study shows the advantage of the proposal over the classical one when outliers are present in the responses, leading to estimators with good robustness and efficiency properties.
Submitted 14 September, 2015;
originally announced September 2015.
-
Conditional tests for elliptical symmetry using robust estimators
Authors:
Ana M. Bianco,
Graciela Boente,
Isabel M. Rodrigues
Abstract:
This paper presents a procedure for testing the hypothesis that the underlying distribution of the data is elliptical when using robust location and scatter estimators instead of the sample mean and covariance matrix. Under mild assumptions that include elliptical distributions without first moments, we derive the asymptotic behaviour of the test statistic under the null hypothesis and under special alternatives. Numerical experiments compare the behaviour of the tests based on the sample mean and covariance matrix with that of the tests based on robust estimators, under various elliptical distributions and different alternatives. The comparison considers not only the observed level and power but also the size-corrected relative exact power, which provides a tool to assess the ability of the test statistic to detect alternatives. We also provide a numerical comparison with other competing tests.
Submitted 19 February, 2015;
originally announced February 2015.
-
Testing equality between several populations covariance operators
Authors:
Graciela Boente,
Daniela Rodriguez,
Mariela Sued
Abstract:
In many situations, when dealing with several populations, equality of the covariance operators is assumed. An important issue is to check whether this assumption holds before making other inferences. In this paper, we develop a test for comparing covariance operators of several functional data samples. The proposed test is based on the Hilbert--Schmidt norm of the difference between estimated covariance operators. In particular, when dealing with two populations, the test statistic is just the squared norm of the difference between the two covariance operator estimators. The asymptotic behaviour of the test statistic under the null and under local alternatives is obtained. Since the quantiles of the asymptotic null distribution of the statistic are not easy to obtain, a bootstrap procedure to compute the critical values is considered. The performance of the test statistic for small sample sizes is illustrated through a Monte Carlo study.
Submitted 18 November, 2016; v1 submitted 28 April, 2014;
originally announced April 2014.
-
Robust functional principal components: A projection-pursuit approach
Authors:
Juan Lucas Bali,
Graciela Boente,
David E. Tyler,
Jane-Ling Wang
Abstract:
In many situations, data are recorded over a period of time and may be regarded as realizations of a stochastic process. In this paper, robust estimators for the principal components are considered by adapting the projection pursuit approach to the functional data setting. Our approach combines robust projection-pursuit with different smoothing methods. Consistency of the estimators is shown under mild assumptions. The performance of the classical and robust procedures is compared in a simulation study under different contamination schemes.
Submitted 9 March, 2012;
originally announced March 2012.
-
Robust estimates in generalized partially linear models
Authors:
Graciela Boente,
Xuming He,
Jianhui Zhou
Abstract:
In this paper, we introduce a family of robust estimates for the parametric and nonparametric components under a generalized partially linear model, where the data are modeled by $y_i|(\mathbf{x}_i,t_i)\sim F(\cdot,\mu_i)$ with $\mu_i=H(\eta(t_i)+\mathbf{x}_i^{\mathrm{T}}\beta)$, for some known distribution function $F$ and link function $H$. It is shown that the estimates of $\beta$ are root-$n$ consistent and asymptotically normal. Through a Monte Carlo study, the performance of these estimators is compared with that of the classical ones.
Submitted 1 August, 2007;
originally announced August 2007.