Showing 1–32 of 32 results for author: González-Manteiga, W

  1. arXiv:2403.12711

    stat.ME math.ST stat.AP

    Tests for categorical data beyond Pearson: A distance covariance and energy distance approach

    Authors: Fernando Castro-Prado, Wenceslao González-Manteiga, Javier Costas, Fernando Facal, Dominic Edelmann

    Abstract: Categorical variables are of uttermost importance in biomedical research. When two of them are considered, it is often the case that one wants to test whether or not they are statistically dependent. We show weaknesses of classical methods -- such as Pearson's and the G-test -- and we propose testing strategies based on distances that lack those drawbacks. We first develop this theory for classica…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: 15 pages with 2 figures

  2. Testing for linearity in scalar-on-function regression with responses missing at random

    Authors: Manuel Febrero-Bande, Pedro Galeano, Eduardo García-Portugués, Wenceslao González-Manteiga

    Abstract: A goodness-of-fit test for the Functional Linear Model with Scalar Response (FLMSR) with responses Missing at Random (MAR) is proposed in this paper. The test statistic relies on a marked empirical process indexed by the projected functional covariate and its distribution under the null hypothesis is calibrated using a wild bootstrap procedure. The computation and performance of the test rely on h…

    Submitted 22 March, 2024; v1 submitted 10 April, 2023; originally announced April 2023.

    Comments: 21 pages, 6 figures, 4 tables

    MSC Class: 62G10; 62J05; 62G09

    Journal ref: Computational Statistics, 2024

  3. arXiv:2208.08420

    stat.AP

    A Comparative Review of Specification Tests for Diffusion Models

    Authors: Alejandra López-Pérez, Manuel Febrero-Bande, Wenceslao González-Manteiga

    Abstract: Diffusion models play an essential role in modeling continuous-time stochastic processes in the financial field. Therefore, several proposals have been developed in the last decades to test the specification of stochastic differential equations. We provide a survey to collect some developments on goodness-of-fit tests for diffusion models and implement these methods to illustrate their finite samp…

    Submitted 17 August, 2022; originally announced August 2022.

  4. arXiv:2208.08415

    stat.ME

    Estimation and Specification Test for Diffusion Models with Stochastic Volatility

    Authors: Alejandra López-Pérez, Manuel Febrero-Bande, Wenceslao González-Manteiga

    Abstract: Given the importance of continuous-time stochastic volatility models to describe the dynamics of interest rates, we propose a goodness-of-fit test for the parametric form of the drift and diffusion functions, based on a marked empirical process of the residuals. The test statistics are constructed using a continuous functional (Kolmogorov-Smirnov and Cramér-von Mises) over the empirical processes.…

    Submitted 17 August, 2022; originally announced August 2022.

  5. arXiv:2208.00701

    stat.ME stat.AP

    Novel specification tests for additive concurrent model formulation based on martingale difference divergence

    Authors: Laura Freijeiro-González, Manuel Febrero-Bande, Wenceslao González-Manteiga

    Abstract: Novel significance tests are proposed for the quite general additive concurrent model formulation without the need of model, error structure preliminary estimation or the use of tuning parameters. Making use of the martingale difference divergence coefficient, we propose new tests to measure the conditional mean independence in the concurrent model framework taking under consideration all observed…

    Submitted 1 August, 2022; originally announced August 2022.

  6. arXiv:2206.12821

    stat.ME

    A goodness-of-fit test for functional time series with applications to Ornstein-Uhlenbeck processes

    Authors: J. Álvarez-Liébana, A. López-Pérez, W. González-Manteiga, M. Febrero-Bande

    Abstract: High-frequency financial data can be collected as a sequence of curves over time; for example, as intra-day price, currently one of the topics of greatest interest in finance. The Functional Data Analysis framework provides a suitable tool to extract the information contained in the shape of the daily paths, often unavailable from classical statistical methods. In this paper, a novel goodness-of-f…

    Submitted 26 June, 2022; originally announced June 2022.

  7. arXiv:2202.12019

    stat.AP stat.ML

    Functional Classification of Bitcoin Addresses

    Authors: Manuel Febrero-Bande, Wenceslao González-Manteiga, Brenda Prallon, Yuri F. Saporito

    Abstract: This paper proposes a classification model for predicting the main activity of bitcoin addresses based on their balances. Since the balances are functions of time, we apply methods from functional data analysis; more specifically, the features of the proposed classification model are the functional principal components of the data. Classifying bitcoin addresses is a relevant problem for two main r…

    Submitted 17 July, 2022; v1 submitted 24 February, 2022; originally announced February 2022.

    Comments: Keywords: Bitcoin market, Darknet market, Functional Data Analysis, Functional Classification, Functional Principal Components

  8. arXiv:2201.06483

    stat.ME

    Estimators for covariate-adjusted ROC curves with missing biomarkers values

    Authors: Ana M. Bianco, Graciela Boente, Wenceslao González-Manteiga, Ana Pérez-González

    Abstract: In this paper, we present three estimators of the ROC curve when missing observations arise among the biomarkers. Two of the procedures assume that we have covariates that allow to estimate the propensity and the estimators are obtained using an inverse probability weighting method or a smoothed version of it. The other one assumes that the covariates are related to the biomarkers through a regres…

    Submitted 17 January, 2022; originally announced January 2022.

  9. arXiv:2105.13080

    stat.ME

    A review of goodness-of-fit tests for models involving functional data

    Authors: Wenceslao González-Manteiga, Rosa M. Crujeiras, Eduardo García-Portugués

    Abstract: A sizable amount of goodness-of-fit tests involving functional data have appeared in the last decade. We provide a relatively compact revision of most of these contributions, within the independent and identically distributed framework, by reviewing goodness-of-fit tests for distribution and regression models with functional predictor and either scalar or functional response.

    Submitted 27 May, 2021; originally announced May 2021.

    Comments: 9 pages

    MSC Class: 62R10; 62G10; 62J05

  10. A test for comparing conditional ROC curves with multidimensional covariates

    Authors: Arís Fanjul-Hevia, Juan Carlos Pardo-Fernández, Ingrid Van Keilegom, Wenceslao González-Manteiga

    Abstract: The comparison of Receiver Operating Characteristic (ROC) curves is frequently used in the literature to compare the discriminatory capability of different classification procedures based on diagnostic variables. The performance of these variables can be sometimes influenced by the presence of other covariates, and thus they should be taken into account when making the comparison. A new non-parame…

    Submitted 8 February, 2021; originally announced February 2021.

    Journal ref: Journal of Applied Statistics (2022)

  11. arXiv:2012.11470

    stat.ME

    A critical review of LASSO and its derivatives for variable selection under dependence among covariates

    Authors: Laura Freijeiro-González, Manuel Febrero-Bande, Wenceslao González-Manteiga

    Abstract: We study the limitations of the well known LASSO regression as a variable selector when there exists dependence structures among covariates. We analyze both the classic situation with $n\geq p$ and the high dimensional framework with $p>n$. Restrictive properties of this methodology to guarantee optimality, as well as the inconveniences in practice, are analyzed. Examples of these drawbacks are sh…

    Submitted 21 December, 2020; originally announced December 2020.

    Comments: 26 pages, 16 figures

  12. arXiv:2012.05285

    math.ST q-bio.GN stat.AP

    Testing for genetic interactions in complex disease with distance correlation

    Authors: Fernando Castro-Prado, Javier Costas, Dominic Edelmann, Wenceslao González-Manteiga, David R. Penas

    Abstract: Understanding epistasis (genetic interaction) may shed some light on the genomic basis of common diseases, including disorders of maximum interest due to their high socioeconomic burden, like schizophrenia. Distance correlation is an association measure that characterises general statistical independence between random variables, not only the linear one. Here, we propose distance correlation as a…

    Submitted 27 April, 2023; v1 submitted 9 December, 2020; originally announced December 2020.

    Comments: 15 pages with 3 figures, plus a 10-page supplement. This document supersedes a much older version of the manuscript, in which we used other theoretical and computational approaches. Simulations, real data analyses and the writing of the paper have also been improved

  13. arXiv:2009.14150

    math.ST math.PR stat.ME

    Nonparametric independence tests in metric spaces: What is known and what is not

    Authors: Fernando Castro-Prado, Wenceslao González-Manteiga

    Abstract: Distance correlation is a recent extension of Pearson's correlation, that characterises general statistical independence between Euclidean-space-valued random variables, not only linear relations. This review delves into how and when distance correlation can be extended to metric spaces, combining the information that is available in the literature with some original remarks and proofs, in a way t…

    Submitted 29 September, 2020; originally announced September 2020.

    Comments: 18 pages with no figures

  14. Goodness-of-fit tests for functional linear models based on integrated projections

    Authors: Eduardo García-Portugués, Javier Álvarez-Liébana, Gonzalo Álvarez-Pérez, Wenceslao González-Manteiga

    Abstract: Functional linear models are one of the most fundamental tools to assess the relation between two random variables of a functional or scalar nature. This contribution proposes a goodness-of-fit test for the functional linear model with functional response that neatly adapts to functional/scalar responses/predictors. In particular, the new goodness-of-fit test extends a previous proposal for scalar…

    Submitted 22 August, 2020; originally announced August 2020.

    Comments: 7 pages, 2 figures

    MSC Class: 62G10; 62J05; 62G09

    Journal ref: In Aneiros, G., Horová, I., Hušková, M. and Vieu, P., editors, Functional and High-Dimensional Statistics and Related Fields, pages 107-114. Springer, 2020

  15. arXiv:2007.00150

    stat.ME

    A robust approach for ROC curves with covariates

    Authors: Ana M. Bianco, Graciela Boente, Wenceslao Gonzalez-Manteiga

    Abstract: The Receiver Operating Characteristic (ROC) curve is a useful tool that measures the discriminating power of a continuous variable or the accuracy of a pharmaceutical or medical test to distinguish between two conditions or classes. In certain situations, the practitioner may be able to measure some covariates related to the diagnostic variable which can increase the discriminating power of the RO…

    Submitted 23 July, 2022; v1 submitted 30 June, 2020; originally announced July 2020.

    MSC Class: 62F35

  16. arXiv:2005.03511

    stat.ME

    Robust location estimators in regression models with covariates and responses missing at random

    Authors: Ana M. Bianco, Graciela Boente, Wenceslao González-Manteiga, Ana Pérez-González

    Abstract: This paper deals with robust marginal estimation under a general regression model when missing data occur in the response and also in some of covariates. The target is a marginal location parameter which is given through an $M$-functional. To obtain robust Fisher-consistent estimators, properly defined marginal distribution function estimators are considered. These estimators avoid the bias due t…

    Submitted 7 May, 2020; originally announced May 2020.

  17. A goodness-of-fit test for the functional linear model with functional response

    Authors: Eduardo García-Portugués, Javier Álvarez-Liébana, Gonzalo Álvarez-Pérez, Wenceslao González-Manteiga

    Abstract: The Functional Linear Model with Functional Response (FLMFR) is one of the most fundamental models to assess the relation between two functional random variables. In this paper, we propose a novel goodness-of-fit test for the FLMFR against a general, unspecified, alternative. The test statistic is formulated in terms of a Cramér-von Mises norm over a doubly-projected empirical process which, using…

    Submitted 21 September, 2020; v1 submitted 17 September, 2019; originally announced September 2019.

    Comments: 24 pages, 2 figures, 10 tables. Supplementary material: 2 pages, 1 figure

    MSC Class: 62G10; 62J05; 62G09

    Journal ref: Scandinavian Journal of Statistics, 48(2):502-528, 2021

  18. Smoothing-based tests with directional random variables

    Authors: Eduardo García-Portugués, Rosa M. Crujeiras, Wenceslao González-Manteiga

    Abstract: Testing procedures for assessing specific parametric model forms, or for checking the plausibility of simplifying assumptions, play a central role in the mathematical treatment of the uncertain. No certain answers are obtained by testing methods, but at least the uncertainty of these answers is properly quantified. This is the case for tests designed on the two most general data generating mechani…

    Submitted 21 September, 2020; v1 submitted 31 March, 2018; originally announced April 2018.

    Comments: 8 pages, 2 figures

    MSC Class: 62H11; 62G10; 62G07; 62G08

    Journal ref: In Gil, E., Gil, E., Gil, J. and Gil, M. Á, editors, The Mathematics of the Uncertain, pages 175-184. Springer, 2018

  19. arXiv:1801.00736

    stat.ME stat.AP stat.CO

    Variable selection in Functional Additive Regression Models

    Authors: Manuel Febrero-Bande, Wenceslao González-Manteiga, Manuel Oviedo de la Fuente

    Abstract: This paper considers the problem of variable selection in regression models in the case of functional variables that may be mixed with other type of variables (scalar, multivariate, directional, etc.). Our proposal begins with a simple null model and sequentially selects a new variable to be incorporated into the model based on the use of distance correlation proposed by Székely et al. (2007). For th…

    Submitted 11 April, 2018; v1 submitted 2 January, 2018; originally announced January 2018.

    Comments: 23 pages, 4 figures

    MSC Class: 62J02; 62J05; 62J07; 62P30

  20. arXiv:1709.07716

    stat.ME stat.AP

    Testing first-order intensity model in non-homogeneous Poisson point processes with covariates

    Authors: M. I. Borrajo, W. González-Manteiga, M. D. Martínez-Miranda

    Abstract: Modelling the first-order intensity function is one of the main aims in point process theory, and it has been approached so far from different perspectives. One appealing model describes the intensity as a function of a spatial covariate. In the recent literature, estimation theory and several applications have been developed assuming this model, but without formally checking this assumption. In t…

    Submitted 2 July, 2018; v1 submitted 22 September, 2017; originally announced September 2017.

    Comments: 30 pages (23 main doc + 7 appendix); 9 figures; 3 tables

    MSC Class: 62G10; 62G20; 60G55; 60-02

  21. arXiv:1703.03213

    stat.ME stat.AP

    Bootstrapping kernel intensity estimation for nonhomogeneous point processes depending on spatial covariates

    Authors: M. I. Borrajo, W. González-Manteiga, M. D. Martínez-Miranda

    Abstract: In the spatial point process context, kernel intensity estimation has been mainly restricted to exploratory analysis due to its lack of consistency. Different methods have been analysed to overcome this problem, and the inclusion of covariates resulted to be one possible solution. In this paper we focus on defining a theoretical framework to derive a consistent kernel intensity estimator using…

    Submitted 18 May, 2018; v1 submitted 9 March, 2017; originally announced March 2017.

    Comments: 32 pages, 7 figures (15 images), 4 tables

    MSC Class: 62G05; 62G09; 62H11; 60G55; 60-08

  22. Goodness-of-fit tests for the functional linear model based on randomly projected empirical processes

    Authors: Juan A. Cuesta-Albertos, Eduardo García-Portugués, Manuel Febrero-Bande, Wenceslao González-Manteiga

    Abstract: We consider marked empirical processes indexed by a randomly projected functional covariate to construct goodness-of-fit tests for the functional linear model with scalar response. The test statistics are built from continuous functionals over the projected process, resulting in computationally efficient tests that exhibit root-n convergence rates and circumvent the curse of dimensionality. The we…

    Submitted 21 September, 2020; v1 submitted 29 January, 2017; originally announced January 2017.

    Comments: Paper: 23 pages, 4 figures, 1 table. Supplementary material: 17 pages, 4 figures, 3 tables

    MSC Class: 62G10; 62J05; 62G09

    Journal ref: The Annals of Statistics, 47(1):439-467, 2019

  23. Bandwidth selection for kernel density estimation with length-biased data

    Authors: María Isabel Borrajo, Wenceslao González-Manteiga, María Dolores Martínez-Miranda

    Abstract: Length-biased data are a particular case of weighted data, which arise in many situations: biomedicine, quality control or epidemiology among others. In this paper we study the theoretical properties of kernel density estimation in the context of length-biased data, proposing two consistent bootstrap methods that we use for bandwidth selection. Apart from the bootstrap bandwidth selectors we sugge…

    Submitted 13 December, 2016; v1 submitted 17 June, 2016; originally announced June 2016.

    Comments: 35 pages

    MSC Class: 62G07; 62G09; 62G20

  24. A lack-of-fit test for quantile regression models with high-dimensional covariates

    Authors: Mercedes Conde-Amboage, César Sánchez-Sellero, Wenceslao González-Manteiga

    Abstract: We propose a new lack-of-fit test for quantile regression models that is suitable even with high-dimensional covariates. The test is based on the cumulative sum of residuals with respect to unidimensional linear projections of the covariates. The test adapts concepts proposed by Escanciano (Econometric Theory, 22, 2006) to cope with many covariates to the test proposed by He and Zhu (Journal of th…

    Submitted 20 February, 2015; originally announced February 2015.

    Comments: 14 pages, 1 figure, 6 tables

  25. Testing parametric models in linear-directional regression

    Authors: Eduardo García-Portugués, Ingrid Van Keilegom, Rosa M. Crujeiras, Wenceslao González-Manteiga

    Abstract: This paper presents a goodness-of-fit test for parametric regression models with scalar response and directional predictor, that is, a vector on a sphere of arbitrary dimension. The testing procedure is based on the weighted squared distance between a smooth and a parametric regression estimator, where the smooth regression estimator is obtained by a projected local approach. Asymptotic behavior o…

    Submitted 20 September, 2020; v1 submitted 1 September, 2014; originally announced September 2014.

    Comments: 13 pages, 3 figures. Supplementary material: 22 pages, 9 figures, 3 tables

    MSC Class: 62G10; 62H11; 62G08; 62G09

    Journal ref: Scandinavian Journal of Statistics, 43(4):1178-1191, 2016

  26. Central limit theorems for directional and linear random variables with applications

    Authors: Eduardo García-Portugués, Rosa M. Crujeiras, Wenceslao González-Manteiga

    Abstract: A central limit theorem for the integrated squared error of the directional-linear kernel density estimator is established. The result enables the construction and analysis of two testing procedures based on squared loss: a nonparametric independence test for directional and linear random variables and a goodness-of-fit test for parametric families of directional-linear densities. Limit distributi…

    Submitted 20 September, 2020; v1 submitted 27 February, 2014; originally announced February 2014.

    Comments: Paper: 19 pages, 5 figures, 1 table. Supplementary material: 46 pages, 7 figures, 5 tables

    MSC Class: 62G10; 62H11; 62G07; 62G09

    Journal ref: Statistica Sinica, 25(3):1207-1229, 2015

  27. arXiv:1402.0361

    stat.ME stat.CO

    A comparative simulation study of data-driven methods for estimating density level sets

    Authors: Paula Saavedra-Nieves, Wenceslao González-Manteiga, Alberto Rodríguez-Casal

    Abstract: Density level sets are mainly estimated using one of three methodologies: plug-in, excess mass, or a hybrid approach. The plug-in methods are based on replacing the unknown density by some nonparametric estimator, usually the kernel. Thus, the bandwidth selection is a fundamental problem from a practical point of view. Recently, specific selectors for level sets have been proposed. However, if som…

    Submitted 5 March, 2014; v1 submitted 3 February, 2014; originally announced February 2014.

  28. A test for directional-linear independence, with applications to wildfire orientation and size

    Authors: Eduardo García-Portugués, Ana M. G. Barros, Rosa M. Crujeiras, Wenceslao González-Manteiga, J. M. C. Pereira

    Abstract: The relation between wildfire orientation and size is analyzed by means of a nonparametric test for directional-linear independence. The test statistic is designed for assessing the independence between two random variables of different nature, specifically directional (fire orientation, circular or spherical, as particular cases) and linear (fire size measured as burnt area, scalar), based on a d…

    Submitted 20 September, 2020; v1 submitted 11 January, 2013; originally announced January 2013.

    Comments: 19 pages, 4 figures, 3 tables

    MSC Class: 62G10; 62H11; 62G07; 62G09

    Journal ref: Stochastic Environmental Research and Risk Assessment, 28(5):1261-1275, 2014

  29. Kernel density estimation for directional-linear data

    Authors: Eduardo García-Portugués, Rosa M. Crujeiras, Wenceslao González-Manteiga

    Abstract: A nonparametric kernel density estimator for directional-linear data is introduced. The proposal is based on a product kernel accounting for the different nature of both (directional and linear) components of the random vector. Expressions for bias, variance and Mean Integrated Squared Error (MISE) are derived, jointly with an asymptotic normality result for the proposed estimator. For some partic…

    Submitted 20 September, 2020; v1 submitted 11 October, 2012; originally announced October 2012.

    Comments: 34 pages, 4 figures

    MSC Class: 62G07; 62H11

    Journal ref: Journal of Multivariate Analysis, 121:152-175, 2013

  30. arXiv:1210.1072

    stat.ME

    Bootstrap independence test for functional linear models

    Authors: Wenceslao González-Manteiga, Gil González-Rodríguez, Adela Martínez-Calvo, Eduardo García-Portugués

    Abstract: Functional data have been the subject of many research works over the last years. Functional regression is one of the most discussed issues. Specifically, significant advances have been made for functional linear regression models with scalar response. Let $(\mathcal{H},\langle\cdot,\cdot\rangle)$ be a separable Hilbert space. We focus on the model $Y=\langle\Theta,X\rangle+b+\varepsilon$, where $Y$ and $\varepsilon$ are real…

    Submitted 20 September, 2020; v1 submitted 3 October, 2012; originally announced October 2012.

    Comments: 17 pages, 5 tables

    MSC Class: 62G10; 62J05; 62G09

  31. Exploring wind direction and SO2 concentration by circular-linear density estimation

    Authors: Eduardo García-Portugués, Rosa M. Crujeiras, Wenceslao González-Manteiga

    Abstract: The study of environmental problems usually requires the description of variables with different nature and the assessment of relations between them. In this work, an algorithm for flexible estimation of the joint density for a circular-linear variable is proposed. The method is applied for exploring the relation between wind direction and SO2 concentration in a monitoring station close to a power…

    Submitted 20 September, 2020; v1 submitted 23 August, 2012; originally announced August 2012.

    Comments: 17 pages, 7 figures, 2 tables

    MSC Class: 62G07; 62H11

    Journal ref: Stochastic Environmental Research and Risk Assessment, 27(5):1055-1067, 2013

  32. A goodness-of-fit test for the functional linear model with scalar response

    Authors: Eduardo García-Portugués, Wenceslao González-Manteiga, Manuel Febrero-Bande

    Abstract: In this work, a goodness-of-fit test for the null hypothesis of a functional linear model with scalar response is proposed. The test is based on a generalization to the functional framework of a previous one, designed for the goodness-of-fit of regression models with multivariate covariates using random projections. The test statistic is easy to compute using geometrical and matrix arguments, and…

    Submitted 20 September, 2020; v1 submitted 28 May, 2012; originally announced May 2012.

    Comments: Paper: 17 pages, 2 figures, 3 tables. Supplementary material: 8 pages, 6 figures, 10 tables

    MSC Class: 62G10; 62J05; 62G09

    Journal ref: Journal of Computational and Graphical Statistics, 23(3):761-778, 2014