Showing 1–26 of 26 results for author: Lillo, R E

  1. arXiv:2406.19213

    stat.ME stat.AP

    Comparing Lasso and Adaptive Lasso in High-Dimensional Data: A Genetic Survival Analysis in Triple-Negative Breast Cancer

    Authors: Pilar González-Barquero, Rosa E. Lillo, Álvaro Méndez-Civieta

    Abstract: This study aims to evaluate the performance of Cox regression with lasso penalty and adaptive lasso penalty in high-dimensional settings. Variable selection methods are necessary in this context to reduce dimensionality and make the problem feasible. Several weight calculation procedures for adaptive lasso are proposed to determine if they offer an improvement over lasso, as adaptive lasso address…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 39 pages, 2 figures, 8 tables

  2. arXiv:2406.01588

    cs.LG cs.AI stat.ML

    nn2poly: An R Package for Converting Neural Networks into Interpretable Polynomials

    Authors: Pablo Morala, Jenny Alexandra Cifuentes, Rosa E. Lillo, Iñaki Ucar

    Abstract: The nn2poly package provides the implementation in R of the NN2Poly method to explain and interpret feed-forward neural networks by means of polynomial representations that predict in an equivalent manner as the original network. Through the obtained polynomial coefficients, the effect and importance of each variable and their interactions on the output can be represented. This capability of captur…

    Submitted 3 June, 2024; originally announced June 2024.

  3. A bivariate two-state Markov modulated Poisson process for failure modelling

    Authors: Yoel G. Yera, Rosa E. Lillo, Bo F. Nielsen, Pepa Ramírez-Cobo, Fabrizio Ruggeri

    Abstract: Motivated by a real failure dataset in a two-dimensional context, this paper presents an extension of the Markov modulated Poisson process (MMPP) to two dimensions. The one-dimensional MMPP has been proposed for the modeling of dependent and non-exponential inter-failure times (in contexts such as queuing, risk or reliability, among others). The novel two-dimensional MMPP allows for dependence between…

    Submitted 26 January, 2024; originally announced January 2024.

    Journal ref: Reliability Engineering and System Safety 208(2021) 107318

  4. Fitting procedure for the two-state Batch Markov modulated Poisson process

    Authors: Yoel G. Yera, Rosa E. Lillo, Pepa Ramírez-Cobo

    Abstract: The Batch Markov Modulated Poisson Process (BMMPP) is a subclass of the versatile Batch Markovian Arrival process (BMAP) which has been proposed for the modeling of dependent events occurring in batches (such as group arrivals, failures or risk events). This paper focuses on exploring the possibilities of the BMMPP for the modeling of real phenomena involving point processes with group arrivals. The fi…

    Submitted 25 January, 2024; originally announced January 2024.

    Journal ref: European Journal of Operational Research (2019)

  5. arXiv:2401.14553

    q-fin.RM stat.AP

    Analysis of an aggregate loss model in a Markov renewal regime

    Authors: Pepa Ramírez-Cobo, Emilio Carrizosa, Rosa Elvira Lillo

    Abstract: In this article we consider an aggregate loss model with dependent losses. The loss occurrence process is governed by a two-state Markovian arrival process (MAP2), a Markov renewal process that allows for (1) correlated inter-loss times, (2) non-exponentially distributed inter-loss times, and (3) overdispersed loss counts. Some quantities of interest to measure persistence in the lo…

    Submitted 4 February, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Journal ref: Applied Mathematics and Computation (2021)

  6. arXiv:2307.16720

    stat.ME stat.AP

    The epigraph and the hypograph indexes as useful tools for clustering multivariate functional data

    Authors: Belén Pulido, Alba M. Franco-Pereira, Rosa E. Lillo

    Abstract: The proliferation of data generation has spurred advancements in functional data analysis. With the ability to analyze multiple variables simultaneously, the demand for working with multivariate functional data has increased. This study proposes a novel formulation of the epigraph and hypograph indexes, as well as their generalized expressions, specifically tailored for the multivariate functional…

    Submitted 17 October, 2023; v1 submitted 31 July, 2023; originally announced July 2023.

    Comments: 32 pages

  7. arXiv:2307.06643

    cs.SI physics.soc-ph

    Nowcasting Temporal Trends Using Indirect Surveys

    Authors: Ajitesh Srivastava, Juan Marcos Ramírez, Sergio Díaz-Aranda, Jose Aguilar, Antonio Ortega, Antonio Fernández Anta, Rosa Elvira Lillo

    Abstract: Indirect surveys, in which respondents provide information about other people they know, have been proposed for estimating (nowcasting) the size of a hidden population where privacy is important or the hidden population is hard to reach. Examples include estimating casualties in an earthquake, conditions among female sex workers, and the prevalence of drug use and infectious diseases. The N…

    Submitted 14 December, 2023; v1 submitted 13 July, 2023; originally announced July 2023.

    Comments: Accepted at AAAI 2024

    ACM Class: G.3

  8. arXiv:2207.12803

    stat.ME stat.AP stat.CO

    Multivariate Functional Outlier Detection using the FastMUOD Indices

    Authors: Oluwasegun Taiwo Ojo, Antonio Fernández Anta, Marc G. Genton, Rosa E. Lillo

    Abstract: We present definitions and properties of the fast massive unsupervised outlier detection (FastMUOD) indices, used for outlier detection (OD) in functional data. FastMUOD detects outliers by computing, for each curve, an amplitude, magnitude and shape index meant to target the corresponding types of outliers. Some methods adapting FastMUOD to outlier detection in multivariate functional data are th…

    Submitted 26 July, 2022; originally announced July 2022.

  9. NN2Poly: A polynomial representation for deep feed-forward artificial neural networks

    Authors: Pablo Morala, Jenny Alexandra Cifuentes, Rosa E. Lillo, Iñaki Ucar

    Abstract: Interpretability of neural networks and their underlying theoretical behavior remain an open field of study even after the great success of their practical applications, particularly with the emergence of deep learning. In this work, NN2Poly is proposed: a theoretical approach to obtain an explicit polynomial model that provides an accurate representation of an already trained fully-connected feed…

    Submitted 25 September, 2023; v1 submitted 21 December, 2021; originally announced December 2021.

    Journal ref: IEEE Transactions on Neural Networks and Learning Systems (2023, Early Access)

  10. arXiv:2111.00472

    stat.CO

    Asgl: A Python Package for Penalized Linear and Quantile Regression

    Authors: Álvaro Méndez Civieta, M. Carmen Aguilera-Morillo, Rosa E. Lillo

    Abstract: Asgl is a Python package that solves penalized linear regression and quantile regression models for simultaneous variable selection and prediction, in both high- and low-dimensional frameworks. It makes it very easy to set up and solve different types of lasso-based penalizations, most notably the asgl (adaptive sparse group lasso, which gives the package its name). This package is built on to…

    Submitted 31 October, 2021; originally announced November 2021.

    Comments: 31 pages, 1 figure, 1 table

  11. arXiv:2110.07998

    stat.ME stat.CO

    Fast Partial Quantile Regression

    Authors: Alvaro Mendez Civieta, M. Carmen Aguilera-Morillo, Rosa E. Lillo

    Abstract: Partial least squares (PLS) is a dimensionality reduction technique used as an alternative to ordinary least squares (OLS) in situations where the data is colinear or high dimensional. Both PLS and OLS provide mean based estimates, which are extremely sensitive to the presence of outliers or heavy tailed distributions. In contrast, quantile regression is an alternative to OLS that computes robust…

    Submitted 15 October, 2021; originally announced October 2021.

    Comments: 22 pages, 5 figures and 5 tables

    MSC Class: 62-08; 62Hxx; 62Jxx ACM Class: G.3

  12. Functional clustering via multivariate clustering

    Authors: Belén Pulido, Alba María Franco-Pereira, Rosa Elvira Lillo

    Abstract: Clustering techniques applied to multivariate data are a very useful tool in Statistics and have been fully studied in the literature. Nevertheless, these clustering methodologies are less well known when dealing with functional data. Our proposal consists of introducing a clustering procedure for functional data using the very well known techniques for clustering multivariate data. The idea is to…

    Submitted 31 July, 2021; originally announced August 2021.

  13. arXiv:2105.05213

    stat.CO stat.ME

    Outlier Detection for Functional Data with R Package fdaoutlier

    Authors: Oluwasegun Ojo, Rosa E. Lillo, Antonio Fernández Anta

    Abstract: Outlier detection is one of the standard exploratory analysis tasks in functional data analysis. We present the R package fdaoutlier which contains implementations of some of the latest techniques for detecting functional outliers. The package makes it easy to detect different types of outliers (magnitude, shape, and amplitude) in functional data, and some of the implemented methods can be applied…

    Submitted 14 October, 2021; v1 submitted 11 May, 2021; originally announced May 2021.

  14. Towards a mathematical framework to inform Neural Network modelling via Polynomial Regression

    Authors: Pablo Morala, Jenny Alexandra Cifuentes, Rosa E. Lillo, Iñaki Ucar

    Abstract: Even though neural networks are widely used in a large number of applications, they are still considered black boxes and present some difficulties for dimensioning or evaluating their prediction error. This has led to an increasing interest in the overlapping area between neural networks and more traditional statistical methods, which can help overcome those problems. In this article, a mathemati…

    Submitted 7 February, 2021; originally announced February 2021.

    Comments: 39 pages, 15 figures

    Journal ref: Neural Networks 142 (2021), 57-72

  15. arXiv:2009.06357

    eess.IV cs.CV

    Automatic elimination of the pectoral muscle in mammograms based on anatomical features

    Authors: Jairo A. Ayala-Godoy, Rosa E. Lillo, Juan Romo

    Abstract: Digital mammogram inspection is the most popular technique for early detection of abnormalities in human breast tissue. When mammograms are analyzed through a computational method, the presence of the pectoral muscle might affect the results of breast lesion detection. This problem is particularly evident in the mediolateral oblique view (MLO), where the pectoral muscle occupies a large part of the m…

    Submitted 17 August, 2020; originally announced September 2020.

    Journal ref: International Journal of Computer Science Issues; 2020

  16. Detecting and Classifying Outliers in Big Functional Data

    Authors: Oluwasegun Taiwo Ojo, Antonio Fernández Anta, Rosa E. Lillo, Carlo Sguera

    Abstract: We propose two new outlier detection methods for identifying and classifying different types of outliers in (big) functional data sets. The proposed methods are based on an existing method called Massive Unsupervised Outlier Detection (MUOD). MUOD detects and classifies outliers by computing, for each curve, three indices, all based on the concept of linear regression and correlation, which measur…

    Submitted 14 October, 2021; v1 submitted 16 December, 2019; originally announced December 2019.

    MSC Class: 62R10 (Functional data analysis)

  17. arXiv:1911.01081

    stat.ME stat.AP

    Quantile regression: a penalization approach

    Authors: Álvaro Méndez Civieta, M. Carmen Aguilera-Morillo, Rosa E. Lillo

    Abstract: Sparse group LASSO (SGL) is a penalization technique used in regression problems where the covariates have a natural grouped structure, and it provides solutions that are sparse both between and within groups. In this paper the SGL is introduced to the quantile regression (QR) framework, and a more flexible version, the adaptive sparse group LASSO (ASGL), is proposed. This proposal adds weights to the…

    Submitted 4 November, 2019; originally announced November 2019.

    Comments: 9 figures, 5 tables

  18. Robust regression based on shrinkage estimators

    Authors: Elisa Cabana, Rosa E. Lillo, Henry Laniado

    Abstract: A robust estimator is proposed for the parameters that characterize the linear regression problem. It is based on the notion of shrinkages, often used in Finance and previously studied for outlier detection in multivariate data. A thorough simulation study is conducted to investigate: the efficiency with normal and heavy-tailed errors, the robustness under contamination, the computational times, t…

    Submitted 8 May, 2019; originally announced May 2019.

  19. Multivariate outlier detection based on a robust Mahalanobis distance with shrinkage estimators

    Authors: Elisa Cabana, Rosa E. Lillo, Henry Laniado

    Abstract: A collection of robust Mahalanobis distances for multivariate outlier detection is proposed, based on the notion of shrinkage. Robust intensity and scaling factors are optimally estimated to define the shrinkage. Some properties are investigated, such as affine equivariance and breakdown value. The performance of the proposal is illustrated through the comparison to other techniques from the liter…

    Submitted 4 April, 2019; originally announced April 2019.

    Journal ref: Stat Papers (2019)

  20. arXiv:1610.08386

    stat.AP math.ST

    On the estimation of extreme directional multivariate quantiles

    Authors: Raúl Torres, Elena Di Bernardino, Henry Laniado, Rosa E. Lillo

    Abstract: In multivariate extreme value theory (MEVT), the focus is on analysis outside of the observable sampling zone, which implies that the region of interest is associated with high risk levels. This work provides tools to include directional notions in the MEVT, giving the opportunity to characterize the recently introduced directional multivariate quantiles (DMQ) at high levels. Then, an out-sample e…

    Submitted 4 December, 2018; v1 submitted 26 October, 2016; originally announced October 2016.

  21. arXiv:1607.05042

    stat.ME

    An empirical comparison of global and local functional depths

    Authors: Carlo Sguera, Rosa E. Lillo

    Abstract: A functional data depth provides a center-outward ordering criterion which allows the definition of measures such as median, trimmed means, central regions or ranks in a functional framework. A functional data depth can be global or local. With global depths, the degree of centrality of a curve $x$ depends equally on the rest of the sample observations, while with local depths, the contribution of…

    Submitted 5 July, 2018; v1 submitted 18 July, 2016; originally announced July 2016.

  22. Directional Multivariate Extremes in Environmental Phenomena

    Authors: Raúl Torres, Carlo De Michele, Henry Laniado, Rosa E. Lillo

    Abstract: Several environmental phenomena can be described by different correlated variables that must be considered jointly in order to be more representative of the nature of these phenomena. For such events, identification of extremes is inappropriate if it is based on marginal analysis. Extremes have usually been linked to the notion of quantile, which is an important tool to analyze risk in the univari…

    Submitted 10 June, 2016; v1 submitted 6 June, 2016; originally announced June 2016.

    Comments: Article with supplementary material in the appendix

    Journal ref: Environmetrics, Volume 28, Issue 2 March 2017 e2428

  23. A Directional Multivariate Value at Risk

    Authors: Raúl Torres, Rosa E. Lillo, Henry Laniado

    Abstract: In economics, insurance and finance, value at risk (VaR) is a widely used measure of the risk of loss on a specific portfolio of financial assets. For a given portfolio, time horizon, and probability $α$, the $100α\%$ VaR is defined as a threshold loss value, such that the probability that the loss on the portfolio over the given time horizon exceeds this value is $α$. That is to say, it is a quan…

    Submitted 3 February, 2015; originally announced February 2015.

    Comments: 30 pages, 9 figures

    Journal ref: Insurance: Mathematics and Economics, Volume 65, November 2015, Pages 111-123

  24. arXiv:1409.1816

    stat.ME

    Extremality measures and a rank test for functional data

    Authors: A. M. Franco-Pereira, R. E. Lillo, J. Romo

    Abstract: The statistical analysis of functional data is a growing need in many research areas. In particular, a robust methodology is important to study curves, which are the output of experiments in applied statistics. In this paper we study some new definitions which reflect the "extremality" of a curve with respect to a collection of functions, and provide natural orderings for sample curves. Their fini…

    Submitted 4 September, 2014; originally announced September 2014.

    Comments: 20 pages, 11 figures

  25. arXiv:1304.4786

    math.ST stat.CO stat.ME stat.ML

    The Mahalanobis distance for functional data with applications to classification

    Authors: Esdras Joseph, Pedro Galeano, Rosa E. Lillo

    Abstract: This paper presents a general notion of Mahalanobis distance for functional data that extends the classical multivariate concept to situations where the observed data are points belonging to curves generated by a stochastic process. More precisely, a new semi-distance for functional observations that generalizes the usual Mahalanobis distance for multivariate datasets is introduced. For that, the d…

    Submitted 17 April, 2013; originally announced April 2013.

  26. Bayesian inference for double Pareto lognormal queues

    Authors: Pepa Ramirez-Cobo, Rosa E. Lillo, Simon Wilson, Michael P. Wiper

    Abstract: In this article we describe a method for carrying out Bayesian estimation for the double Pareto lognormal (dPlN) distribution which has been proposed as a model for heavy-tailed phenomena. We apply our approach to estimate the $\mathit{dPlN}/M/1$ and $M/\mathit{dPlN}/1$ queueing systems. These systems cannot be analyzed using standard techniques due to the fact that the dPlN distribution does not…

    Submitted 15 November, 2010; originally announced November 2010.

    Comments: Published in at http://dx.doi.org/10.1214/10-AOAS336 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS336

    Journal ref: Annals of Applied Statistics 2010, Vol. 4, No. 3, 1533-1557