Adaptive Lasso, Transfer Lasso, and Beyond: An Asymptotic Perspective

Authors: Masaaki Takada, Hironori Fujisawa

Abstract: This paper presents a comprehensive exploration of the theoretical properties inherent in the Adaptive Lasso and the Transfer Lasso. The Adaptive Lasso, a well-established method, employs regularization divided by initial estimators and is characterized by asymptotic normality and variable selection consistency. In contrast, the recently proposed Transfer Lasso employs regularization subtracted by initial estimators with the demonstrated capacity to curtail non-asymptotic estimation errors. A pivotal question thus emerges: Given the distinct ways the Adaptive Lasso and the Transfer Lasso employ initial estimators, what benefits or drawbacks does this disparity confer upon each method? This paper conducts a theoretical examination of the asymptotic properties of the Transfer Lasso, thereby elucidating its differentiation from the Adaptive Lasso. Informed by the findings of this analysis, we introduce a novel method, one that amalgamates the strengths and compensates for the weaknesses of both methods. The paper concludes with validations of our theory and comparisons of the methods via simulation experiments.

Submitted 17 April, 2024; v1 submitted 30 August, 2023; originally announced August 2023.
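
For orientation, the two ways of using an initial estimator that this abstract contrasts can be written side by side. This is a sketch in notation assumed here rather than taken from the listing: $\tilde\beta$ denotes the initial estimator, $\lambda_n$ the regularization level, $\gamma > 0$ the adaptive-weight exponent, and $\alpha \in [0,1]$ a mixing weight between the ordinary $\ell_1$ term and the difference term. The paper's own scaling and symbols may differ.

```latex
% Adaptive Lasso: the penalty on each coefficient is divided by the initial estimate.
\hat\beta^{\mathrm{Adaptive}} \in \arg\min_{\beta}\;
  \frac{1}{2n}\,\lVert y - X\beta \rVert_2^2
  + \lambda_n \sum_{j=1}^{p} \frac{\lvert \beta_j \rvert}{\lvert \tilde\beta_j \rvert^{\gamma}}

% Transfer Lasso: the initial estimate is subtracted inside the penalty.
\hat\beta^{\mathrm{Transfer}} \in \arg\min_{\beta}\;
  \frac{1}{2n}\,\lVert y - X\beta \rVert_2^2
  + \lambda_n \Bigl( \alpha \lVert \beta \rVert_1
  + (1-\alpha)\, \lVert \beta - \tilde\beta \rVert_1 \Bigr)
```

In the first form a small initial estimate inflates the penalty on that coordinate; in the second, the estimate is pulled toward $\tilde\beta$ itself as well as toward zero.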

arXiv:2208.11592 [pdf, ps, other]

Outlier Robust and Sparse Estimation of Linear Regression Coefficients

Authors: Takeyuki Sasai, Hironori Fujisawa

Abstract: We consider outlier-robust and sparse estimation of linear regression coefficients, when the covariates and the noises are contaminated by adversarial outliers and noises are sampled from a heavy-tailed distribution. Our results present sharper error bounds under weaker assumptions than prior studies that share similar interests with this study. Our analysis relies on some sharp concentration inequalities resulting from generic chaining.

Submitted 24 May, 2024; v1 submitted 24 August, 2022; originally announced August 2022.

MSC Class: 62J07; 62F35

arXiv:2106.13946 [pdf, other]

Outlier-Resistant Estimators for Average Treatment Effect in Causal Inference

Authors: Kazuharu Harada, Hironori Fujisawa

Abstract: The inverse probability weighting (IPW) and doubly robust (DR) estimators are often used to estimate the average treatment effect (ATE), but are vulnerable to outliers. The IPW/DR median can be used for outlier-resistant estimation of the ATE, but the outlier resistance of the median is limited and it is not resistant enough for heavy contamination. We propose extensions of the IPW/DR estimators with density power weighting, which can eliminate the influence of outliers almost completely. The outlier resistance of the proposed estimators is evaluated through the unbiasedness of the estimating equations. Unlike the median-based methods, our estimators are resistant to outliers even under heavy contamination. Interestingly, the naive extension of the DR estimator requires bias correction to keep the double robustness even under the most tractable form of contamination. In addition, the proposed estimators are found to be highly resistant to outliers in more difficult settings where the contamination ratio depends on the covariates. The outlier resistance of our estimators from the viewpoint of the influence function is also favorable. Our theoretical results are verified via Monte Carlo simulations and real data analysis. The proposed methods were found to have more outlier resistance than the median-based methods and estimated the potential mean with a smaller error than the median-based methods.

Submitted 15 April, 2022; v1 submitted 26 June, 2021; originally announced June 2021.

arXiv:2102.11120 [pdf, ps, other]

Adversarial robust weighted Huber regression

Authors: Takeyuki Sasai, Hironori Fujisawa

Abstract: We consider a robust estimation of linear regression coefficients. In this note, we focus on the case where the covariates are sampled from an $L$-subGaussian distribution with unknown covariance, the noises are sampled from a distribution with a bounded absolute moment and both covariates and noises may be contaminated by an adversary. We derive an estimation error bound, which depends on the stable rank and the condition number of the covariance matrix of covariates with a polynomial computational complexity of estimation.

Submitted 24 May, 2024; v1 submitted 22 February, 2021; originally announced February 2021.

Comments: The case of sparse coefficients is investigated in arXiv:2208.11592. This manuscript will not be submitted for publication.

MSC Class: 62G35; 62G05

arXiv:2010.13018 [pdf, ps, other]

Adversarial Robust Low Rank Matrix Estimation: Compressed Sensing and Matrix Completion

Authors: Takeyuki Sasai, Hironori Fujisawa

Abstract: We consider robust low rank matrix estimation as a trace regression when outputs are contaminated by adversaries. The adversaries are allowed to add arbitrary values to arbitrary outputs. Such values can depend on any samples. We deal with matrix compressed sensing, including lasso as a partial problem, and matrix completion, and then we obtain sharp estimation error bounds. To obtain the error bounds for different models such as matrix compressed sensing and matrix completion, we propose a simple unified approach based on a combination of the Huber loss function and the nuclear norm penalization, which is a different approach from the conventional ones. Some error bounds obtained in the present paper are sharper than the past ones.

Submitted 24 May, 2024; v1 submitted 24 October, 2020; originally announced October 2020.

Comments: The lasso part of this manuscript with contaminated input as well as output is investigated in arXiv:2208.11592. This manuscript will not be submitted for publication.

MSC Class: 62G35; 62G05

arXiv:2009.03077 [pdf, other]

Estimation of Structural Causal Model via Sparsely Mixing Independent Component Analysis

Authors: Kazuharu Harada, Hironori Fujisawa

Abstract: We consider the problem of inferring the causal structure from observational data, especially when the structure is sparse. This type of problem is usually formulated as an inference of a directed acyclic graph (DAG) model. The linear non-Gaussian acyclic model (LiNGAM) is one of the most successful DAG models, and various estimation methods have been developed. However, existing methods are not efficient for two reasons: (i) the sparse structure is not always incorporated in causal order estimation, and (ii) the whole information of the data is not used in parameter estimation. To address these issues, we propose a new estimation method for a linear DAG model with non-Gaussian noises. The proposed method is based on the log-likelihood of independent component analysis (ICA) with two penalty terms related to the sparsity and the consistency condition. The proposed method enables us to estimate the causal order and the parameters simultaneously. For stable and efficient optimization, we propose some devices, such as a modified natural gradient. Numerical experiments show that the proposed method outperforms existing methods, including LiNGAM and NOTEARS.

Submitted 7 September, 2020; originally announced September 2020.

Comments: 9 pages, 6 figures

arXiv:2006.14845 [pdf, other]

Transfer Learning via $\ell_1$ Regularization

Authors: Masaaki Takada, Hironori Fujisawa

Abstract: Machine learning algorithms typically require abundant data under a stationary environment. However, environments are nonstationary in many real-world applications. Critical issues lie in how to effectively adapt models under an ever-changing environment. We propose a method for transferring knowledge from a source domain to a target domain via $\ell_1$ regularization. We incorporate $\ell_1$ regularization of differences between source parameters and target parameters, in addition to an ordinary $\ell_1$ regularization. Hence, our method yields sparsity for both the estimates themselves and changes of the estimates. The proposed method has a tight estimation error bound under a stationary environment, and the estimate remains unchanged from the source estimate under small residuals. Moreover, the estimate is consistent with the underlying function, even when the source estimate is mistaken due to nonstationarity. Empirical results demonstrate that the proposed method effectively balances stability and plasticity.

Submitted 26 June, 2020; originally announced June 2020.

arXiv:2004.05990 [pdf, ps, other]

Robust estimation with Lasso when outputs are adversarially contaminated

Authors: Takeyuki Sasai, Hironori Fujisawa

Abstract: We consider robust estimation when outputs are adversarially contaminated. Nguyen and Tran (2012) proposed an extended Lasso for robust parameter estimation and then they showed the convergence rate of the estimation error. Recently, Dalalyan and Thompson (2019) gave some useful inequalities and then they showed a faster convergence rate than Nguyen and Tran (2012). They focused on the fact that the minimization problem of the extended Lasso can become that of the penalized Huber loss function with $L_1$ penalty. The distinguishing point is that the Huber loss function includes an extra tuning parameter, which is different from the conventional method. We give the proof, which is different from Dalalyan and Thompson (2019) and then we give the same convergence rate as Dalalyan and Thompson (2019). The significance of our proof is to use some specific properties of the Huber function. Such techniques have not been used in the past proofs.

Submitted 24 May, 2024; v1 submitted 13 April, 2020; originally announced April 2020.

Comments: The case of contaminated inputs as well as outputs is investigated in arXiv:2208.11592. This manuscript will not be submitted for publication.
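
The link between the extended Lasso and the Huber loss mentioned in this abstract can be checked numerically for a single residual. The sketch below is an illustration under assumptions made here, not the authors' code: the function names and the threshold `lam` are hypothetical, and the scaling is the simplest one, in which a per-observation outlier parameter `t` is penalized by `lam * |t|`. Minimizing `0.5 * (r - t)**2 + lam * abs(t)` over `t` is solved by soft-thresholding, and the minimized value equals the Huber loss of `r` with threshold `lam`.

```python
def soft_threshold(r, lam):
    """Minimizer over t of 0.5 * (r - t)**2 + lam * abs(t)."""
    if r > lam:
        return r - lam
    if r < -lam:
        return r + lam
    return 0.0


def huber(r, lam):
    """Huber loss: quadratic for |r| <= lam, linear beyond."""
    if abs(r) <= lam:
        return 0.5 * r * r
    return lam * abs(r) - 0.5 * lam * lam


def profiled_loss(r, lam):
    """Squared loss plus l1 penalty, evaluated at the optimal outlier term."""
    t = soft_threshold(r, lam)
    return 0.5 * (r - t) ** 2 + lam * abs(t)


if __name__ == "__main__":
    lam = 1.5
    for r in (-4.0, -1.0, 0.0, 0.7, 3.2):
        # The last two columns agree for every residual r.
        print(r, profiled_loss(r, lam), huber(r, lam))
```

The threshold `lam` plays the role of the extra tuning parameter that the abstract attributes to the Huber loss: it is the penalty level on the outlier term.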

arXiv:1811.00255 [pdf, other]

HMLasso: Lasso with High Missing Rate

Authors: Masaaki Takada, Hironori Fujisawa, Takeichiro Nishikawa

Abstract: Sparse regression such as the Lasso has achieved great success in handling high-dimensional data. However, one of the biggest practical problems is that high-dimensional data often contain large amounts of missing values. Convex Conditioned Lasso (CoCoLasso) has been proposed for dealing with high-dimensional data with missing values, but it performs poorly when there are many missing values, so that the high missing rate problem has not been resolved. In this paper, we propose a novel Lasso-type regression method for high-dimensional data with high missing rates. We effectively incorporate mean imputed covariance, overcoming its inherent estimation bias. The result is an optimally weighted modification of CoCoLasso according to missing ratios. We theoretically and experimentally show that our proposed method is highly effective even when there are many missing values.

Submitted 19 June, 2019; v1 submitted 1 November, 2018; originally announced November 2018.

arXiv:1805.07960

Stochastic Gradient Descent for Stochastic Doubly-Nonconvex Composite Optimization

Authors: Takayuki Kawashima, Hironori Fujisawa

Abstract: Stochastic gradient descent has been widely used for solving composite optimization problems in big data analyses. Many algorithms and convergence properties have been developed. The composite functions were at first primarily convex, and nonconvex composite functions have gradually been adopted to obtain more desirable properties. The convergence properties have been investigated, but only when one of the composite functions is nonconvex. There is no convergence property when both composite functions are nonconvex, which is named the doubly-nonconvex case. To overcome this difficulty, we assume a simple and weak condition that the penalty function is quasiconvex and then we obtain convergence properties for the stochastic doubly-nonconvex composite optimization problem. The convergence rate obtained here is of the same order as the existing work. We deeply analyze the convergence rate with the constant step size and mini-batch size and give the optimal convergence rate with appropriate sizes, which is superior to the existing work. Experimental results illustrate that our method is superior to existing methods.

Submitted 1 March, 2020; v1 submitted 21 May, 2018; originally announced May 2018.

Comments: There is a mistake in the proof of Proposition 3.2. related to the Euclidean projection with stochastic gradients

arXiv:1805.06144 [pdf, ps, other]

On Difference Between Two Types of $γ$-divergence for Regression

Authors: Takayuki Kawashima, Hironori Fujisawa

Abstract: The $γ$-divergence is well-known for having strong robustness against heavy contamination. By virtue of this property, many applications via the $γ$-divergence have been proposed. There are two types of $γ$-divergence for the regression problem, in which the treatments of the base measure are different. In this paper, we compare them and point out a distinct difference between these two divergences under heterogeneous contamination where the outlier ratio depends on the explanatory variable. One divergence has the strong robustness under heterogeneous contamination. The other does not in general, but does when the parametric model of the response variable belongs to a location-scale family in which the scale does not depend on the explanatory variables or under homogeneous contamination where the outlier ratio does not depend on the explanatory variable. Hung et al. (2017) discussed the strong robustness in a logistic regression model with an additional assumption that the tuning parameter $γ$ is sufficiently large. The results obtained in this paper hold for any parametric model without such an additional assumption.

Submitted 16 May, 2018; originally announced May 2018.

arXiv:1802.05475 [pdf, ps, other]

Robust and sparse Gaussian graphical modeling under cell-wise contamination

Authors: Shota Katayama, Hironori Fujisawa, Mathias Drton

Abstract: Graphical modeling explores dependences among a collection of variables by inferring a graph that encodes pairwise conditional independences. For jointly Gaussian variables, this translates into detecting the support of the precision matrix. Many modern applications feature high-dimensional and contaminated data that complicate this task. In particular, traditional robust methods that down-weight entire observation vectors are often inappropriate as high-dimensional data may feature partial contamination in many observations. We tackle this problem by giving a robust method for sparse precision matrix estimation based on the $γ$-divergence under a cell-wise contamination model. Simulation studies demonstrate that our procedure outperforms existing methods especially for highly contaminated data.

Submitted 15 February, 2018; originally announced February 2018.

arXiv:1802.03127 [pdf, ps, other]

Robust and Sparse Regression in GLM by Stochastic Optimization

Authors: Takayuki Kawashima, Hironori Fujisawa

Abstract: The generalized linear model (GLM) plays a key role in regression analyses. In high-dimensional data, the sparse GLM has been used but it is not robust against outliers. Recently, robust methods have been proposed for specific examples of the sparse GLM. Among them, we focus on the robust and sparse linear regression based on the $γ$-divergence. The estimator of the $γ$-divergence has strong robustness under heavy contamination. In this paper, we extend the robust and sparse linear regression based on the $γ$-divergence to the robust and sparse GLM based on the $γ$-divergence with a stochastic optimization approach in order to obtain the estimate. We adopt the randomized stochastic projected gradient descent as a stochastic optimization approach and extend the established convergence property to the classical first-order necessary condition. By virtue of the stochastic optimization approach, we can efficiently estimate parameters for very large problems. Particularly, we show the linear regression, logistic regression and Poisson regression with $L_1$ regularization in detail as specific examples of robust and sparse GLM. In numerical experiments and real data analysis, the proposed method outperformed comparative methods.

Submitted 8 February, 2018; originally announced February 2018.

Comments: 28 pages

arXiv:1711.01796 [pdf, ps, other]

Independently Interpretable Lasso: A New Regularizer for Sparse Regression with Uncorrelated Variables

Authors: Masaaki Takada, Taiji Suzuki, Hironori Fujisawa

Abstract: Sparse regularization such as $\ell_1$ regularization is a powerful and widely used strategy for high dimensional learning problems. The effectiveness of sparse regularization has been supported practically and theoretically by several studies. However, one of the biggest issues in sparse regularization is that its performance is quite sensitive to correlations between features. Ordinary $\ell_1$ regularization can select variables correlated with each other, which results in deterioration of not only its generalization error but also interpretability. In this paper, we propose a new regularization method, "Independently Interpretable Lasso" (IILasso). Our proposed regularizer suppresses selecting correlated variables, and thus each active variable independently affects the objective variable in the model. Hence, we can interpret regression coefficients intuitively and also improve the performance by avoiding overfitting. We analyze the theoretical properties of IILasso and show that the proposed method is highly advantageous for sign recovery and achieves an almost minimax optimal convergence rate. Synthetic and real data analyses also indicate the effectiveness of IILasso.

Submitted 22 February, 2018; v1 submitted 6 November, 2017; originally announced November 2017.

arXiv:1609.08886 [pdf, ps, other]

doi 10.1016/j.csda.2018.03.008

Sparse principal component regression for generalized linear models

Authors: Shuichi Kawano, Hironori Fujisawa, Toyoyuki Takada, Toshihiko Shiroishi

Abstract: Principal component regression (PCR) is a widely used two-stage procedure: principal component analysis (PCA), followed by regression in which the selected principal components are regarded as new explanatory variables in the model. Note that PCA is based only on the explanatory variables, so the principal components are not selected using the information on the response variable. In this paper, we propose a one-stage procedure for PCR in the framework of generalized linear models. The basic loss function is based on a combination of the regression loss and PCA loss. An estimate of the regression parameter is obtained as the minimizer of the basic loss function with a sparse penalty. We call the proposed method sparse principal component regression for generalized linear models (SPCR-glm). Taking the two loss functions into consideration simultaneously, SPCR-glm enables us to obtain sparse principal component loadings that are related to a response variable. A combination of loss functions may cause a parameter identification problem, but this potential problem is avoided by virtue of the sparse penalty. Thus, the sparse penalty plays two roles in this method. The parameter estimation procedure is proposed using various update algorithms with the coordinate descent algorithm. We apply SPCR-glm to two real datasets, doctor visits data and mouse consomic strain data. SPCR-glm provides more easily interpretable principal component (PC) scores and clearer classification on PC plots than the usual PCA.

Submitted 12 October, 2016; v1 submitted 28 September, 2016; originally announced September 2016.

Comments: 29 pages

Journal ref: Computational Statistics & Data Analysis 124 (2018) 180-196

arXiv:1604.06637 [pdf, ps, other]

doi 10.3390/e19110608

Robust and Sparse Regression via $γ$-divergence

Authors: Takayuki Kawashima, Hironori Fujisawa

Abstract: In high-dimensional data, many sparse regression methods have been proposed. However, they may not be robust against outliers. Recently, the use of density power weight has been studied for robust parameter estimation and the corresponding divergences have been discussed. One such divergence is the $γ$-divergence, and the robust estimator using the $γ$-divergence is known for having strong robustness. In this paper, we consider the robust and sparse regression based on the $γ$-divergence. We extend the $γ$-divergence to the regression problem and show that it has strong robustness under heavy contamination even when outliers are heterogeneous. The loss function is constructed by an empirical estimate of the $γ$-divergence with sparse regularization and the parameter estimate is defined as the minimizer of the loss function. To obtain the robust and sparse estimate, we propose an efficient update algorithm which has a monotone decreasing property of the loss function. Particularly, we discuss a linear regression problem with $L_1$ regularization in detail. In numerical experiments and real data analyses, we see that the proposed method outperforms past robust and sparse methods.

Submitted 29 August, 2016; v1 submitted 22 April, 2016; originally announced April 2016.

Comments: 25 pages
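
To make the idea of density power weighting concrete, the sketch below estimates a location parameter by a fixed-point iteration in which each observation is weighted by the model density raised to the power `gamma`. It is a deliberately simplified illustration under assumptions made here (Gaussian model, known scale `sigma`, no regression and no sparse penalty), not the update algorithm of the paper; all function names and default values are hypothetical.

```python
import math
import random
import statistics


def gamma_location(xs, gamma=0.5, sigma=1.0, n_iter=100, tol=1e-10):
    """Location estimate with weights proportional to N(x; mu, sigma^2)**gamma.

    For a Gaussian model with known sigma, the stationarity condition of the
    empirical gamma-cross-entropy in mu is a weighted-mean equation, which is
    iterated here starting from the sample median.
    """
    mu = statistics.median(xs)
    for _ in range(n_iter):
        # Points far from the current mu receive exponentially small weight.
        weights = [math.exp(-gamma * (x - mu) ** 2 / (2.0 * sigma ** 2)) for x in xs]
        new_mu = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
        converged = abs(new_mu - mu) < tol
        mu = new_mu
        if converged:
            break
    return mu


def make_contaminated_sample(n_clean=90, n_outliers=10, outlier_value=10.0, seed=0):
    """Standard normal inliers plus a cluster of identical outliers."""
    rng = random.Random(seed)
    clean = [rng.gauss(0.0, 1.0) for _ in range(n_clean)]
    return clean + [outlier_value] * n_outliers


if __name__ == "__main__":
    xs = make_contaminated_sample()
    print("sample mean:     ", statistics.mean(xs))
    print("weighted estimate:", gamma_location(xs))
```

With 10% of the sample placed at 10, the sample mean is dragged toward the outliers, whereas the weighted estimate stays near the center of the inliers because the outliers' weights are negligible.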

arXiv:1508.05571 [pdf, other]

Robust sparse Gaussian graphical modeling

Authors: Kei Hirose, Hironori Fujisawa, Jun Sese

Abstract: Gaussian graphical modeling has been widely used to explore various network structures, such as gene regulatory networks and social networks. We often use a penalized maximum likelihood approach with the $L_1$ penalty for learning a high-dimensional graphical model. However, the penalized maximum likelihood procedure is sensitive to outliers. To overcome this problem, we introduce a robust estimation procedure based on the $γ$-divergence. The proposed method has a redescending property, which is known as a desirable property in robust statistics. The parameter estimation procedure is constructed using the Majorize-Minimization algorithm, which guarantees that the objective function monotonically decreases at each iteration. Extensive simulation studies showed that our procedure performed much better than the existing methods, in particular, when the contamination ratio was large. Two real data analyses were carried out to illustrate the usefulness of our proposed procedure.

Submitted 12 June, 2017; v1 submitted 23 August, 2015; originally announced August 2015.

Comments: 27 pages

arXiv:1505.05257 [pdf, other]

Sparse and Robust Linear Regression: An Optimization Algorithm and Its Statistical Properties

Authors: Shota Katayama, Hironori Fujisawa

Abstract: This paper studies sparse linear regression analysis with outliers in the responses. A parameter vector for modeling outliers is added to the standard linear regression model and then the sparse estimation problem for both coefficients and outliers is considered. The $\ell_{1}$ penalty is imposed on the coefficients, while various penalties, including redescending type penalties, are imposed on the outliers. To solve the sparse estimation problem, we introduce an optimization algorithm. Under some conditions, we show the algorithmic and statistical convergence property for the coefficients obtained by the algorithm. Moreover, it is shown that the algorithm can recover the true support of the coefficients with probability going to one.

Submitted 20 May, 2015; originally announced May 2015.

Comments: 23 pages, 2 figures

MSC Class: 62J05 (Primary); 62F35 (Secondary)

arXiv:1412.1411 [pdf, other]

On the Weak Convergence and Central Limit Theorem of Blurring and Nonblurring Processes with Application to Robust Location Estimation

Authors: Ting-Li Chen, Hironori Fujisawa, Su-Yun Huang, Chii-Ruey Hwang

Abstract: This article studies the weak convergence and associated Central Limit Theorem for blurring and nonblurring processes. Then, they are applied to the estimation of location parameter. Simulation studies show that the location estimation based on the convergence point of blurring process is more robust and often more efficient than that of nonblurring process.

Submitted 27 January, 2015; v1 submitted 3 December, 2014; originally announced December 2014.

arXiv:1402.6455 [pdf, ps, other]

doi 10.1016/j.csda.2015.03.016

Sparse principal component regression with adaptive loading

Authors: Shuichi Kawano, Hironori Fujisawa, Toyoyuki Takada, Toshihiko Shiroishi

Abstract: Principal component regression (PCR) is a two-stage procedure that selects some principal components and then constructs a regression model regarding them as new explanatory variables. Note that the principal components are obtained from only explanatory variables and not considered with the response variable. To address this problem, we propose the sparse principal component regression (SPCR) that is a one-stage procedure for PCR. SPCR enables us to adaptively obtain sparse principal component loadings that are related to the response variable and select the number of principal components simultaneously. SPCR can be obtained by the convex optimization problem for each of parameters with the coordinate descent algorithm. Monte Carlo simulations and real data analyses are performed to illustrate the effectiveness of SPCR.

Submitted 31 October, 2014; v1 submitted 26 February, 2014; originally announced February 2014.

Comments: 24 pages

MSC Class: 62H25; 62J07

Journal ref: Computational Statistics & Data Analysis 89 (2015) 192-203

arXiv:1311.5301 [pdf, other]

Robust Estimation under Heavy Contamination using Enlarged Models

Authors: Takafumi Kanamori, Hironori Fujisawa

Abstract: In data analysis, contamination caused by outliers is inevitable, and robust statistical methods are strongly demanded. In this paper, our concern is to develop a new approach for robust data analysis based on scoring rules. The scoring rule is a discrepancy measure to assess the quality of probabilistic forecasts. We propose a simple way of estimating not only the parameter in the statistical model but also the contamination ratio of outliers. Estimating the contamination ratio is important, since one can detect outliers out of the training samples based on the estimated contamination ratio. For this purpose, we use scoring rules with extended statistical models, called enlarged models. Also, the regression problems are considered. We study a complex heterogeneous contamination, in which the contamination ratio of outliers in the dependent variable may depend on the independent variable. We propose a simple method to obtain a robust regression estimator under heterogeneous contamination. In addition, we show that our method also provides an estimator of the expected contamination ratio that is available to detect the outliers out of training samples. Numerical experiments demonstrate the effectiveness of our methods compared to the conventional estimators.

Submitted 20 November, 2013; originally announced November 2013.

Comments: 32 pages, 3 figures, 3 tables

arXiv:1305.2473 [pdf, ps, other]

Affine Invariant Divergences associated with Composite Scores and its Applications

Authors: Takafumi Kanamori, Hironori Fujisawa

Abstract: In statistical analysis, measuring a score of predictive performance is an important task. In many scientific fields, appropriate scores were tailored to tackle the problems at hand. A proper score is a popular tool to obtain statistically consistent forecasts. Furthermore, a mathematical characterization of the proper score was studied. As a result, it was revealed that the proper score corresponds to a Bregman divergence, which is an extension of the squared distance over the set of probability distributions. In the present paper, we introduce composite scores as an extension of the typical scores in order to obtain a wider class of probabilistic forecasting. Then, we propose a class of composite scores, named Holder scores, that induce equivariant estimators. The equivariant estimators have a favorable property, implying that the estimator is transformed in a consistent way, when the data is transformed. In particular, we deal with the affine transformation of the data. By using the equivariant estimators under the affine transformation, one can obtain estimators that do not essentially depend on the choice of the system of units in the measurement. Conversely, we prove that the Holder score is characterized by the invariance property under the affine transformations. Furthermore, we investigate statistical properties of the estimators using Holder scores for the statistical problems including estimation of regression functions and robust parameter estimation, and illustrate the usefulness of the newly introduced scores for statistical forecasting.

Submitted 11 May, 2013; originally announced May 2013.

Comments: 24 pages

arXiv:1012.4921 [pdf, ps, other]

Approximate tail probabilities of the maximum of a chi-square field on multi-dimensional lattice points and their applications to detection of loci interactions

Authors: Satoshi Kuriki, Yoshiaki Harushima, Hironori Fujisawa, Nori Kurata

Abstract: Define a chi-square random field on a multi-dimensional lattice points index set with a direct-product covariance structure, and consider the distribution of the maximum of this random field. We provide two approximate formulas for the upper tail probability of the distribution based on nonlinear renewal theory and an integral-geometric approach called the volume-of-tube method. This study is motivated by the detection problem of the interactive loci pairs which play an important role in forming biological species. The joint distribution of scan statistics for detecting the pairs is regarded as the chi-square random field above, and hence the multiplicity-adjusted $p$-value can be calculated by using the proposed approximate formulas. By using these formulas, we examine the data of Mizuta, et al. (2010) who reported a new interactive loci pair of rice inter-subspecies.

Submitted 30 March, 2013; v1 submitted 22 December, 2010; originally announced December 2010.

Comments: 33 pages, 5 figures, 2 tables

Showing 1–23 of 23 results for author: Fujisawa, H