-
Geodesic Causal Inference
Authors:
Daisuke Kurisu,
Yidong Zhou,
Taisuke Otsu,
Hans-Georg Müller
Abstract:
Adjusting for confounding and imbalance when establishing statistical relationships is an increasingly important task, and causal inference methods have emerged as the most popular tool to achieve this. Causal inference has been developed mainly for scalar outcomes and recently for distributional outcomes. We introduce here a general framework for causal inference when outcomes reside in general geodesic metric spaces, where we draw on a novel geodesic calculus that facilitates scalar multiplication for geodesics and the characterization of treatment effects through the concept of the geodesic average treatment effect. Using ideas from Fréchet regression, we develop estimation methods of the geodesic average treatment effect and derive consistency and rates of convergence for the proposed estimators. We also study uncertainty quantification and inference for the treatment effect. Our methodology is illustrated by a simulation study and real data examples for compositional outcomes of U.S. statewise energy source data to study the effect of coal mining, network data of New York taxi trips, where the effect of the COVID-19 pandemic is of interest, and brain functional connectivity network data to study the effect of Alzheimer's disease.
Submitted 27 June, 2024;
originally announced June 2024.
-
Local-Polynomial Estimation for Multivariate Regression Discontinuity Designs
Authors:
Masayuki Sawada,
Takuya Ishihara,
Daisuke Kurisu,
Yasumasa Matsuda
Abstract:
We introduce a multivariate local-linear estimator for multivariate regression discontinuity designs in which treatment is assigned by crossing a boundary in the space of running variables. The dominant approach uses the Euclidean distance from a boundary point as the scalar running variable; hence, multivariate designs are handled as univariate designs. However, the distance running variable is incompatible with the assumption for asymptotic validity. We instead handle multivariate designs as multivariate. In this study, we establish a novel asymptotic normality result for multivariate local-polynomial estimators. Our estimator is asymptotically valid and can capture heterogeneous treatment effects over the boundary. We demonstrate the effectiveness of our estimator through numerical simulations. Our empirical illustration of a Colombian scholarship study reveals a richer heterogeneity (including its absence) of the treatment effect that is hidden in the original estimates.
Submitted 14 February, 2024;
originally announced February 2024.
-
Series ridge regression for spatial data on $\mathbb{R}^d$
Authors:
Daisuke Kurisu,
Yasumasa Matsuda
Abstract:
This paper develops a general asymptotic theory of series estimators for spatial data collected at irregularly spaced locations within a sampling region $R_n \subset \mathbb{R}^d$. We employ a stochastic sampling design that can flexibly generate irregularly spaced sampling sites, encompassing both pure increasing and mixed increasing domain frameworks. Specifically, we focus on a spatial trend regression model and a nonparametric regression model with spatially dependent covariates. For these models, we investigate $L^2$-penalized series estimation of the trend and regression functions. As main results, we establish uniform and $L^2$ convergence rates and multivariate central limit theorems for general series estimators. Additionally, we show that spline and wavelet series estimators achieve optimal uniform and $L^2$ convergence rates and propose methods for constructing confidence intervals for these estimators. Finally, we demonstrate that our dependence structure conditions on the underlying spatial processes cover a broad class of random fields, including Lévy-driven continuous autoregressive and moving average random fields.
Submitted 4 March, 2024; v1 submitted 5 February, 2024;
originally announced February 2024.
-
Hierarchical Regression Discontinuity Design: Pursuing Subgroup Treatment Effects
Authors:
Shonosuke Sugasawa,
Takuya Ishihara,
Daisuke Kurisu
Abstract:
Regression discontinuity design (RDD) is widely adopted for causal inference when the intervention is determined by a continuous variable. While one is interested in treatment effect heterogeneity by subgroups in many applications, RDD typically suffers from small subgroup-wise sample sizes, which makes the estimation results highly unstable. To solve this issue, we introduce hierarchical RDD (HRDD), a hierarchical Bayes approach for pursuing treatment effect heterogeneity in RDD. A key feature of HRDD is to employ a pseudo-model based on a loss function to estimate subgroup-level parameters of treatment effects under RDD, and to assign a hierarchical prior distribution to "borrow strength" from other subgroups. The posterior computation can be carried out easily by simple Gibbs sampling, and the optimal bandwidth can be automatically selected by the Hyvärinen scores for unnormalized models. We demonstrate the proposed HRDD through simulation and real data analysis, and show that HRDD provides much more stable point and interval estimation than separately applying the standard RDD method to each subgroup.
Submitted 19 June, 2024; v1 submitted 4 September, 2023;
originally announced September 2023.
-
Local polynomial trend regression for spatial data on $\mathbb{R}^d$
Authors:
Daisuke Kurisu,
Yasumasa Matsuda
Abstract:
This paper develops a general asymptotic theory of local polynomial (LP) regression for spatial data observed at irregularly spaced locations in a sampling region $R_n \subset \mathbb{R}^d$. We adopt a stochastic sampling design that can generate irregularly spaced sampling sites in a flexible manner, including both pure increasing and mixed increasing domain frameworks. We first introduce a nonparametric regression model for spatial data defined on $\mathbb{R}^d$ and then establish the asymptotic normality of LP estimators with general order $p \geq 1$. We also propose methods for constructing confidence intervals and establish uniform convergence rates of LP estimators. Our dependence structure conditions on the underlying processes cover a wide class of random fields such as Lévy-driven continuous autoregressive moving average random fields. As an application of our main results, we discuss a two-sample testing problem for mean functions and their partial derivatives.
Submitted 23 December, 2023; v1 submitted 24 November, 2022;
originally announced November 2022.
-
Shrinkage Methods for Treatment Choice
Authors:
Takuya Ishihara,
Daisuke Kurisu
Abstract:
This study examines the problem of determining whether to treat individuals based on observed covariates. The most common decision rule is the conditional empirical success (CES) rule proposed by Manski (2004), which assigns individuals to treatments that yield the best experimental outcomes conditional on the observed covariates. Conversely, using shrinkage estimators, which shrink unbiased but noisy preliminary estimates toward the average of these estimates, is a common approach in statistical estimation problems because it is well-known that shrinkage estimators have smaller mean squared errors than unshrunk estimators. Inspired by this idea, we propose a computationally tractable shrinkage rule that selects the shrinkage factor by minimizing the upper bound of the maximum regret. Then, we compare the maximum regret of the proposed shrinkage rule with that of CES and pooling rules when the parameter space is correctly specified or misspecified. Our theoretical results demonstrate that the shrinkage rule performs well in many cases and these findings are further supported by numerical experiments. Specifically, we show that the maximum regret of the shrinkage rule can be strictly smaller than that of the CES and pooling rules in certain cases when the parameter space is correctly specified. In addition, we find that the shrinkage rule is robust against misspecifications of the parameter space. Finally, we apply our method to experimental data from the National Job Training Partnership Act Study.
Submitted 20 June, 2024; v1 submitted 31 October, 2022;
originally announced October 2022.
-
Adaptive deep learning for nonlinear time series models
Authors:
Daisuke Kurisu,
Riku Fukami,
Yuta Koike
Abstract:
In this paper, we develop a general theory for adaptive nonparametric estimation of the mean function of a non-stationary and nonlinear time series model using deep neural networks (DNNs). We first consider two types of DNN estimators, non-penalized and sparse-penalized DNN estimators, and establish their generalization error bounds for general non-stationary time series. We then derive minimax lower bounds for estimating mean functions belonging to a wide class of nonlinear autoregressive (AR) models that include nonlinear generalized additive AR, single index, and threshold AR models. Building upon the results, we show that the sparse-penalized DNN estimator is adaptive and attains the minimax optimal rates up to a poly-logarithmic factor for many nonlinear AR models. Through numerical simulations, we demonstrate the usefulness of the DNN methods for estimating nonlinear AR models with intrinsic low-dimensional structures and discontinuous or rough mean functions, which is consistent with our theory.
Submitted 2 May, 2024; v1 submitted 6 July, 2022;
originally announced July 2022.
-
Adaptively Robust Small Area Estimation: Balancing Robustness and Efficiency of Empirical Bayes Confidence Intervals
Authors:
Daisuke Kurisu,
Takuya Ishihara,
Shonosuke Sugasawa
Abstract:
Empirical Bayes small area estimation based on the well-known Fay-Herriot model may produce unreliable estimates when outlying areas exist. Existing methods that are robust against outliers or model misspecification are generally inefficient when the assumed distribution is plausible. This paper proposes a simple modification of the standard empirical Bayes methods that adaptively balances robustness and efficiency. The proposed method employs gamma-divergence instead of the marginal log-likelihood and optimizes a tuning parameter controlling robustness by pursuing the efficiency of empirical Bayes confidence intervals for areal parameters. We provide an asymptotic theory of the proposed method under both the correct specification of the assumed distribution and the existence of outlying areas. We investigate the numerical performance of the proposed method through simulations and an application to small area estimation of average crime numbers.
Submitted 27 June, 2022; v1 submitted 25 August, 2021;
originally announced August 2021.
-
On the estimation of locally stationary functional time series
Authors:
Daisuke Kurisu
Abstract:
This study develops an asymptotic theory for estimating the time-varying characteristics of locally stationary functional time series (LSFTS). We investigate a kernel-based method to estimate the time-varying covariance operator and the time-varying mean function of an LSFTS. In particular, we derive the convergence rate of the kernel estimator of the covariance operator and the associated eigenvalues and eigenfunctions, and establish a central limit theorem for the kernel-based locally weighted sample mean. As applications of our results, we discuss methods for testing the equality of time-varying mean functions in two functional samples.
Submitted 22 May, 2023; v1 submitted 25 May, 2021;
originally announced May 2021.
-
Nonparametric regression for locally stationary functional time series
Authors:
Daisuke Kurisu
Abstract:
In this study, we develop an asymptotic theory of nonparametric regression for a locally stationary functional time series. First, we introduce the notion of a locally stationary functional time series (LSFTS) that takes values in a semi-metric space. Then, we propose a nonparametric model for LSFTS with a regression function that changes smoothly over time. We establish the uniform convergence rates of a class of kernel estimators, the Nadaraya-Watson (NW) estimator of the regression function, and a central limit theorem of the NW estimator.
Submitted 1 July, 2022; v1 submitted 17 May, 2021;
originally announced May 2021.
-
Gaussian approximation and spatially dependent wild bootstrap for high-dimensional spatial data
Authors:
Daisuke Kurisu,
Kengo Kato,
Xiaofeng Shao
Abstract:
In this paper, we establish a high-dimensional CLT for the sample mean of $p$-dimensional spatial data observed over irregularly spaced sampling sites in $\mathbb{R}^d$, allowing the dimension $p$ to be much larger than the sample size $n$. We adopt a stochastic sampling scheme that can generate irregularly spaced sampling sites in a flexible manner and include both pure increasing domain and mixed increasing domain frameworks. To facilitate statistical inference, we develop the spatially dependent wild bootstrap (SDWB) and justify its asymptotic validity in high dimensions by deriving error bounds that hold almost surely conditionally on the stochastic sampling sites. Our dependence conditions on the underlying random field cover a wide class of random fields such as Gaussian random fields and continuous autoregressive moving average random fields. Through numerical simulations and a real data analysis, we demonstrate the usefulness of our bootstrap-based inference in several applications, including joint confidence interval construction for high-dimensional spatial data and change-point detection for spatio-temporal data.
Submitted 26 March, 2021; v1 submitted 19 March, 2021;
originally announced March 2021.
-
Nonparametric regression for locally stationary random fields under stochastic sampling design
Authors:
Daisuke Kurisu
Abstract:
In this study, we develop an asymptotic theory of nonparametric regression for locally stationary random fields (LSRFs) $\{{\bf X}_{{\bf s}, A_{n}}: {\bf s} \in R_{n} \}$ in $\mathbb{R}^{p}$ observed at irregularly spaced locations in $R_{n} =[0,A_{n}]^{d} \subset \mathbb{R}^{d}$. We first derive the uniform convergence rate of general kernel estimators, followed by the asymptotic normality of an estimator for the mean function of the model. Moreover, we consider additive models to avoid the curse of dimensionality arising from the dependence of the convergence rate of estimators on the number of covariates. Subsequently, we derive the uniform convergence rate and joint asymptotic normality of the estimators for additive functions. We also introduce approximately $m_{n}$-dependent RFs to provide examples of LSRFs. We find that these RFs include a wide class of Lévy-driven moving average RFs.
Submitted 6 July, 2022; v1 submitted 13 May, 2020;
originally announced May 2020.
-
Particulate Air Pollution, Birth Outcomes, and Infant Mortality: Evidence from Japan's Automobile Emission Control Law of 1992
Authors:
Tatsuki Inoue,
Nana Nunokawa,
Daisuke Kurisu,
Kota Ogasawara
Abstract:
This study investigates the impacts of the Automobile NOx Law of 1992 on ambient air pollutants and fetal and infant health outcomes in Japan. Using panel data taken from more than 1,500 monitoring stations between 1987 and 1997, we find that NOx and SO2 levels decreased by 87% and 52%, respectively, in regulated areas following the 1992 regulation. In addition, using a municipal-level Vital Statistics panel dataset and adopting the regression differences-in-differences method, we find that the enactment of the regulation explained most of the improvements in the fetal death rate between 1991 and 1993. This study is the first to provide evidence on the positive impacts of this large-scale automobile regulation policy on fetal health.
Submitted 9 December, 2019; v1 submitted 10 May, 2019;
originally announced May 2019.
-
On nonparametric inference for spatial regression models under domain expanding and infill asymptotics
Authors:
Daisuke Kurisu
Abstract:
In this paper, we develop nonparametric inference on spatial regression models as an extension of Lu and Tjøstheim (2014), which develops nonparametric inference on density functions of stationary spatial processes under domain expanding and infill (DEI) asymptotics. In particular, we derive multivariate central limit theorems for the mean and variance functions of nonparametric spatial regression models. Building upon those results, we propose a method to construct confidence bands for the mean and variance functions.
Submitted 11 July, 2019; v1 submitted 25 April, 2018;
originally announced April 2018.
-
Nonparametric inference on Lévy measures of compound Poisson-driven Ornstein-Uhlenbeck processes under macroscopic discrete observations
Authors:
Daisuke Kurisu
Abstract:
This study examines nonparametric inference for a stationary Lévy-driven Ornstein-Uhlenbeck (OU) process $X = (X_{t})_{t \geq 0}$ with a compound Poisson subordinator. We propose a new spectral estimator for the Lévy measure of the Lévy-driven OU process $X$ under macroscopic observations. For this estimator, we also derive multivariate central limit theorems over a finite number of design points, and high-dimensional central limit theorems in the case where the number of design points increases with the sample size. Building on these asymptotic results, we develop methods to construct confidence bands for the Lévy measure and propose a practical method for bandwidth selection.
Submitted 11 July, 2019; v1 submitted 23 March, 2018;
originally announced March 2018.
-
Bootstrap confidence bands for spectral estimation of Lévy densities under high-frequency observations
Authors:
Kengo Kato,
Daisuke Kurisu
Abstract:
This paper develops bootstrap methods to construct uniform confidence bands for nonparametric spectral estimation of Lévy densities under high-frequency observations. We assume that we observe $n$ discrete observations at frequency $1/\Delta > 0$, and work with the high-frequency setup where $\Delta = \Delta_{n} \to 0$ and $n\Delta \to \infty$ as $n \to \infty$. We employ a spectral (or Fourier-based) estimator of the Lévy density, and develop novel implementations of Gaussian multiplier (or wild) and empirical (or Efron's) bootstraps to construct confidence bands for the spectral estimator on a compact set that does not intersect the origin. We provide conditions under which the proposed confidence bands are asymptotically valid. Our confidence bands are shown to be asymptotically valid for a wide class of Lévy processes. We also develop a practical method for bandwidth selection, and conduct simulation studies to investigate the finite sample performance of the proposed confidence bands.
Submitted 29 May, 2017; v1 submitted 1 May, 2017;
originally announced May 2017.
-
Discretization of Self-Exciting Peaks Over Threshold Models
Authors:
Daisuke Kurisu
Abstract:
In this paper, we develop a framework for discrete observation of (marked) point processes under high-frequency observation. Based on this framework, we first clarify the relation between the random coefficient integer-valued autoregressive process of infinite order (RCINAR($\infty$)) and the i.i.d.-marked self-exciting process, known as the marked Hawkes process. For this purpose, we show that the point process constructed from the sum of an RCINAR($\infty$) process converges weakly to a marked Hawkes process. This limit theorem establishes that RCINAR($\infty$) processes can be seen as discretely observed marked Hawkes processes when the observation frequency increases; it thus builds a bridge between discrete-time series analysis and the analysis of continuous-time stochastic processes, and gives a new perspective on the point process approach in extreme value theory. Second, we give a necessary and sufficient condition for the stationarity of the RCINAR($\infty$) process and give its random coefficient autoregressive (RCAR) representation. Finally, as an application of our results, we establish a rigorous theoretical justification of the self-exciting peaks over threshold (SEPOT) model, which is well known as a (marked) Hawkes process model for the empirical analysis of extremal events in financial econometrics but whose theoretical validity has rarely been discussed. Simulation results on the asymptotic properties of RCINAR($\infty$) show some interesting implications for statistical applications.
Submitted 9 April, 2017; v1 submitted 19 December, 2016;
originally announced December 2016.
-
Power variations and testing for co-jumps: the small noise approach
Authors:
Daisuke Kurisu
Abstract:
In this paper, we study the effects of noise on the bipower variation (BPV), realized volatility (RV), and testing for co-jumps in high-frequency data under the small noise framework. We first establish asymptotic properties of the BPV in this framework. In the presence of small noise, the RV is asymptotically biased and an additional asymptotic conditional variance term appears in its limit distribution. We also give feasible estimation methods for the asymptotic conditional variances of the RV. Second, we derive the asymptotic distribution of the test statistic proposed in Jacod and Todorov (2009) in the presence of small noise for testing the presence of co-jumps in a two-dimensional Itô semimartingale. In contrast to the setting in Jacod and Todorov (2009), we show that additional conditional asymptotic variance terms appear, and give consistent estimation procedures for the asymptotic conditional variances in order to make the test feasible. Simulation experiments show that our asymptotic results give reasonable approximations in finite samples.
Submitted 20 June, 2016; v1 submitted 9 May, 2016;
originally announced May 2016.