-
Integrated empirical measures and generalizations of classical goodness-of-fit statistics
Authors:
Hsien-Kuei Hwang,
Satoshi Kuriki
Abstract:
Based on $m$-fold integrated empirical measures, we study three new classes of goodness-of-fit tests, generalizing the Anderson-Darling, Cramér-von Mises, and Watson statistics, respectively, and examine the corresponding limiting stochastic processes. The limiting null distributions of the statistics all lead to explicitly solvable cases with closed-form expressions for the corresponding Karhunen-Loève expansions and covariance kernels. In particular, the eigenvalues are shown to be $\frac1{k(k+1)\cdots (k+2m-1)}$ for the generalized Anderson-Darling, $\frac1{(\pi k)^{2m}}$ for the generalized Cramér-von Mises, and $\frac1{2\pi\lceil k/2\rceil^{2m}}$ for the generalized Watson statistics, respectively. The infinite products of the resulting moment generating functions are further simplified to finite ones so as to facilitate efficient numerical calculations. These statistics are capable of detecting different features of the distributions and thus provide a useful toolbox for goodness-of-fit testing.
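As a numerical illustration of the eigenvalue formulas quoted in this abstract, the Python sketch below evaluates the generalized Anderson-Darling and Cramér-von Mises eigenvalues and sums them. It assumes the usual Karhunen-Loève reading, in which the limiting statistic is a weighted sum of independent chi-square(1) variables, so that the sum of the eigenvalues is its mean. The function names are illustrative and do not come from the paper, and the Watson case is left out.

```python
import math


def ad_eigenvalue(k: int, m: int) -> float:
    """Generalized Anderson-Darling eigenvalue 1 / (k (k+1) ... (k+2m-1))."""
    return 1.0 / math.prod(range(k, k + 2 * m))


def cvm_eigenvalue(k: int, m: int) -> float:
    """Generalized Cramér-von Mises eigenvalue 1 / (pi k)^(2m)."""
    return 1.0 / (math.pi * k) ** (2 * m)


def partial_trace(eigenvalue, m: int, terms: int = 100_000) -> float:
    """Sum of the first `terms` eigenvalues (a truncated trace)."""
    return sum(eigenvalue(k, m) for k in range(1, terms + 1))


if __name__ == "__main__":
    for m in (1, 2):
        print(m, partial_trace(ad_eigenvalue, m), partial_trace(cvm_eigenvalue, m))
```

For $m=1$ the truncated sums approach $1$ and $1/6$, the classical means of the Anderson-Darling and Cramér-von Mises limiting distributions, which gives a simple sanity check on the formulas.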
Submitted 9 April, 2024;
originally announced April 2024.
-
EM Estimation of the B-spline Copula with Penalized Log-Likelihood Function
Authors:
Xiaoling Dou,
Satoshi Kuriki,
Gwo Dong Lin,
Donald Richards
Abstract:
The B-spline copula function is defined by a linear combination of elements of the normalized B-spline basis. We develop a modified EM algorithm to maximize the penalized log-likelihood function, in which the smoothly clipped absolute deviation (SCAD) penalty function is used as the penalization term. We conduct simulation studies to demonstrate the stability of the proposed numerical procedure, show that penalization yields estimates with smaller mean-square errors when the true parameter matrix is sparse, and provide methods for determining tuning parameters and for model selection. As an example, we analyze a data set consisting of birth and death rates from 237 countries, available at the website "Our World in Data," and we estimate the marginal density and distribution functions of those rates together with all parameters of our B-spline copula model.
Submitted 12 February, 2024;
originally announced February 2024.
-
Expected Euler characteristic method for the largest eigenvalue: (Skew-)orthogonal polynomial approach
Authors:
Satoshi Kuriki
Abstract:
The expected Euler characteristic (EEC) method is an integral-geometric method used to approximate the tail probability of the maximum of a random field on a manifold. Noting that the largest eigenvalue of a real-symmetric or Hermitian matrix is the maximum of the quadratic form of a unit vector, we provide EEC approximation formulas for the tail probability of the largest eigenvalue of orthogonally invariant random matrices of a large class. For this purpose, we propose a version of a skew-orthogonal polynomial by adding a side condition such that it is uniquely defined, and describe the EEC formulas in terms of the (skew-)orthogonal polynomials. In addition, for the classical random matrices (Gaussian, Wishart, and multivariate beta matrices), we analyze the limiting behavior of the EEC approximation as the matrix size goes to infinity under the so-called edge-asymptotic normalization. It is shown that the limit of the EEC formula approximates well the Tracy-Widom distributions in the upper tail area, as does the EEC formula when the matrix size is finite.
Submitted 16 August, 2023;
originally announced August 2023.
-
Random eigenvalues of graphenes and the triangulation of plane
Authors:
Artur Bille,
Victor Buchstaber,
Simon Coste,
Satoshi Kuriki,
Evgeny Spodarev
Abstract:
We analyse the numbers of closed paths of length $k\in\mathbb{N}$ on two important regular lattices: the hexagonal lattice (also called $\textit{graphene}$ in chemistry) and its dual triangular lattice. These numbers form a moment sequence of specific random variables connected to the distance of a position of a planar random flight (in three steps) from the origin. Here, we refer to such a random variable as a $\textit{random eigenvalue}$ of the underlying lattice. Explicit formulas for the probability density and characteristic functions of these random eigenvalues are given for both the hexagonal and the triangular lattice. Furthermore, it is proven that both probability distributions can be approximated by a functional of the random variable uniformly distributed on increasing intervals $[0,b]$ as $b\to\infty$. This yields a straightforward method to simulate these random eigenvalues without generating graphene and triangular lattice graphs. To demonstrate this approximation, we first prove a key integral identity for a specific series containing the third powers of the modified Bessel functions $I_n$ of $n$th order, $n\in\mathbb{Z}$. Such series play a crucial role in various contexts, in particular, in analysis, combinatorics, and theoretical physics.
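The link between closed paths and random-flight moments described in this abstract can be checked on small cases. The Python sketch below counts closed walks on the hexagonal lattice by dynamic programming and compares the counts with a known closed form for the even moments of the distance of a three-step planar random flight with unit steps. The coordinate encoding of the lattice, the function names, and the closed form are assumptions brought in for illustration; none of them is stated in the abstract.

```python
from collections import defaultdict
from math import comb


def closed_walks_hexagonal(length: int) -> int:
    """Count closed walks of `length` steps from a fixed vertex of the hexagonal lattice."""
    # A vertex is (x, y, s): s = 0 for sublattice A, s = 1 for sublattice B.
    counts = {(0, 0, 0): 1}
    for _ in range(length):
        nxt = defaultdict(int)
        for (x, y, s), c in counts.items():
            if s == 0:
                neighbours = [(x, y, 1), (x - 1, y, 1), (x, y - 1, 1)]
            else:
                neighbours = [(x, y, 0), (x + 1, y, 0), (x, y + 1, 0)]
            for v in neighbours:
                nxt[v] += c
        counts = nxt
    return counts.get((0, 0, 0), 0)


def three_step_moment(k: int) -> int:
    """Assumed closed form for the 2k-th moment of the three-step flight distance."""
    return sum(comb(k, j) ** 2 * comb(2 * j, j) for j in range(k + 1))


if __name__ == "__main__":
    for k in range(5):
        print(k, closed_walks_hexagonal(2 * k), three_step_moment(k))
```

Walks of odd length cannot close because the lattice is bipartite, so only even lengths $2k$ are compared.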
Submitted 2 June, 2023;
originally announced June 2023.
-
Robust Topological Inference in the Presence of Outliers
Authors:
Siddharth Vishwanath,
Bharath K. Sriperumbudur,
Kenji Fukumizu,
Satoshi Kuriki
Abstract:
The distance function to a compact set plays a crucial role in the paradigm of topological data analysis. In particular, the sublevel sets of the distance function are used in the computation of persistent homology -- a backbone of the topological data analysis pipeline. Despite its stability to perturbations in the Hausdorff distance, persistent homology is highly sensitive to outliers. In this work, we develop a framework of statistical inference for persistent homology in the presence of outliers. Drawing inspiration from recent developments in robust statistics, we propose a $\textit{median-of-means}$ variant of the distance function ($\textsf{MoM Dist}$), and establish its statistical properties. In particular, we show that, even in the presence of outliers, the sublevel filtrations and weighted filtrations induced by $\textsf{MoM Dist}$ are both consistent estimators of the true underlying population counterpart, and their rates of convergence in the bottleneck metric are controlled by the fraction of outliers in the data. Finally, we demonstrate the advantages of the proposed methodology through simulations and applications.
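A median-of-means style distance function can be sketched in a few lines of Python. The version below splits the sample into blocks, computes the distance from a query point to each block, and returns the median of those block distances. The deterministic split by index, the function names, and the toy data are assumptions made for illustration; the paper's exact construction may differ.

```python
import math
from statistics import median


def dist_to_set(x, points):
    """Ordinary distance from x to a finite point set."""
    return min(math.dist(x, p) for p in points)


def mom_dist(x, points, n_blocks):
    """Median over blocks of the distance from x to each block."""
    if not 1 <= n_blocks <= len(points):
        raise ValueError("n_blocks must be between 1 and len(points)")
    # Assumed partition: every n_blocks-th point goes to the same block.
    blocks = [points[i::n_blocks] for i in range(n_blocks)]
    return median(dist_to_set(x, b) for b in blocks)


if __name__ == "__main__":
    inliers = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5), (0.2, 0.8)]
    sample = inliers + [(10, 10)]  # one outlier
    query = (10, 10)
    print(dist_to_set(query, sample), mom_dist(query, sample, 3))
```

At the outlier itself the ordinary distance is zero, while the block-wise median stays large as long as outliers contaminate fewer than half of the blocks, which is the robustness property the abstract refers to.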
Submitted 3 June, 2022;
originally announced June 2022.
-
The volume-of-tube method for Gaussian random fields with inhomogeneous variance
Authors:
Satoshi Kuriki,
Akimichi Takemura,
Jonathan E. Taylor
Abstract:
The tube method or the volume-of-tube method approximates the tail probability of the maximum of a smooth Gaussian random field with zero mean and unit variance. This method evaluates the volume of a spherical tube about the index set, and then transforms it to the tail probability. In this study, we generalize the tube method to a case in which the variance is not constant. We provide the volume formula for a spherical tube with a non-constant radius in terms of curvature tensors, and the tail probability formula of the maximum of a Gaussian random field with inhomogeneous variance, as well as its Laplace approximation. In particular, the critical radius of the tube is generalized for evaluation of the asymptotic approximation error. As an example, we discuss the approximation of the largest eigenvalue distribution of the Wishart matrix with a non-identity matrix parameter. The Bonferroni method is the tube method when the index set is a finite set. We provide the formula for the asymptotic approximation error for the Bonferroni method when the variance is not constant.
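For a finite index set, the Bonferroni bound mentioned in this abstract reduces to a sum of marginal Gaussian tail probabilities. The Python sketch below computes that bound for a zero-mean Gaussian vector whose components have possibly different standard deviations. It is a plain union bound written under these assumptions, with illustrative function names, and it does not reproduce the paper's approximation-error formula.

```python
import math


def normal_upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def bonferroni_tail_bound(c: float, sigmas) -> float:
    """Union bound on P(max_i X_i > c) for zero-mean normals with std devs `sigmas`."""
    return min(1.0, sum(normal_upper_tail(c / s) for s in sigmas))


if __name__ == "__main__":
    print(bonferroni_tail_bound(3.0, [1.0, 1.0, 1.0]))   # constant variance
    print(bonferroni_tail_bound(3.0, [1.0, 0.8, 0.5]))   # inhomogeneous variance
```

With unequal variances the sum is dominated by the components with the largest standard deviation, which is the basic reason the inhomogeneous case needs separate treatment.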
Submitted 9 September, 2021; v1 submitted 4 August, 2021;
originally announced August 2021.
-
Minkowski functionals and the nonlinear perturbation theory in the large-scale structure: second-order effects
Authors:
Takahiko Matsubara,
Chiaki Hikage,
Satoshi Kuriki
Abstract:
The second-order formula of Minkowski functionals in weakly non-Gaussian fields is compared with numerical $N$-body simulations. Recently, the weakly non-Gaussian formula of Minkowski functionals was extended to include the second-order effects of non-Gaussianity in general dimensions. We apply this formula to the three-dimensional density field in the large-scale structure of the Universe. The parameters of the second-order formula include several kinds of skewness and kurtosis parameters. We apply the tree-level nonlinear perturbation theory to estimate these parameters. First, we compare the theoretical parameter values with those obtained from numerical simulations; next, we test the performance of the analytic formula combined with the perturbation theory. The second-order formula outperforms the first-order formula in general. The performance of the perturbation theory depends on the smoothing radius applied in defining the Minkowski functionals. The quantitative comparisons are presented in detail.
Submitted 30 November, 2020;
originally announced December 2020.
-
Weakly non-Gaussian formula for the Minkowski functionals in general dimensions
Authors:
Takahiko Matsubara,
Satoshi Kuriki
Abstract:
The Minkowski functionals are useful statistics to quantify the morphology of various random fields. They have been applied to numerous analyses of geometrical patterns, including various types of cosmic fields, morphological image processing, etc. In some cases, including cosmological applications, small deviations from the Gaussianity of the distribution are of fundamental importance. Analytic formulas for the expectation values of Minkowski functionals with small non-Gaussianity have been derived in limited cases to date. We generalize these previous works to derive an analytic expression for expectation values of Minkowski functionals up to second-order corrections of non-Gaussianity in a space of general dimensions. The derived formula has sufficient generality to be applied to any random fields with weak non-Gaussianity in a statistically homogeneous and isotropic space of any dimensions.
Submitted 10 November, 2020;
originally announced November 2020.
-
Asymptotic expansion of the expected Minkowski functional for isotropic central limit random fields
Authors:
Satoshi Kuriki,
Takahiko Matsubara
Abstract:
The Minkowski functionals, including the Euler characteristic statistics, are standard tools for morphological analysis in cosmology. Motivated by cosmic research, we examine the Minkowski functional of the excursion set for an isotropic central limit random field, the $k$-point correlation functions ($k$th order cumulants) of which have the same structure as that assumed in cosmic research. Using 3- and 4-point correlation functions, we derive the asymptotic expansions of the Euler characteristic density, which is the building block of the Minkowski functional. The resulting formula reveals the types of non-Gaussianity that cannot be captured by the Minkowski functionals. As an example, we consider an isotropic chi-square random field and confirm that the asymptotic expansion accurately approximates the true Euler characteristic density.
Submitted 18 January, 2023; v1 submitted 10 November, 2020;
originally announced November 2020.
-
Parameterization of quantum walks on cycles
Authors:
Shuji Kuriki,
Md Sams Afif Nirjhor,
Hiromichi Ohno
Abstract:
This study investigates the unitary equivalence classes of quantum walks on cycles. We show that unitary equivalence classes of quantum walks on a cycle with $N$ vertices are parameterized by $2N$ real parameters. Moreover, the ranges of two of the parameters are restricted, and the ranges depend on the parity of $N$.
Submitted 22 June, 2020;
originally announced June 2020.
-
Robust Persistence Diagrams using Reproducing Kernels
Authors:
Siddharth Vishwanath,
Kenji Fukumizu,
Satoshi Kuriki,
Bharath Sriperumbudur
Abstract:
Persistent homology has become an important tool for extracting geometric and topological features from data, whose multi-scale features are summarized in a persistence diagram. From a statistical perspective, however, persistence diagrams are very sensitive to perturbations in the input space. In this work, we develop a framework for constructing robust persistence diagrams from superlevel filtrations of robust density estimators constructed using reproducing kernels. Using an analogue of the influence function on the space of persistence diagrams, we establish the proposed framework to be less sensitive to outliers. The robust persistence diagrams are shown to be consistent estimators in bottleneck distance, with the convergence rate controlled by the smoothness of the kernel. This, in turn, allows us to construct uniform confidence bands in the space of persistence diagrams. Finally, we demonstrate the superiority of the proposed approach on benchmark datasets.
Submitted 3 June, 2022; v1 submitted 17 June, 2020;
originally announced June 2020.
-
Existence and Uniqueness of the Kronecker Covariance MLE
Authors:
Mathias Drton,
Satoshi Kuriki,
Peter Hoff
Abstract:
In matrix-valued datasets, the sampled matrices often exhibit correlations among both their rows and their columns. A useful and parsimonious model of such dependence is the matrix normal model, in which the covariances among the elements of a random matrix are parameterized in terms of the Kronecker product of two covariance matrices, one representing row covariances and one representing column covariances. An appealing feature of such a matrix normal model is that the Kronecker covariance structure allows for standard likelihood inference even when only a very small number of data matrices is available. For instance, in some cases a likelihood ratio test of dependence may be performed with a sample size of one. However, more generally the sample size required to ensure boundedness of the matrix normal likelihood or the existence of a unique maximizer depends in a complicated way on the matrix dimensions. This motivates the study of how large a sample size is needed to ensure that maximum likelihood estimators exist, and exist uniquely with probability one. Our main result gives precise sample size thresholds in the paradigm where the number of rows and the number of columns of the data matrices differ by at most a factor of two. Our proof uses invariance properties that allow us to consider data matrices in canonical form, as obtained from the Kronecker canonical form for matrix pencils.
Submitted 14 January, 2021; v1 submitted 12 March, 2020;
originally announced March 2020.
-
On the Limits of Topological Data Analysis for Statistical Inference
Authors:
Siddharth Vishwanath,
Kenji Fukumizu,
Satoshi Kuriki,
Bharath Sriperumbudur
Abstract:
Topological data analysis has emerged as a powerful tool for extracting the metric, geometric and topological features underlying the data as a multi-resolution summary statistic, and has found applications in several areas where data arises from complex sources. In this paper, we examine the use of topological summary statistics through the lens of statistical inference. We investigate necessary and sufficient conditions under which valid statistical inference is possible using topological summary statistics. Additionally, we provide examples of models that demonstrate invariance with respect to topological summaries.
Submitted 15 February, 2024; v1 submitted 1 January, 2020;
originally announced January 2020.
-
Computation of the Expected Euler Characteristic for the Largest Eigenvalue of a Real Non-central Wishart Matrix
Authors:
Nobuki Takayama,
Lin Jiu,
Satoshi Kuriki,
Yi Zhang
Abstract:
We give an approximate formula for the distribution of the largest eigenvalue of real Wishart matrices by the expected Euler characteristic method for the general dimension. The formula is expressed in terms of a definite integral with parameters. We derive a differential equation satisfied by the integral for the $2 \times 2$ matrix case and perform a numerical analysis of it.
Submitted 21 May, 2020; v1 submitted 24 March, 2019;
originally announced March 2019.
-
Dependence Properties of B-Spline Copulas
Authors:
Xiaoling Dou,
Satoshi Kuriki,
Gwo Dong Lin,
Donald Richards
Abstract:
We construct by using B-spline functions a class of copulas that includes the Bernstein copulas arising in Baker's distributions. The range of correlation of the B-spline copulas is examined, and the Frechet--Hoeffding upper bound is proved to be attained when the number of B-spline functions goes to infinity. As the B-spline functions are well-known to be an order-complete weak Tchebycheff system from which the property of total positivity of any order follows for the maximum correlation case, the results given here extend classical results for the Bernstein copulas. In addition, we derive in terms of the Stirling numbers of the second kind an explicit formula for the moments of the related B-spline functions for nonnegative real numbers.
Submitted 13 February, 2019;
originally announced February 2019.
-
Optimal experimental design that minimizes the width of simultaneous confidence bands
Authors:
Satoshi Kuriki,
Henry P. Wynn
Abstract:
We propose an optimal experimental design for a curvilinear regression model that minimizes the band-width of simultaneous confidence bands. Simultaneous confidence bands for curvilinear regression are constructed by evaluating the volume of a tube about a curve that is defined as a trajectory of a regression basis vector (Naiman, 1986). The proposed criterion is constructed based on the volume of a tube, and the corresponding optimal design that minimizes the volume of tube is referred to as the tube-volume optimal (TV-optimal) design. For Fourier and weighted polynomial regressions, the problem is formalized as one of minimization over the cone of Hankel positive definite matrices, and the criterion to minimize is expressed as an elliptic integral. We show that the Möbius group keeps our problem invariant, and hence, minimization can be conducted over cross-sections of orbits. We demonstrate that for the weighted polynomial regression and the Fourier regression with three bases, the tube-volume optimal design forms an orbit of the Möbius group containing D-optimal designs as representative elements.
Submitted 30 March, 2019; v1 submitted 13 April, 2017;
originally announced April 2017.
-
Use of spurious correlation for multiplicity adjustment
Authors:
Yoshiyuki Ninomiya,
Satoshi Kuriki,
Toshihiko Shiroishi,
Toyoyuki Takada
Abstract:
We consider one of the most basic multiple testing problems that compares expectations of multivariate data among several groups. As a test statistic, a conventional (approximate) $t$-statistic is considered, and we determine its rejection region using a common rejection limit. When there are unknown correlations among test statistics, the multiplicity adjusted $p$-values are dependent on the unknown correlations. They are usually replaced with their estimates that are always consistent under any hypothesis. In this paper, we propose the use of estimates, which are not necessarily consistent and are referred to as spurious correlations, in order to improve statistical power. Through simulation studies, we verify that the proposed method asymptotically controls the family-wise error rate and clearly provides higher statistical power than existing methods. In addition, the proposed and existing methods are applied to a real multiple testing problem that compares quantitative traits among groups of mice and the results are compared.
Submitted 18 December, 2016;
originally announced December 2016.
-
The Bivariate Lack-of-Memory Distributions
Authors:
Gwo Dong Lin,
Xiaoling Dou,
Satoshi Kuriki
Abstract:
We treat all the bivariate lack-of-memory (BLM) distributions in a unified approach and develop some new general properties of the BLM distributions, including joint moment generating function, product moments and dependence structure. Necessary and sufficient conditions for the survival functions of BLM distributions to be totally positive of order two are given. Some previous results about specific BLM distributions are improved. In particular, we show that both the Marshall--Olkin survival copula and survival function are totally positive of all orders, regardless of parameters. Besides, we point out that Slepian's inequality also holds true for BLM distributions.
Submitted 16 December, 2017; v1 submitted 16 June, 2016;
originally announced June 2016.
-
Chi-Square Mixture Representations for the Distribution of the Scalar Schur Complement in a Noncentral Wishart Matrix
Authors:
Constantin Siriteanu,
Satoshi Kuriki,
Donald Richards,
Akimichi Takemura
Abstract:
We show that the distribution of the scalar Schur complement in a noncentral Wishart matrix is a mixture of central chi-square distributions with different degrees of freedom. For the case of a rank-1 noncentrality matrix, the weights of the mixture representation arise from a noncentral beta mixture of Poisson distributions.
Submitted 26 December, 2015;
originally announced December 2015.
-
Recursive computation for evaluating the exact $p$-values of temporal and spatial scan statistics
Authors:
Satoshi Kuriki,
Kunihiko Takahashi,
Hisayuki Hara
Abstract:
Let $V$ be a finite set of indices, and let $B_i$, $i=1,\ldots,m$, be subsets of $V$ such that $V=\bigcup_{i=1}^{m}B_i$. Let $X_i$, $i\in V$, be independent random variables, and let $X_{B_i}=(X_j)_{j\in B_i}$. In this paper, we propose a recursive computation method to calculate the conditional expectation $E\bigl[\prod_{i=1}^m\chi_i(X_{B_i}) \,|\, N\bigr]$ with $N=\sum_{i\in V}X_i$ given, where $\chi_i$ is an arbitrary function. Our method is based on the recursive summation/integration technique using the Markov property in statistics. To extract the Markov property, we define an undirected graph whose cliques are $B_j$, and obtain its chordal extension, from which we present the expressions of the recursive formula. This methodology works for a class of distributions including the Poisson distribution (that is, the conditional distribution is the multinomial). This problem is motivated by the evaluation of the multiplicity-adjusted $p$-value of scan statistics in spatial epidemiology. As an illustration of the approach, we present real data analyses to detect temporal and spatial clustering.
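The remark that independent Poisson variables conditioned on their total follow a multinomial distribution can be verified numerically. The Python sketch below compares the conditional probability computed from Poisson probability mass functions with the multinomial probability whose cell probabilities are proportional to the Poisson means. The function names and the numerical values are illustrative assumptions, and the recursive algorithm of the paper is not implemented here.

```python
import math


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def conditional_pmf(xs, lams) -> float:
    """P(X_1 = x_1, ..., X_k = x_k | sum X_i = sum x_i) for independent Poissons."""
    joint = math.prod(poisson_pmf(x, lam) for x, lam in zip(xs, lams))
    # The total of independent Poissons is Poisson with the summed mean.
    return joint / poisson_pmf(sum(xs), sum(lams))


def multinomial_pmf(xs, probs) -> float:
    """Multinomial probability of the counts xs with cell probabilities probs."""
    coef = math.factorial(sum(xs))
    for x in xs:
        coef //= math.factorial(x)
    return coef * math.prod(p ** x for p, x in zip(probs, xs))


if __name__ == "__main__":
    xs, lams = (2, 0, 3), (1.5, 0.7, 2.8)
    probs = [lam / sum(lams) for lam in lams]
    print(conditional_pmf(xs, lams), multinomial_pmf(xs, probs))
```

The two printed values agree up to floating-point rounding, since the exponential factors cancel in the ratio.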
Submitted 31 October, 2015;
originally announced November 2015.
-
Simultaneous confidence bands for contrasts between several nonlinear regression curves
Authors:
Xiaolei Lu,
Satoshi Kuriki
Abstract:
We propose simultaneous confidence bands of the hyperbolic-type for the contrasts between several nonlinear (curvilinear) regression curves. The critical value of a confidence band is determined from the distribution of the maximum of a chi-square random process defined on the domain of explanatory variables. We use the volume-of-tube method to derive an upper tail probability formula of the maximum of a chi-square random process, which is asymptotically exact and sufficiently accurate in commonly used tail regions. Moreover, we prove that the formula obtained is equivalent to the expectation of the Euler-Poincare characteristic of the excursion set of the chi-square random process, and hence conservative. This result is therefore a generalization of Naiman's inequality for Gaussian random processes. As an illustrative example, growth curves of consomic mice are analyzed.
Submitted 10 January, 2017; v1 submitted 17 October, 2015;
originally announced October 2015.
-
$A$-Hypergeometric Distributions and Newton Polytopes
Authors:
Nobuki Takayama,
Satoshi Kuriki,
Akimichi Takemura
Abstract:
We give a bijection between a quotient space of the parameters and the space of moments for any $A$-hypergeometric distribution. An algorithmic method to compute the inverse image of the map is proposed utilizing the holonomic gradient method and an asymptotic equivalence of the map and the iterative proportional scaling. The algorithm gives a method to solve a conditional maximum likelihood estimation problem in statistics. Our interplay between the theory of hypergeometric functions and statistics gives some new formulas of $A$-hypergeometric polynomials.
Submitted 11 November, 2015; v1 submitted 8 October, 2015;
originally announced October 2015.
-
Exact ZF Analysis and Computer-Algebra-Aided Evaluation in Rank-1 LoS Rician Fading
Authors:
Constantin Siriteanu,
Akimichi Takemura,
Christoph Koutschan,
Satoshi Kuriki,
Donald St. P. Richards,
Hyundong Shin
Abstract:
We study zero-forcing detection (ZF) for multiple-input/multiple-output (MIMO) spatial multiplexing under transmit-correlated Rician fading for an $N_R \times N_T$ channel matrix with rank-1 line-of-sight (LoS) component. By using matrix transformations and multivariate statistics, our exact analysis yields the signal-to-noise ratio moment generating function (m.g.f.) as an infinite series of gamma distribution m.g.f.'s and analogous series for ZF performance measures, e.g., outage probability and ergodic capacity. However, their numerical convergence is inherently problematic with increasing Rician $K$-factor, $N_R$, and $N_T$. We circumvent this limitation as follows. First, we derive differential equations satisfied by the performance measures with a novel automated approach employing a computer-algebra tool which implements Groebner basis computation and creative telescoping. These differential equations are then solved with the holonomic gradient method (HGM) from initial conditions computed with the infinite series. We demonstrate that HGM yields more reliable performance evaluation than infinite series alone, and more expeditious evaluation than simulation, for realistic values of $K$, and even for $N_R$ and $N_T$ relevant to large MIMO systems. We envision extending the proposed approaches for exact analysis and reliable evaluation to more general Rician fading and other transceiver methods.
Submitted 19 May, 2016; v1 submitted 24 July, 2015;
originally announced July 2015.
-
MIMO Zero-Forcing Performance Evaluation Using the Holonomic Gradient Method
Authors:
Constantin Siriteanu,
Akimichi Takemura,
Satoshi Kuriki,
Hyundong Shin,
Christoph Koutschan
Abstract:
For multiple-input multiple-output (MIMO) spatial-multiplexing transmission, zero-forcing detection (ZF) is appealing because of its low complexity. Our recent MIMO ZF performance analysis for Rician-Rayleigh fading, which is relevant in heterogeneous networks, has yielded infinite-series expressions for the ZF outage probability and ergodic capacity. Because they arose from expanding the confluent hypergeometric function ${}_1F_1(\cdot, \cdot, σ)$ around 0, they do not converge numerically at realistically high Rician $K$-factor values. Therefore, herein, we seek to take advantage of the fact that ${}_1F_1(\cdot, \cdot, σ)$ satisfies a differential equation, i.e., it is a holonomic function. Holonomic functions can be computed by the holonomic gradient method (HGM), i.e., by numerically solving the satisfied differential equation. Thus, we first reveal that the moment generating function (m.g.f.) and probability density function (p.d.f.) of the ZF signal-to-noise ratio (SNR) are holonomic. Then, from the differential equation for ${}_1F_1(\cdot, \cdot, σ)$, we deduce those satisfied by the SNR m.g.f. and p.d.f., and demonstrate that the HGM helps compute the p.d.f. accurately at practically relevant values of $K$. Finally, numerical integration of the SNR p.d.f. produced by HGM yields accurate ZF outage probability and ergodic capacity results.
Submitted 15 April, 2015; v1 submitted 15 March, 2014;
originally announced March 2014.
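The idea described in the abstract above, evaluating ${}_1F_1$ by numerically solving the differential equation it satisfies instead of summing its series at large arguments, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it integrates Kummer's equation $x y'' + (b - x) y' - a y = 0$ with a classical fourth-order Runge-Kutta scheme, starting from initial values computed by the power series at a small argument. The function names, the starting point `x0`, and the step count are hypothetical choices.

```python
import math

def hyp1f1_series(a, b, x, terms=60):
    """Truncated power series of 1F1(a; b; x); intended only for small x."""
    term, total = 1.0, 1.0
    for k in range(terms):
        term *= (a + k) / (b + k) * x / (k + 1)
        total += term
    return total

def hyp1f1_ode(a, b, x_end, x0=0.5, steps=4000):
    """Evaluate 1F1(a; b; x_end) by integrating Kummer's equation from x0.

    Initial values come from the series at x0 (assumed small), using
    d/dx 1F1(a; b; x) = (a / b) * 1F1(a + 1; b + 1; x).
    """
    y = hyp1f1_series(a, b, x0)
    dy = (a / b) * hyp1f1_series(a + 1, b + 1, x0)
    h = (x_end - x0) / steps

    def rhs(x, y, dy):
        # First-order system for (y, y'): y'' = (a*y - (b - x)*y') / x
        return dy, (a * y - (b - x) * dy) / x

    x = x0
    for _ in range(steps):  # classical RK4 steps
        k1y, k1d = rhs(x, y, dy)
        k2y, k2d = rhs(x + h / 2, y + h / 2 * k1y, dy + h / 2 * k1d)
        k3y, k3d = rhs(x + h / 2, y + h / 2 * k2y, dy + h / 2 * k2d)
        k4y, k4d = rhs(x + h, y + h * k3y, dy + h * k3d)
        y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        dy += h / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
        x += h
    return y

# Sanity check against a closed form: 1F1(a; a; x) = exp(x)
approx = hyp1f1_ode(1.5, 1.5, 20.0)
exact = math.exp(20.0)
```

The closed forms ${}_1F_1(a; a; x) = e^x$ and ${}_1F_1(1; 2; x) = (e^x - 1)/x$ give a convenient way to check the integrator before applying it where no closed form is available.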
-
Schur Complement Based Analysis of MIMO Zero-Forcing for Rician Fading
Authors:
Constantin Siriteanu,
Akimichi Takemura,
Satoshi Kuriki,
Donald St. P. Richards,
Hyundong Shin
Abstract:
For multiple-input/multiple-output (MIMO) spatial multiplexing with zero-forcing detection (ZF), signal-to-noise ratio (SNR) analysis for Rician fading involves the cumbersome noncentral-Wishart distribution (NCWD) of the transmit sample-correlation (Gramian) matrix. An approximation with a virtual CWD previously yielded for the ZF SNR an approximate (virtual) Gamma distribution. However, analytical conditions qualifying the accuracy of the SNR-distribution approximation were unknown. Therefore, we have been attempting to exactly characterize ZF SNR for Rician fading. Our previous attempts succeeded only for the sole Rician-fading stream under Rician-Rayleigh fading, by writing it as a scalar Schur complement (SC) in the Gramian. Herein, we pursue a more general, matrix-SC-based analysis to characterize SNRs when several streams may undergo Rician fading. On one hand, for full-Rician fading, the SC distribution is found to be exactly a CWD if and only if a channel-mean-correlation condition holds. Interestingly, this CWD then coincides with the virtual CWD ensuing from the approximation. Thus, under the condition, the actual and virtual SNR-distributions coincide. On the other hand, for Rician-Rayleigh fading, the matrix-SC distribution is characterized in terms of the determinant of a matrix with elementary-function entries, which also yields a new characterization of the ZF SNR. Average error probability results validate our analysis against simulation.
Submitted 26 September, 2014; v1 submitted 2 January, 2014;
originally announced January 2014.
-
Exact MIMO Zero-Forcing Detection Analysis for Transmit-Correlated Rician Fading
Authors:
Constantin Siriteanu,
Steven Blostein,
Akimichi Takemura,
Hyundong Shin,
Shahram Yousefi,
Satoshi Kuriki
Abstract:
We analyze the performance of multiple input/multiple output (MIMO) communications systems employing spatial multiplexing and zero-forcing detection (ZF). The distribution of the ZF signal-to-noise ratio (SNR) is characterized when either the intended stream or interfering streams experience Rician fading, and when the fading may be correlated on the transmit side. Previously, exact ZF analysis based on a well-known SNR expression has been hindered by the noncentrality of the Wishart distribution involved. In addition, approximation with a central-Wishart distribution has not proved consistently accurate. In contrast, the following exact ZF study proceeds from a lesser-known SNR expression that separates the intended and interfering channel-gain vectors. By first conditioning on, and then averaging over the interference, the ZF SNR distribution for Rician-Rayleigh fading is shown to be an infinite linear combination of gamma distributions. On the other hand, for Rayleigh-Rician fading, the ZF SNR is shown to be gamma-distributed. Based on the SNR distribution, we derive new series expressions for the ZF average error probability, outage probability, and ergodic capacity. Numerical results confirm the accuracy of our new expressions, and reveal effects of interference and channel statistics on performance.
Submitted 2 January, 2014; v1 submitted 10 July, 2013;
originally announced July 2013.
-
EM algorithms for estimating the Bernstein copula
Authors:
Xiaoling Dou,
Satoshi Kuriki,
Gwo Dong Lin,
Donald Richards
Abstract:
A method that uses order statistics to construct multivariate distributions with fixed marginals and which utilizes a representation of the Bernstein copula in terms of a finite mixture distribution is proposed. Expectation-maximization (EM) algorithms to estimate the Bernstein copula are proposed, and a local convergence property is proved. Moreover, asymptotic properties of the proposed semiparametric estimators are provided. Illustrative examples are presented using three real data sets and a 3-dimensional simulated data set. These studies show that the Bernstein copula is able to represent various distributions flexibly and that the proposed EM algorithms work well for such data.
Submitted 15 January, 2014; v1 submitted 12 January, 2013;
originally announced January 2013.
-
Abstract tubes associated with perturbed polyhedra with applications to multidimensional normal probability computations
Authors:
Satoshi Kuriki,
Tetsuhisa Miwa,
Anthony J. Hayter
Abstract:
Let $K$ be a closed convex polyhedron defined by a finite number of linear inequalities. In this paper we refine the theory of abstract tubes (Naiman and Wynn, 1997) associated with $K$ when $K$ is perturbed. In particular, we focus on the perturbation that is lexicographic and in an outer direction. An algorithm for constructing the abstract tube by means of linear programming and its implementation are discussed. Using the abstract tube for perturbed $K$ combined with the recursive integration technique proposed by Miwa, Hayter and Kuriki (2003), we show that the multidimensional normal probability for a polyhedral region $K$ can be computed efficiently. In addition, abstract tubes and the distribution functions of studentized range statistics are exhibited as numerical examples.
Submitted 12 October, 2011;
originally announced October 2011.
-
Likelihood ratio tests for positivity in polynomial regressions
Authors:
Naohiro Kato,
Satoshi Kuriki
Abstract:
A polynomial that is nonnegative over a given interval is called a positive polynomial. The set of such positive polynomials forms a closed convex cone $K$. In this paper, we consider the likelihood ratio test for the hypothesis of positivity that the estimand polynomial regression curve is a positive polynomial. By considering hierarchical hypotheses including the hypothesis of positivity, we define nested likelihood ratio tests, and derive their null distributions as mixtures of chi-square distributions by using the volume-of-tubes method. The mixing probabilities are obtained by utilizing the parameterizations for the cone $K$ and its dual provided in the framework of Tchebycheff systems for polynomials of degree at most 4. For polynomials of degree greater than 4, the upper and lower bounds for the null distributions are provided. Moreover, we propose associated simultaneous confidence bounds for polynomial regression curves. Regarding computation, we demonstrate that symmetric cone programming is useful to obtain the test statistics. As an illustrative example, we conduct data analysis on growth curves of two groups. We examine the hypothesis that the growth rate (the derivative of growth curve) of one group is always higher than the other.
Submitted 14 November, 2012; v1 submitted 4 August, 2011;
originally announced August 2011.
-
Approximate tail probabilities of the maximum of a chi-square field on multi-dimensional lattice points and their applications to detection of loci interactions
Authors:
Satoshi Kuriki,
Yoshiaki Harushima,
Hironori Fujisawa,
Nori Kurata
Abstract:
Define a chi-square random field on a multi-dimensional lattice points index set with a direct-product covariance structure, and consider the distribution of the maximum of this random field. We provide two approximate formulas for the upper tail probability of the distribution based on nonlinear renewal theory and an integral-geometric approach called the volume-of-tube method. This study is motivated by the problem of detecting interactive loci pairs, which play an important role in forming biological species. The joint distribution of scan statistics for detecting the pairs is regarded as the chi-square random field above, and hence the multiplicity-adjusted $p$-value can be calculated by using the proposed approximate formulas. By using these formulas, we examine the data of Mizuta et al. (2010), who reported a new interactive loci pair of rice inter-subspecies.
Submitted 30 March, 2013; v1 submitted 22 December, 2010;
originally announced December 2010.
-
Distributions of the largest singular values of skew-symmetric random matrices and their applications to paired comparisons
Authors:
Satoshi Kuriki
Abstract:
Let $A$ be a real skew-symmetric Gaussian random matrix whose upper triangular elements are independently distributed according to the standard normal distribution. We provide the distribution of the largest singular value $σ_1$ of $A$. Moreover, by acknowledging the fact that the largest singular value can be regarded as the maximum of a Gaussian field, we deduce the distribution of the standardized largest singular value $σ_1/\sqrt{\mathrm{tr}(A'A)/2}$. These distributional results are utilized in Scheffé's paired comparisons model. We propose tests for the hypothesis of subtractivity based on the largest singular value of the skew-symmetric residual matrix. Professional baseball league data are analyzed as an illustrative example.
Submitted 13 March, 2010;
originally announced March 2010.
-
Graph presentations for moments of noncentral Wishart distributions and their applications
Authors:
Satoshi Kuriki,
Yasuhide Numata
Abstract:
We provide formulas for the moments of the real and complex noncentral Wishart distributions of general degrees. The obtained formulas for the real and complex cases are described in terms of the undirected and directed graphs, respectively. By considering degenerate cases, we give explicit formulas for the moments of bivariate chi-square distributions and $2\times 2$ Wishart distributions by enumerating the graphs. Noting that the Laguerre polynomials can formally be considered to be moments of a noncentral chi-square distribution, we demonstrate a combinatorial interpretation of the coefficients of the Laguerre polynomials.
Submitted 22 January, 2010; v1 submitted 3 December, 2009;
originally announced December 2009.
-
The tube method for the moment index in projection pursuit
Authors:
Satoshi Kuriki,
Akimichi Takemura
Abstract:
The projection pursuit index defined by a sum of squares of the third and the fourth sample cumulants is known as the moment index proposed by Jones and Sibson. The limiting distribution of the maximum of the moment index under the null hypothesis that the population is multivariate normal is shown to be the maximum of a Gaussian random field with a finite Karhunen-Loève expansion. An approximate formula for the tail probability of the maximum, which corresponds to the $p$-value, is given by virtue of the tube method through determining Weyl's invariants of all degrees and the critical radius of the index manifold of the Gaussian random field.
Submitted 25 November, 2007;
originally announced November 2007.
-
Skewness and kurtosis as locally best invariant tests of normality
Authors:
Akimichi Takemura,
Muneya Matsui,
Satoshi Kuriki
Abstract:
Consider testing normality against a one-parameter family of univariate distributions containing the normal distribution as the boundary, e.g., the family of $t$-distributions or an infinitely divisible family with finite variance. We prove that under mild regularity conditions, the sample skewness is the locally best invariant (LBI) test of normality against a wide class of asymmetric families and the kurtosis is the LBI test against symmetric families. We also discuss non-regular cases such as testing normality against the stable family and some related results in the multivariate cases.
Submitted 20 August, 2006;
originally announced August 2006.
-
Star-shaped distributions and their generalizations
Authors:
Hidehiko Kamiya,
Akimichi Takemura,
Satoshi Kuriki
Abstract:
Elliptically contoured distributions can be considered to be the distributions for which the contours of the density functions are proportional ellipsoids. We generalize elliptically contoured densities to "star-shaped distributions" with concentric star-shaped contours and show that many results in the former case continue to hold in the more general case. We develop a general theory in the framework of abstract group invariance so that the results can be applied to other cases as well, especially those involving random matrices.
Submitted 22 May, 2006;
originally announced May 2006.