A Geometrical Analysis of Kernel Ridge Regression and its Applications

Authors: Georgios Gavrilopoulos, Guillaume Lecué, Zong Shang

Abstract: We obtain upper bounds for the estimation error of Kernel Ridge Regression (KRR) for all non-negative regularization parameters, offering a geometric perspective on various phenomena in KRR. As applications: 1. We address the multiple descent problem, unifying the proofs of arxiv:1908.10292 and arxiv:1904.12191 for polynomial kernels, and we establish multiple descent for the upper bound of the estimation error of KRR under sub-Gaussian design and in non-asymptotic regimes. 2. For a sub-Gaussian design vector and in the non-asymptotic scenario, we prove the Gaussian Equivalent Conjecture. 3. We offer a novel perspective on the linearization of kernel matrices of non-linear kernels, extending it to the power regime for polynomial kernels. 4. Our theory is applicable to data-dependent kernels, providing a convenient and accurate tool for the feature learning regime in deep learning theory. 5. Our theory extends the results in arxiv:2009.14286 under a weak moment assumption. Our proof is based on three mathematical tools developed in this paper that can be of independent interest: 1. Dvoretzky-Milman theorem for ellipsoids under (very) weak moment assumptions. 2. Restricted Isomorphic Property in Reproducing Kernel Hilbert Spaces with embedding index conditions. 3. A concentration inequality for finite-degree polynomial kernel functions.

Submitted 11 April, 2024; originally announced April 2024.

arXiv:2402.14517 [pdf, ps, other]

The elliptical invariant tori of nearly integrable Hamiltonian system through symplectic algorithms

Authors: Zaijiu Shang, Yang Xu

Abstract: In this paper we apply symplectic algorithms to nearly integrable Hamiltonian systems and prove that they preserve many elliptic lower-dimensional invariant tori. We consider the elliptic lower-dimensional invariant tori of symplectic mappings with a small twist under Rüssmann's non-degeneracy condition, and focus on their measure estimate. We then apply this result to nearly integrable Hamiltonian systems to obtain many elliptic lower-dimensional invariant tori.

Submitted 22 February, 2024; originally announced February 2024.

arXiv:2402.14478 [pdf, ps, other]

A KAM theorem of symplectic algorithms for nearly integrable Hamiltonian systems

Authors: Zaijiu Shang, Yang Xu

Abstract: In this paper we prove a KAM-like theorem of symplectic algorithms for nearly integrable Hamiltonian systems which generalises the result of \cite{r1} and \cite{r6} for the case of integrable systems.

Submitted 22 February, 2024; originally announced February 2024.

arXiv:2310.00881 [pdf, other]

Scalable Statistical Inference in Non-parametric Least Squares

Authors: Meimei Liu, Zuofeng Shang, Yun Yang

Abstract: Stochastic approximation (SA) is a powerful and scalable computational method for iteratively estimating the solution of optimization problems in the presence of randomness, particularly well-suited for large-scale and streaming data settings. In this work, we propose a theoretical framework for SA applied to non-parametric least squares in reproducing kernel Hilbert spaces (RKHS), enabling online statistical inference in non-parametric regression models. We achieve this by constructing asymptotically valid pointwise (and simultaneous) confidence intervals (bands) for local (and global) inference of the nonlinear regression function, by applying an online multiplier bootstrap approach to the functional stochastic gradient descent (SGD) algorithm in the RKHS. Our main theoretical contributions consist of a unified framework for characterizing the non-asymptotic behavior of the functional SGD estimator and demonstrating the consistency of the multiplier bootstrap method. The proof techniques involve the development of a higher-order expansion of the functional SGD estimator under the supremum norm metric and the Gaussian approximation of suprema of weighted and non-identically distributed empirical processes. Our theory specifically reveals an interesting relationship between the tuning of step sizes in SGD for estimation and the accuracy of uncertainty quantification.

Submitted 1 October, 2023; originally announced October 2023.

arXiv:2303.02470 [pdf, ps, other]

doi 10.1002/sta4.482

Minimax optimal high-dimensional classification using deep neural networks

Authors: Shuoyang Wang, Zuofeng Shang

Abstract: High-dimensional classification is a fundamentally important research problem in high-dimensional data analysis. In this paper, we derive a nonasymptotic rate for the minimax excess misclassification risk when the feature dimension diverges exponentially with the sample size and the Bayes classifier possesses a complicated modular structure. We also show that classifiers based on deep neural networks can attain the above rate and hence are minimax optimal.

Submitted 4 March, 2023; originally announced March 2023.

MSC Class: 62H30; 62G99

Journal ref: Stat. 11(2022)

arXiv:2212.06505 [pdf, other]

Multiscale topology classifies and quantifies cell types in subcellular spatial transcriptomics

Authors: Katherine Benjamin, Aneesha Bhandari, Zhouchun Shang, Yanan Xing, Yanru An, Nannan Zhang, Yong Hou, Ulrike Tillmann, Katherine R. Bull, Heather A. Harrington

Abstract: Spatial transcriptomics has the potential to transform our understanding of RNA expression in tissues. Classical array-based technologies produce multiple-cell-scale measurements requiring deconvolution to recover single cell information. However, rapid advances in subcellular measurement of RNA expression at whole-transcriptome depth necessitate a fundamentally different approach. To integrate single-cell RNA-seq data with nanoscale spatial transcriptomics, we present a topological method for automatic cell type identification (TopACT). Unlike popular decomposition approaches to multicellular resolution data, TopACT is able to pinpoint the spatial locations of individual sparsely dispersed cells without prior knowledge of cell boundaries. Pairing TopACT with multiparameter persistent homology landscapes predicts immune cells forming a peripheral ring structure within kidney glomeruli in a murine model of lupus nephritis, which we experimentally validate with immunofluorescent imaging. The proposed topological data analysis unifies multiple biological scales, from subcellular gene expression to multicellular tissue organization.

Submitted 13 December, 2022; originally announced December 2022.

Comments: Main text: 8 pages, 4 figures. Supplement: 12 pages, 5 figures

MSC Class: 92-08; 55N31; 62R40; 68T09

arXiv:2207.01602 [pdf, other]

Minimax Optimal Deep Neural Network Classifiers Under Smooth Decision Boundary

Authors: Tianyang Hu, Ruiqi Liu, Zuofeng Shang, Guang Cheng

Abstract: Deep learning has achieved huge empirical success in large-scale classification problems. In contrast, there is a lack of statistical understanding of deep learning methods, particularly from the minimax optimality perspective. For instance, in the classical smooth decision boundary setting, existing deep neural network (DNN) approaches are rate-suboptimal, and it remains unclear how to construct minimax optimal DNN classifiers. Moreover, it is interesting to explore whether DNN classifiers can circumvent the curse of dimensionality in handling high-dimensional data. The contributions of this paper are two-fold. First, based on a localized margin framework, we discover the source of suboptimality of existing DNN approaches. Motivated by this, we propose a new deep learning classifier using a divide-and-conquer technique: DNN classifiers are constructed on each local region and then aggregated to a global one. We further propose a localized version of the classical Tsybakov noise condition, under which statistical optimality of our new classifier is established. Second, we show that DNN classifiers can adapt to low-dimensional data structures and circumvent the curse of dimensionality in the sense that the minimax rate only depends on the effective dimension, potentially much smaller than the actual data dimension. Numerical experiments are conducted on simulated data to corroborate our theoretical results.

Submitted 4 July, 2022; originally announced July 2022.

arXiv:2204.09097 [pdf, other]

Information-theoretic Limits for Testing Community Structures in Weighted Networks

Authors: Mingao Yuan, Zuofeng Shang

Abstract: Community detection refers to the problem of clustering the nodes of a network into groups. Existing inferential methods for community structure mainly focus on unweighted (binary) networks. Many real-world networks are nonetheless weighted, and a common practice is to dichotomize a weighted network into an unweighted one, which is known to result in information loss. Literature on hypothesis testing in the latter situation is still missing. In this paper, we study the problem of testing the existence of community structure in weighted networks. Our contributions are threefold: (a) We use the (possibly infinite-dimensional) exponential family to model the weights and derive the sharp information-theoretic limit for the existence of a consistent test. Within the limit, any test is inconsistent; and beyond the limit, we propose a useful consistent test. (b) Based on the information-theoretic limits, we provide the first formal way to quantify the loss of information incurred by dichotomizing weighted graphs into unweighted graphs in the context of hypothesis testing. (c) We propose several new and practically useful test statistics. Simulation studies show that the proposed tests have good performance. Finally, we apply the proposed tests to an animal social network.

Submitted 19 April, 2022; originally announced April 2022.

arXiv:2203.05873 [pdf, ps, other]

A geometrical viewpoint on the benign overfitting property of the minimum $l_2$-norm interpolant estimator and its universality

Authors: Guillaume Lecué, Zong Shang

Abstract: In the linear regression model, the minimum l2-norm interpolant estimator has received much attention since it was proved to be consistent even though it fits noisy data perfectly under some condition on the covariance matrix $Σ$ of the input vector, known as benign overfitting. Motivated by this phenomenon, we study the generalization property of this estimator from a geometrical viewpoint. Our main results extend and improve the convergence rates as well as the deviation probability from [Tsigler and Bartlett]. Our proof differs from the classical bias/variance analysis and is based on the self-induced regularization property introduced in [Bartlett, Montanari and Rakhlin]: the minimum l2-norm interpolant estimator can be written as a sum of a ridge estimator and an overfitting component. The two geometrical properties of random Gaussian matrices at the heart of our analysis are the Dvoretzky-Milman theorem and the isomorphic and restricted isomorphic properties. In particular, the Dvoretzky dimension, which appears naturally in our geometrical viewpoint, coincides with the effective rank and is the key tool for handling the behavior of the design matrix restricted to the subspace where overfitting happens. We extend these results to heavy-tailed scenarios, proving the universality of this phenomenon beyond exponential moment assumptions. This phenomenon was previously unknown and is widely believed to be a significant challenge. It follows from an anisotropic version of the probabilistic Dvoretzky-Milman theorem that holds for heavy-tailed vectors, which is of independent interest.

Submitted 21 September, 2023; v1 submitted 11 March, 2022; originally announced March 2022.

arXiv:2202.11747 [pdf, other]

Statistical Inference for Functional Linear Quantile Regression

Authors: Peijun Sang, Zuofeng Shang, Pang Du

Abstract: We propose inferential tools for functional linear quantile regression where the conditional quantile of a scalar response is assumed to be a linear functional of a functional covariate. In contrast to conventional approaches, we employ kernel convolution to smooth the original loss function. The coefficient function is estimated under a reproducing kernel Hilbert space framework. A gradient descent algorithm is designed to minimize the smoothed loss function with a roughness penalty. With the aid of the Banach fixed-point theorem, we show the existence and uniqueness of our proposed estimator as the minimizer of the regularized loss function in an appropriate Hilbert space. Furthermore, we establish the convergence rate as well as the weak convergence of our estimator. As far as we know, this is the first weak convergence result for a functional quantile regression model. Pointwise confidence intervals and a simultaneous confidence band for the true coefficient function are then developed based on these theoretical properties. Numerical studies including both simulations and a data application are conducted to investigate the performance of our estimator and inference tools in finite samples.

Submitted 23 February, 2022; originally announced February 2022.

arXiv:2202.05888 [pdf, ps, other]

Statistical Limits for Testing Correlation of Hypergraphs

Authors: Mingao Yuan, Zuofeng Shang

Abstract: In this paper, we consider the hypothesis testing of correlation between two $m$-uniform hypergraphs on $n$ unlabelled nodes. Under the null hypothesis, the hypergraphs are independent, while under the alternative hypothesis, the hyperedges have the same marginal distributions as in the null hypothesis but are correlated after some unknown node permutation. We focus on two scenarios: the hypergraphs are generated from the Gaussian-Wigner model and the dense Erdős-Rényi model. We derive the sharp information-theoretic testing threshold. Above the threshold, there exists a powerful test to distinguish the alternative hypothesis from the null hypothesis. Below the threshold, the alternative hypothesis and the null hypothesis are not distinguishable. The threshold involves $m$ and decreases as $m$ gets larger. This indicates that testing correlation of hypergraphs ($m\geq3$) is easier than testing correlation of graphs ($m=2$).

Submitted 11 February, 2022; originally announced February 2022.

Comments: 20 pages

arXiv:2105.02259 [pdf, other]

Information Limits for Detecting a Subhypergraph

Authors: Mingao Yuan, Zuofeng Shang

Abstract: We consider the problem of recovering a subhypergraph based on an observed adjacency tensor corresponding to a uniform hypergraph. The uniform hypergraph is assumed to contain a subset of vertices called a subhypergraph. The edges restricted to the subhypergraph are assumed to follow a different probability distribution than other edges. We consider both weak recovery and exact recovery of the subhypergraph, and establish information-theoretic limits in each case. Specifically, we establish sharp conditions for the possibility of weakly or exactly recovering the subhypergraph from an information-theoretic point of view. These conditions are fundamentally different from their counterparts derived in the hypothesis testing literature.

Submitted 5 May, 2021; originally announced May 2021.

arXiv:2104.04047 [pdf, ps, other]

Heterogeneous Dense Subhypergraph Detection

Authors: Mingao Yuan, Zuofeng Shang

Abstract: We study the problem of testing the existence of a heterogeneous dense subhypergraph. The null hypothesis corresponds to a heterogeneous Erdős-Rényi uniform random hypergraph and the alternative hypothesis corresponds to a heterogeneous uniform random hypergraph that contains a dense subhypergraph. We establish detection boundaries when the edge probabilities are known and construct an asymptotically powerful test for distinguishing the hypotheses. We also construct an adaptive test which does not involve edge probabilities and hence is more practically useful.

Submitted 8 April, 2021; originally announced April 2021.

arXiv:2101.04584 [pdf, other]

Sharp detection boundaries on testing dense subhypergraph

Authors: Mingao Yuan, Zuofeng Shang

Abstract: We study the problem of testing the existence of a dense subhypergraph. The null hypothesis is an Erdős-Rényi uniform random hypergraph and the alternative hypothesis is a uniform random hypergraph that contains a dense subhypergraph. We establish sharp detection boundaries in both scenarios: (1) the edge probabilities are known; (2) the edge probabilities are unknown. In both scenarios, sharp detectable boundaries are characterized by the appropriate model parameters. Asymptotically powerful tests are provided when the model parameters fall in the detectable regions. Our results indicate that the detectable regions for general hypergraph models are dramatically different from their graph counterparts.

Submitted 12 January, 2021; originally announced January 2021.

arXiv:2101.00914 [pdf, ps, other]

Benign overfitting without concentration

Authors: Zong Shang

Abstract: We obtain a sufficient condition for benign overfitting in the linear regression problem. Our result does not rely on a concentration argument but on a small-ball assumption and thus can hold in the heavy-tailed case. The basic idea is to establish a coordinate small-ball estimate in terms of effective rank so that we can calibrate the balance between the epsilon-net and the exponential probability. Our result indicates that benign overfitting does not depend on the concentration property of the input vector. Finally, we discuss potential difficulties for benign overfitting beyond the linear model and a benign overfitting result without truncated effective rank.

Submitted 4 January, 2021; originally announced January 2021.

arXiv:2011.04147 [pdf, ps, other]

A Computationally Efficient Classification Algorithm in Posterior Drift Model: Phase Transition and Minimax Adaptivity

Authors: Ruiqi Liu, Kexuan Li, Zuofeng Shang

Abstract: In massive data analysis, training and testing data often come from very different sources, and their probability distributions are not necessarily identical. A typical example is nonparametric classification in the posterior drift model, where the conditional distributions of the label given the covariates are possibly different. In this paper, we derive the minimax rate of the excess risk for nonparametric classification in the posterior drift model in the setting where both training and testing data have smooth distributions, extending a recent work by Cai and Wei (2019), who only impose a smoothness condition on the distribution of testing data. The minimax rate demonstrates a phase transition characterized by the mutual relationship between the smoothness orders of the training and testing data distributions. We also propose a computationally efficient and data-driven nearest neighbor classifier which achieves the minimax excess risk (up to a logarithmic factor). Simulation studies and a real-world application are conducted to demonstrate our approach.

Submitted 8 November, 2020; originally announced November 2020.

arXiv:2009.12761 [pdf, ps, other]

Construction of Hecke Characters for Three-dimensional CM Abelian Varieties

Authors: Zhengyuan Shang

Abstract: It is well known for an elliptic curve with complex multiplication that the existence of a $\mathbb{Q}$-rational model is equivalent to its field of moduli being equal to $\mathbb{Q}$, or its endomorphism ring being the ring of integers of 9 possible fields ($\ast$). Murabayashi and Umegaki proved analogous results for abelian surfaces. For three-dimensional CM abelian varieties with rational fields of moduli, Chun narrowed down to a list of 37 possible CM fields. In this paper, we show that his list is exact. By constructing certain Hecke characters that satisfy a theorem of Shimura, we prove that precisely 28 isogeny classes of these abelian varieties have $\mathbb{Q}$-models. Therefore the complete analogy to $(\ast)$ fails here.

Submitted 27 September, 2020; originally announced September 2020.

Comments: 11 pages, 4 tables

arXiv:2004.14954 [pdf, ps, other]

On Deep Instrumental Variables Estimate

Authors: Ruiqi Liu, Zuofeng Shang, Guang Cheng

Abstract: The endogeneity issue is fundamentally important as many empirical applications may suffer from the omission of explanatory variables, measurement error, or simultaneous causality. Recently, \cite{hllt17} proposed a "Deep Instrumental Variable (IV)" framework based on deep neural networks to address endogeneity, demonstrating superior performance to existing approaches. The aim of this paper is to theoretically understand the empirical success of the Deep IV. Specifically, we consider a two-stage estimator using deep neural networks in the linear instrumental variables model. By imposing a latent structural assumption on the reduced form equation between endogenous variables and instrumental variables, the first-stage estimator can automatically capture this latent structure and converge to the optimal instruments at the minimax optimal rate, which is free of the dimension of instrumental variables and thus mitigates the curse of dimensionality. Additionally, in comparison with classical methods, due to the faster convergence rate of the first-stage estimator, the second-stage estimator has a smaller (second order) estimation error and requires a weaker condition on the smoothness of the optimal instruments. Given that the depth and width of the employed deep neural network are well chosen, we further show that the second-stage estimator achieves the semiparametric efficiency bound. Simulation studies on synthetic data and an application to automobile market data confirm our theory.

Submitted 30 April, 2020; originally announced April 2020.

arXiv:1911.02171 [pdf, other]

Minimax Nonparametric Two-sample Test under Smoothing

Authors: Xin Xing, Zuofeng Shang, Pang Du, Ping Ma, Wenxuan Zhong, Jun S. Liu

Abstract: We consider the problem of comparing probability densities between two groups. A new probabilistic tensor product smoothing spline framework is developed to model the joint density of two variables. Under such a framework, the probability density comparison is equivalent to testing the presence/absence of interactions. We propose a penalized likelihood ratio test for such interaction testing and show that the test statistic is asymptotically chi-square distributed under the null hypothesis. Furthermore, we derive a sharp minimax testing rate based on the Bernstein width for nonparametric two-sample tests and show that our proposed test statistic is minimax optimal. In addition, a data-adaptive tuning criterion is developed to choose the penalty parameter. Simulations and real applications demonstrate that the proposed test outperforms the conventional approaches under various scenarios.

Submitted 11 January, 2021; v1 submitted 5 November, 2019; originally announced November 2019.

arXiv:1909.02252 [pdf, other]

Knots with Prism Manifold Surgeries

Authors: Zhengyuan Shang

Abstract: Ballinger et al. have determined the list of all prism manifolds that are possibly realizable by Dehn surgeries on knots in $S^3$. In this paper, we explicitly find braid words of primitive/Seifert-fibered knots on which surface slope surgeries yield all the prism manifolds listed above. This completes the solution to the prism manifold realization problem.

Submitted 5 September, 2019; originally announced September 2019.

Comments: 14 pages, 3 figures

arXiv:1901.08571 [pdf, other]

Nonparametric Inference under B-bits Quantization

Authors: Kexuan Li, Ruiqi Liu, Ganggang Xu, Zuofeng Shang

Abstract: Statistical inference based on lossy or incomplete samples is often needed in research areas such as signal/image processing, medical image storage, remote sensing, and signal transmission. In this paper, we propose a nonparametric testing procedure based on samples quantized to $B$ bits through a computationally efficient algorithm. Under mild technical conditions, we establish the asymptotic properties of the proposed test statistic and investigate how the testing power changes as $B$ increases. In particular, we show that if $B$ exceeds a certain threshold, the proposed nonparametric testing procedure achieves the classical minimax rate of testing (Shang and Cheng, 2015) for spline models. We further extend our theoretical investigations to a nonparametric linearity test and an adaptive nonparametric test, expanding the applicability of the proposed methods. Extensive simulation studies together with a real-data analysis are used to demonstrate the validity and effectiveness of the proposed tests.

Submitted 11 August, 2023; v1 submitted 24 January, 2019; originally announced January 2019.

arXiv:1810.04617 [pdf, other]

Testing Community Structures for Hypergraphs

Authors: Mingao Yuan, Ruiqi Liu, Yang Feng, Zuofeng Shang

Abstract: Many complex networks in the real world can be formulated as hypergraphs where community detection has been widely used. However, the fundamental question of whether communities exist or not in an observed hypergraph still remains unresolved. The aim of the present paper is to tackle this important problem. Specifically, we study when a hypergraph with community structure can be successfully distinguished from its Erdős-Rényi counterpart, and propose concrete test statistics based on hypergraph cycles when the models are distinguishable. Our contributions are summarized as follows. For uniform hypergraphs, we show that successful testing is always impossible when average degree tends to zero, might be possible when average degree is bounded, and is possible when average degree is growing. We obtain asymptotic distributions of the proposed test statistics and analyze their power. Our results for growing degree case are further extended to nonuniform hypergraphs in which a new test involving both edge and hyperedge information is proposed. The novel aspect of our new test is that it is provably more powerful than the classic test involving only edge information. Simulation and real data analysis support our theoretical findings. The proofs rely on Janson's contiguity theory (\cite{J95}) and a high-moments driven asymptotic normality result by Gao and Wormald (\cite{GWALD}).

Submitted 4 June, 2021; v1 submitted 10 October, 2018; originally announced October 2018.

Comments: A revised version

arXiv:1808.05548 [pdf, ps, other]

Symmetric-adjoint and symplectic-adjoint methods and their applications

Authors: Geng Sun, Siqing Gan, Hongyu Liu, Zaijiu Shang

Abstract: Symmetric method and symplectic method are classical notions in the theory of Runge-Kutta methods. They can generate numerical flows that respectively preserve the symmetry and symplecticity of the continuous flows in the phase space. Adjoint method is an important way of constructing a new Runge-Kutta method via the symmetrisation of another Runge-Kutta method. In this paper, we introduce a new notion, called symplectic-adjoint Runge-Kutta method. We prove some interesting properties of the symmetric-adjoint and symplectic-adjoint methods. These properties reveal some intrinsic connections among several classical classes of Runge-Kutta methods. In particular, the newly introduced notion and the corresponding properties enable us to develop a novel and practical approach to constructing high-order explicit Runge-Kutta methods, which is a challenging and long-overlooked topic in the theory of Runge-Kutta methods.

Submitted 16 August, 2018; originally announced August 2018.

Comments: 20 pages, comments are welcome

arXiv:1807.04426 [pdf, ps, other]

A likelihood-ratio type test for stochastic block models with bounded degrees

Authors: Mingao Yuan, Yang Feng, Zuofeng Shang

Abstract: A fundamental problem in network data analysis is to test an Erdős-Rényi model $\mathcal{G}\left(n,\frac{a+b}{2n}\right)$ versus a bisection stochastic block model $\mathcal{G}\left(n,\frac{a}{n},\frac{b}{n}\right)$, where $a,b>0$ are constants that represent the expected degrees of the graphs and $n$ denotes the number of nodes. This problem serves as the foundation of many other problems such as testing-based methods for determining the number of communities (\cite{BS16,L16}) and community detection (\cite{MS16}). Existing work has focused on the growing-degree regime $a,b\to\infty$ (\cite{BS16,L16,MS16,BM17,B18,GL17a,GL17b}) while leaving the bounded-degree regime untreated. In this paper, we propose a likelihood-ratio (LR) type procedure based on regularization to test stochastic block models with bounded degrees. We derive the limit distributions as power Poisson laws under both null and alternative hypotheses, based on which the limit power of the test is carefully analyzed. We also examine a Monte-Carlo method that partly resolves the computational cost issue. The proposed procedures are examined by both simulated and real-world data. The proof depends on a contiguity theory developed by Janson \cite{J95}.

Submitted 22 November, 2018; v1 submitted 12 July, 2018; originally announced July 2018.

Comments: In this new submission, we add a comment in introduction stating that the classic test based on counting the $k_n$-cycles with $k_n=\log^{1/4}{n}$ is unrealistic in practice, which is also the motivation of our regularized LR test

arXiv:1805.09948 [pdf, other]

How Many Machines Can We Use in Parallel Computing for Kernel Ridge Regression?

Authors: Meimei Liu, Zuofeng Shang, Guang Cheng

Abstract: This paper aims to solve a basic problem in distributed statistical inference: how many machines can we use in parallel computing? In kernel ridge regression, we address this question in two important settings: nonparametric estimation and hypothesis testing. Specifically, we find a range for the number of machines under which optimal estimation/testing is achievable. The employed empirical processes method provides a unified framework that allows us to handle various regression problems (such as thin-plate splines and nonparametric additive regression) under different settings (such as univariate, multivariate and diverging-dimensional designs). It is worth noting that the upper bounds of the number of machines are proven to be un-improvable (up to a logarithmic factor) in two important cases: smoothing spline regression and Gaussian RKHS regression. Our theoretical findings are backed by thorough numerical studies.

Submitted 23 February, 2019; v1 submitted 24 May, 2018; originally announced May 2018.

Comments: This work extends the work in arXiv:1512.09226 to random and multivariate design

arXiv:1805.03425 [pdf, ps, other]

Numerical invariant tori of symplectic integrators for integrable Hamiltonian systems

Authors: Zhaodong Ding, Zaijiu Shang

Abstract: In this paper, we study the persistence of invariant tori of integrable Hamiltonian systems satisfying Rüssmann's non-degeneracy condition when symplectic integrators are applied to them. Meanwhile, we give an estimate of the measure of the set occupied by the invariant tori in the phase space. On an invariant torus, the one-step map of the scheme is conjugate to a one parameter family of linear rotations with a step size dependent frequency vector in terms of iteration. These results are a generalization of Shang's theorems (1999, 2000), where the non-degeneracy condition is assumed in the sense of Kolmogorov. In comparison, Rüssmann's condition is the weakest non-degeneracy condition for the persistence of invariant tori in Hamiltonian systems. These results provide new insight into the nonlinear stability of symplectic integrators.

Submitted 9 May, 2018; originally announced May 2018.

Comments: It has been accepted for publication in SCIENCE CHINA Mathematics

MSC Class: 37J35; 37J40; 65L07; 65L20; 65P10; 65P40

arXiv:1805.03355 [pdf, ps, other]

Exponential Stability Estimate of Symplectic Integrators for Integrable Hamiltonian Systems

Authors: Zhaodong Ding, Zaijiu Shang, Bo Xie

Abstract: We prove a Nekhoroshev-type theorem for nearly integrable symplectic maps. As an application of the theorem, we obtain the exponential stability of symplectic algorithms. Meanwhile, we obtain bounds for the perturbation, the variation of the action variables, and the exponential time, respectively. These results provide a new insight into the nonlinear stability analysis of symplectic algorithms. Combined with our previous results on the numerical KAM theorem for symplectic algorithms (2018), we give a more complete characterization of the complex nonlinear dynamical behavior of symplectic algorithms.

Submitted 8 May, 2018; originally announced May 2018.

Comments: 32 pages

arXiv:1804.00950 [pdf, other]

A KAM-Theorem for Persistence of Quasi-periodic Invariant Tori in Bifurcation Theory of Equilibrium Points

Authors: Xuemei Li, Zaijiu Shang

Abstract: In this paper, we establish a KAM-theorem for ordinary differential equations with finitely differentiable vector fields and multiple degeneracies. The theorem can be used to deal with the persistence of quasi-periodic invariant tori in multiple Hopf and zero-multiple Hopf bifurcations, as well as their subordinate bifurcations, of equilibrium points of continuous dynamical systems.

Submitted 16 October, 2018; v1 submitted 3 April, 2018; originally announced April 2018.

Comments: 38 pages

arXiv:1802.06308 [pdf, other]

Nonparametric Testing under Random Projection

Authors: Meimei Liu, Zuofeng Shang, Guang Cheng

Abstract: A common challenge in nonparametric inference is its high computational complexity when data volume is large. In this paper, we develop computationally efficient nonparametric testing by employing a random projection strategy. In the specific kernel ridge regression setup, a simple distance-based test statistic is proposed. Notably, we derive the minimum number of random projections that is sufficient for achieving testing optimality in terms of the minimax rate. An adaptive testing procedure is further established without prior knowledge of regularity. One technical contribution is to establish upper bounds for a range of tail sums of empirical kernel eigenvalues. Simulations and real data analysis are conducted to support our theory.

Submitted 17 February, 2018; originally announced February 2018.

arXiv:1710.01181 [pdf, other]

Quasi-periodic solutions for differential equations with an elliptic-type degenerate equilibrium point under small perturbations

Authors: Xuemei Li, Zaijiu Shang

Abstract: This work focuses on the existence of quasi-periodic solutions for ordinary and delay differential equations (ODEs and DDEs for short) with an elliptic-type degenerate equilibrium point under quasi-periodic perturbations. We prove that under appropriate hypotheses there exist quasi-periodic solutions for perturbed ODEs and DDEs near the equilibrium point for most parameter values, then apply these results to the delayed van der Pol's oscillator with zero-Hopf singularity.

Submitted 3 October, 2017; originally announced October 2017.

Comments: 32 pages

arXiv:1703.03031 [pdf, other]

Statistical Inference on Panel Data Models: A Kernel Ridge Regression Method

Authors: Shunan Zhao, Ruiqi Liu, Zuofeng Shang

Abstract: We propose statistical inferential procedures for panel data models with interactive fixed effects in a kernel ridge regression framework. Compared with traditional sieve methods, our method is automatic in the sense that it does not require the choice of basis functions and truncation parameters. Model complexity is controlled by a continuous regularization parameter which can be automatically selected by generalized cross validation. Based on empirical processes theory and functional analysis tools, we derive joint asymptotic distributions for the estimators in the heterogeneous setting. These joint asymptotic results are then used to construct confidence intervals for the regression means and prediction intervals for the future observations, both being the first provably valid intervals in the literature. Marginal asymptotic normality of the functional estimators in the homogeneous setting is also obtained. Simulation and real data analysis demonstrate the advantages of our method.

Submitted 8 March, 2017; originally announced March 2017.

arXiv:1702.01330 [pdf, other]

Non-asymptotic theory for nonparametric testing

Authors: Yun Yang, Zuofeng Shang, Guang Cheng

Abstract: We consider nonparametric testing in a non-asymptotic framework. Our statistical guarantees are exact in the sense that Type I and II errors are controlled for any finite sample size. Meanwhile, one proposed test is shown to achieve minimax optimality in the asymptotic sense. An important consequence of this non-asymptotic theory is a new and practically useful formula for selecting the optimal smoothing parameter in nonparametric testing. The leading example in this paper is smoothing spline models under Gaussian errors. The results obtained therein can be further generalized to the kernel ridge regression framework under possibly non-Gaussian errors. Simulations demonstrate that our proposed test improves over the conventional asymptotic test when sample size is small to moderate.

Submitted 4 February, 2017; originally announced February 2017.

arXiv:1611.06575 [pdf, other]

A maximum smoothed likelihood based estimation for two component semiparametric density mixtures with a known component

Authors: Zhou Shen, Michael Levine, Zuofeng Shang

Abstract: We consider a semiparametric mixture of two univariate density functions where one of them is known while the weight and the other function are unknown. Such mixtures have a history of application to the problem of detecting differentially expressed genes under two or more conditions in microarray data. Until now, some additional knowledge about the unknown component (e.g. the fact that it belongs to a location family) has been assumed. As opposed to this approach, we do not assume any additional structure on the unknown density function. For this mixture model, we derive a new sufficient identifiability condition and pinpoint a specific class of distributions describing the unknown component for which this condition is mostly satisfied. Our approach to estimation of this model is based on an idea of applying a maximum smoothed likelihood to what would otherwise have been an ill-posed problem. We introduce an iterative MM (Majorization-Minimization) algorithm that estimates all of the model parameters. We establish that the algorithm possesses a descent property with respect to a log-likelihood objective functional and prove that the algorithm converges to a minimizer of such an objective functional. Finally, we also illustrate the performance of our algorithm in a simulation study and using a real dataset.

Submitted 29 July, 2017; v1 submitted 20 November, 2016; originally announced November 2016.

MSC Class: 62G07

arXiv:1512.09226 [pdf, other]

Computational Limits of A Distributed Algorithm For Smoothing Spline

Authors: Zuofeng Shang, Guang Cheng

Abstract: In this paper, we explore statistical versus computational trade-off to address a basic question in the application of a distributed algorithm: what is the minimal computational cost in obtaining statistical optimality? In smoothing spline setup, we observe a phase transition phenomenon for the number of deployed machines that ends up being a simple proxy for computing cost. Specifically, a sharp upper bound for the number of machines is established: when the number is below this bound, statistical optimality (in terms of nonparametric estimation or testing) is achievable; otherwise, statistical optimality becomes impossible. These sharp bounds partly capture intrinsic computational limits of the distributed algorithm considered in this paper, and turn out to be fully determined by the smoothness of the regression function. As a side remark, we argue that sample splitting may be viewed as an alternative form of regularization, playing a similar role as smoothing parameter.

Submitted 21 July, 2017; v1 submitted 31 December, 2015; originally announced December 2015.

Comments: To Appear in Journal of Machine Learning Research

arXiv:1508.04175 [pdf, other]

Nonparametric Bayesian Aggregation for Massive Data

Authors: Zuofeng Shang, Botao Hao, Guang Cheng

Abstract: We develop a set of scalable Bayesian inference procedures for a general class of nonparametric regression models. Specifically, nonparametric Bayesian inferences are separately performed on each subset randomly split from a massive dataset, and then the obtained local results are aggregated into global counterparts. This aggregation step is explicit without involving any additional computation cost. By a careful partition, we show that our aggregated inference results obtain an oracle rule in the sense that they are equivalent to those obtained directly from the entire data (which are computationally prohibitive). For example, an aggregated credible ball achieves desirable credibility level and also frequentist coverage while possessing the same radius as the oracle ball.

Submitted 4 September, 2019; v1 submitted 17 August, 2015; originally announced August 2015.

Comments: To appear in Journal of Machine Learning Research. arXiv admin note: text overlap with arXiv:1411.3686

arXiv:1411.3686 [pdf, other]

Gaussian Approximation of General Nonparametric Posterior Distributions

Authors: Zuofeng Shang, Guang Cheng

Abstract: In a general class of Bayesian nonparametric models, we prove that the posterior distribution can be asymptotically approximated by a Gaussian process. Our results apply to nonparametric exponential family that contains both Gaussian and non-Gaussian regression, and also hold for both efficient (root-n) and inefficient (non root-n) estimation. Our general approximation theorem does not rely on posterior conjugacy, and can be verified in a class of Gaussian process priors that has a smoothing spline interpretation [59, 44]. In particular, the limiting posterior measure becomes prior-free under a Bayesian version of "under-smoothing" condition. Finally, we apply our approximation theorem to examine the asymptotic frequentist properties of Bayesian procedures such as credible regions and credible intervals.

Submitted 30 October, 2017; v1 submitted 13 November, 2014; originally announced November 2014.

Comments: To Appear in Information and Inference. In Memory of Prof. Jayanta Ghosh

arXiv:1405.6655 [pdf, ps, other]

doi 10.1214/15-AOS1322

Nonparametric inference in generalized functional linear models

Authors: Zuofeng Shang, Guang Cheng

Abstract: We propose a roughness regularization approach in making nonparametric inference for generalized functional linear models. In a reproducing kernel Hilbert space framework, we construct asymptotically valid confidence intervals for regression mean, prediction intervals for future response and various statistical procedures for hypothesis testing. In particular, one procedure for testing global behaviors of the slope function is adaptive to the smoothness of the slope function and to the structure of the predictors. As a by-product, a new type of Wilks phenomenon [Ann. Math. Stat. 9 (1938) 60-62; Ann. Statist. 29 (2001) 153-193] is discovered when testing the functional linear models. Despite the generality, our inference procedures are easy to implement. Numerical examples are provided to demonstrate the empirical advantages over the competing methods. A collection of technical tools such as integro-differential equation techniques [Trans. Amer. Math. Soc. (1927) 29 755-800; Trans. Amer. Math. Soc. (1928) 30 453-471; Trans. Amer. Math. Soc. (1930) 32 860-868], Stein's method [Ann. Statist. 41 (2013) 2786-2819] [Stein, Approximate Computation of Expectations (1986) IMS] and functional Bahadur representation [Ann. Statist. 41 (2013) 2608-2638] are employed in this paper.

Submitted 30 July, 2015; v1 submitted 26 May, 2014; originally announced May 2014.

Comments: Published at http://dx.doi.org/10.1214/15-AOS1322 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOS-AOS1322

Journal ref: Annals of Statistics 2015, Vol. 43, No. 4, 1742-1773

arXiv:1311.2628 [pdf, ps, other]

doi 10.1214/15-AOS1313

Joint asymptotics for semi-nonparametric regression models with partially linear structure

Authors: Guang Cheng, Zuofeng Shang

Abstract: We consider a joint asymptotic framework for studying semi-nonparametric regression models where (finite-dimensional) Euclidean parameters and (infinite-dimensional) functional parameters are both of interest. The class of models in consideration share a partially linear structure and are estimated in two general contexts: (i) quasi-likelihood and (ii) true likelihood. We first show that the Euclidean estimator and (pointwise) functional estimator, which are re-scaled at different rates, jointly converge to a zero-mean Gaussian vector. This weak convergence result reveals a surprising joint asymptotics phenomenon: these two estimators are asymptotically independent. A major goal of this paper is to gain first-hand insights into the above phenomenon. Moreover, a likelihood ratio testing is proposed for a set of joint local hypotheses, where a new version of the Wilks phenomenon [Ann. Math. Stat. 9 (1938) 60-62; Ann. Statist. 29 (2001) 153-193] is unveiled. A novel technical tool, called a joint Bahadur representation, is developed for studying these joint asymptotics results.

Submitted 3 June, 2015; v1 submitted 11 November, 2013; originally announced November 2013.

Comments: Published at http://dx.doi.org/10.1214/15-AOS1313 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOS-AOS1313

Journal ref: Annals of Statistics 2015, Vol. 43, No. 3, 1351-1390

arXiv:1302.1154 [pdf, ps, other]

Bayesian Ultrahigh-Dimensional Screening Via MCMC

Authors: Zuofeng Shang, Ping Li

Abstract: We explore the theoretical and numerical property of a fully Bayesian model selection method in sparse ultrahigh-dimensional settings, i.e., $p\gg n$, where $p$ is the number of covariates and $n$ is the sample size. Our method consists of (1) a hierarchical Bayesian model with a novel prior placed over the model space which includes a hyperparameter $t_n$ controlling the model size, and (2) an efficient MCMC algorithm for automatic and stochastic search of the models. Our theory shows that, when specifying $t_n$ correctly, the proposed method yields selection consistency, i.e., the posterior probability of the true model asymptotically approaches one; when $t_n$ is misspecified, the selected model is still asymptotically nested in the true model. The theory also reveals insensitivity of the selection result with respect to the choice of $t_n$. In implementations, a reasonable prior is further assumed on $t_n$ which allows us to draw its samples stochastically. Our approach conducts selection, estimation and even inference in a unified framework. No additional prescreening or dimension reduction step is needed. Two novel $g$-priors are proposed to make our approach more flexible. A simulation study is given to display the numerical advantage of our method.

Submitted 12 March, 2013; v1 submitted 5 February, 2013; originally announced February 2013.

arXiv:1301.4560 [pdf, other]

Two Single-shot Methods for Locating Multiple Electromagnetic Scatterers

Authors: Jingzhi Li, Hongyu Liu, Zaijiu Shang, Hongpeng Sun

Abstract: We develop two inverse scattering schemes for locating multiple electromagnetic (EM) scatterers by the electric far-field measurement corresponding to a single incident/detecting plane wave. The first scheme is for locating scatterers of small size compared to the wavelength of the detecting plane wave. The multiple scatterers could be extremely general with an unknown number of components, and each scatterer component could be either an impenetrable perfectly conducting obstacle or a penetrable inhomogeneous medium with an unknown content. The second scheme is for locating multiple perfectly conducting obstacles of regular size compared to the detecting EM wavelength. The number of the obstacle components is not required to be known in advance, but the shape of each component must be from a certain known admissible class. The admissible class may consist of multiple different reference obstacles. The second scheme could also be extended to include the medium components if a certain generic condition is satisfied. Both schemes are based on some novel indicator functions whose indicating behaviors could be used to locate the scatterers. No inversion will be involved in calculating the indicator functions, and the proposed methods are very efficient and robust to noise. Rigorous mathematical justifications are provided and extensive numerical experiments are conducted to illustrate the effectiveness of the imaging schemes.

Submitted 25 January, 2013; v1 submitted 19 January, 2013; originally announced January 2013.

arXiv:1212.6788 [pdf, ps, other]

doi 10.1214/13-AOS1164

Local and global asymptotic inference in smoothing spline models

Authors: Zuofeng Shang, Guang Cheng

Abstract: This article studies local and global inference for smoothing spline estimation in a unified asymptotic framework. We first introduce a new technical tool called functional Bahadur representation, which significantly generalizes the traditional Bahadur representation in parametric models, that is, Bahadur [Ann. Inst. Statist. Math. 37 (1966) 577-580]. Equipped with this tool, we develop four interconnected procedures for inference: (i) pointwise confidence interval; (ii) local likelihood ratio testing; (iii) simultaneous confidence band; (iv) global likelihood ratio testing. In particular, our confidence intervals are proved to be asymptotically valid at any point in the support, and they are shorter on average than the Bayesian confidence intervals proposed by Wahba [J. R. Stat. Soc. Ser. B Stat. Methodol. 45 (1983) 133-150] and Nychka [J. Amer. Statist. Assoc. 83 (1988) 1134-1143]. We also discuss a version of the Wilks phenomenon arising from local/global likelihood ratio testing. It is also worth noting that our simultaneous confidence bands are the first ones applicable to general quasi-likelihood models. Furthermore, issues relating to optimality and efficiency are carefully addressed. As a by-product, we discover a surprising relationship between periodic and nonperiodic smoothing splines in terms of inference.

Submitted 26 November, 2013; v1 submitted 30 December, 2012; originally announced December 2012.

Comments: Published at http://dx.doi.org/10.1214/13-AOS1164 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOS-AOS1164

Journal ref: Annals of Statistics 2013, Vol. 41, No. 5, 2608-2638

arXiv:1111.2444 [pdf, ps, other]

doi 10.1007/s10884-012-9270-5

Singular perturbation of reduced wave equation and scattering from an embedded obstacle

Authors: Hongyu Liu, Zaijiu Shang, Hongpeng Sun, Jun Zou

Abstract: We consider time-harmonic wave scattering from an inhomogeneous isotropic medium supported in a bounded domain $\Omega\subset\mathbb{R}^N$ ($N\geq 2$). In a subregion $D\Subset\Omega$, the medium is supposed to be lossy and have a large mass density. We study the asymptotic development of the wave field as the mass density $\rho\rightarrow +\infty$ and show that the wave field inside $D$ will decay exponentially while the wave field outside the medium will converge to the one corresponding to a sound-hard obstacle $D\Subset\Omega$ buried in the medium supported in $\Omega\backslash\bar{D}$. Moreover, the normal velocity of the wave field on $\partial D$ from outside $D$ is shown to be vanishing as $\rho\rightarrow +\infty$. We derive very accurate estimates for the wave field inside and outside $D$ and on $\partial D$ in terms of $\rho$, and show that the asymptotic estimates are sharp. The implication of the obtained results is given for an inverse scattering problem of reconstructing a complex scatterer.

Submitted 14 February, 2012; v1 submitted 10 November, 2011; originally announced November 2011.

arXiv:1102.0826 [pdf, ps, other]

Consistency of Bayesian Linear Model Selection With a Growing Number of Parameters

Authors: Zuofeng Shang, Murray K. Clayton

Abstract: Linear models with a growing number of parameters have been widely used in modern statistics. One important problem about this kind of model is the variable selection issue. Bayesian approaches, which provide a stochastic search of informative variables, have gained popularity. In this paper, we will study the asymptotic properties related to Bayesian model selection when the model dimension $p$ is growing with the sample size $n$. We consider $p\le n$ and provide sufficient conditions under which: (1) with large probability, the posterior probability of the true model (from which samples are drawn) uniformly dominates the posterior probability of any incorrect models; and (2) with large probability, the posterior probability of the true model converges to one. Both (1) and (2) guarantee that the true model will be selected under a Bayesian framework. We also demonstrate several situations when (1) holds but (2) fails, which illustrates the difference between these two properties. Simulated examples are provided to illustrate the main results.

Submitted 2 February, 2012; v1 submitted 3 February, 2011; originally announced February 2011.

arXiv:0802.2121 [pdf, ps, other]

Preservation of stability properties near fixed points of linear hamiltonian systems by symplectic integrators

Authors: Xiaohua Ding, Hongyu Liu, Zaijiu Shang, Geng Sun, Lingshu Wang

Abstract: Based on reasonable testing model problems, we study the preservation by symplectic Runge-Kutta method (SRK) and symplectic partitioned Runge-Kutta method (SPRK) of structures for fixed points of linear Hamiltonian systems. The structure-preservation region provides a practical criterion for choosing step-size in symplectic computation. Examples are given to justify the investigation.

Submitted 14 February, 2008; originally announced February 2008.

MSC Class: 37M15; 65P10

Showing 1–44 of 44 results for author: Shang, Z