Search | arXiv e-print repository

Smoothed NPMLEs in nonparametric Poisson mixtures and beyond

Abstract: We discuss nonparametric mixing distribution estimation under the Gaussian-smoothed optimal transport (GOT) distance. It is shown that a recently formulated conjecture -- that the Poisson nonparametric maximum likelihood estimator can achieve root-$n$ rate of convergence under the GOT distance -- holds up to some logarithmic terms. We also establish the same conclusion for other minimum-distance estimators, and discuss mixture models beyond the Poisson.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 20 pages

arXiv:2403.02584 [pdf, other]

A Direct Sampling Method and Its Integration with Deep Learning for Inverse Scattering Problems with Phaseless Data

Authors: Jianfeng Ning, Fuqun Han, Jun Zou

Abstract: We consider in this work an inverse acoustic scattering problem when only phaseless data is available. The inverse problem is highly nonlinear and ill-posed due to the lack of the phase information. Solving inverse scattering problems with phaseless data is important in applications as the collection of physically acceptable phased data is usually difficult and expensive. A novel direct sampling method (DSM) will be developed to effectively estimate the locations and geometric shapes of the unknown scatterers from phaseless data generated by a very limited number of incident waves. With a careful theoretical analysis of the behavior of the index function and some representative numerical examples, the new DSM is shown to be computationally efficient, easy to implement, robust to large noise, and does not require any prior knowledge of the unknown scatterers. Furthermore, to fully exploit the index functions obtained from the DSM, we also propose to integrate the DSM with a deep learning technique (DSM-DL) to achieve high-quality reconstructions. Several challenging and representative numerical experiments are carried out to demonstrate the accuracy and robustness of reconstructions by DSM-DL. The DSM-DL networks trained by phased data are further theoretically and numerically shown to be able to solve problems with phaseless data. Additionally, our numerical experiments also show the DSM-DL can solve inverse scattering problems with mixed types of scatterers, which renders its applications in many important practical scenarios.

Submitted 4 March, 2024; originally announced March 2024.

arXiv:2401.13125 [pdf, ps, other]

Tensor train based sampling algorithms for approximating regularized Wasserstein proximal operators

Authors: Fuqun Han, Stanley Osher, Wuchen Li

Abstract: We present a tensor train (TT) based algorithm designed for sampling from a target distribution and employ TT approximation to capture the high-dimensional probability density evolution of overdamped Langevin dynamics. This involves utilizing the regularized Wasserstein proximal operator, which exhibits a simple kernel integration formulation, i.e., the softmax formula of the traditional proximal operator. The integration, performed in $\mathbb{R}^d$, poses a challenge in practical scenarios, making the algorithm practically implementable only with the aid of TT approximation. In the specific context of Gaussian distributions, we rigorously establish the unbiasedness and linear convergence of our sampling algorithm towards the target distribution. To assess the effectiveness of our proposed methods, we apply them to various scenarios, including Gaussian families, Gaussian mixtures, bimodal distributions, and Bayesian inverse problems in numerical examples. The sampling algorithm exhibits superior accuracy and faster convergence when compared to classical Langevin dynamics-type sampling algorithms.

Submitted 25 January, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

arXiv:2312.07683 [pdf, ps, other]

On Rosenbaum's Rank-based Matching Estimator

Authors: Matias D. Cattaneo, Fang Han, Zhexiao Lin

Abstract: In two influential contributions, Rosenbaum (2005, 2020) advocated for using the distances between component-wise ranks, instead of the original data values, to measure covariate similarity when constructing matching estimators of average treatment effects. While the intuitive benefits of using covariate ranks for matching estimation are apparent, there is no theoretical understanding of such procedures in the literature. We fill this gap by demonstrating that Rosenbaum's rank-based matching estimator, when coupled with a regression adjustment, enjoys the properties of double robustness and semiparametric efficiency without the need to enforce restrictive covariate moment assumptions. Our theoretical findings further emphasize the statistical virtues of employing ranks for estimation and inference, more broadly aligning with the insights put forth by Peter Bickel in his 2004 Rietz lecture (Bickel, 2004).

Submitted 6 January, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

Comments: Assumption 4.1 is slightly weakened in this version

arXiv:2312.05996 [pdf, other]

Achieving Fairness and Accuracy in Regressive Property Taxation

Authors: Ozan Candogan, Feiyu Han, Haihao Lu

Abstract: Regressivity in property taxation, or the disproportionate overassessment of lower-valued properties compared to higher-valued ones, results in an unfair taxation burden for Americans living in poverty. To address regressivity and enhance both the accuracy and fairness of property assessments, we introduce a scalable property valuation model called the $K$-segment model. Our study formulates a mathematical framework for the $K$-segment model, which divides a single model into $K$ segments and employs submodels for each segment. Smoothing methods are incorporated to balance and smooth the multiple submodels within the overall model. To assess the fairness of our proposed model, we introduce two innovative fairness measures for property evaluation and taxation, focusing on group-level fairness and extreme sales price portions where unfairness typically arises. Compared to the model employed currently in practice, our study demonstrates that the $K$-segment model effectively improves fairness based on the proposed measures. Furthermore, we investigate the accuracy--fairness trade-off in property assessments and illustrate how the $K$-segment model balances high accuracy with fairness for all properties. Our work uncovers the practical impacts of the $K$-segment models in addressing regressivity in property taxation, offering a tangible solution for policymakers and property owners. By implementing this model, we pave the way for a fairer taxation system, ensuring a more equitable distribution of tax burdens.

Submitted 10 December, 2023; originally announced December 2023.

arXiv:2311.16486 [pdf, ps, other]

On the adaptation of causal forests to manifold data

Authors: Yiyi Huo, Yingying Fan, Fang Han

Abstract: Researchers often hold the belief that random forests are "the cure to the world's ills" (Bickel, 2010). But how exactly do they achieve this? Focused on the recently introduced causal forests (Athey and Imbens, 2016; Wager and Athey, 2018), this manuscript aims to contribute to an ongoing research trend towards answering this question, proving that causal forests can adapt to the unknown covariate manifold structure. In particular, our analysis shows that a causal forest estimator can achieve the optimal rate of convergence for estimating the conditional average treatment effect, with the covariate dimension automatically replaced by the manifold dimension. These findings align with analogous observations in the realm of deep learning and resonate with the insights presented in Peter Bickel's 2004 Rietz lecture.

Submitted 26 December, 2023; v1 submitted 27 November, 2023; originally announced November 2023.

Comments: This version adds more references and corrects some minor typos

arXiv:2311.14766 [pdf, other]

Reinforcement Learning from Statistical Feedback: the Journey from AB Testing to ANT Testing

Authors: Feiyang Han, Yimin Wei, Zhaofeng Liu, Yanxing Qi

Abstract: Reinforcement Learning from Human Feedback (RLHF) has played a crucial role in the success of large models such as ChatGPT. RLHF is a reinforcement learning framework which combines human feedback to improve learning effectiveness and performance. However, obtaining preference feedback manually is quite expensive in commercial applications. Some statistical commercial indicators are usually more valuable and always ignored in RLHF. There exists a gap between commercial target and model training. In our research, we will attempt to fill this gap with statistical business feedback instead of human feedback, using AB testing which is a well-established statistical method. Reinforcement Learning from Statistical Feedback (RLSF) based on AB testing is proposed. Statistical inference methods are used to obtain preferences for training the reward network, which fine-tunes the pre-trained model in a reinforcement learning framework, achieving greater business value. Furthermore, we extend AB testing with double selections at a single time-point to ANT testing with multiple selections at different feedback time points. Moreover, we design numerical experiments to validate the effectiveness of our algorithm framework.

Submitted 24 November, 2023; originally announced November 2023.

arXiv:2311.01915 [pdf, ps, other]

Discrete infinity Laplace equations on graphs and tug-of-war games

Authors: Fengwen Han, Tao Wang

Abstract: We study the Dirichlet problem of the following discrete infinity Laplace equation on a subgraph with finite width $$Δ_{\infty} u(x) = \inf_{y \sim x}u(y)+\sup_{y \sim x}u(y)-2u(x) = f(x).$$ We say that a subgraph has finite width if the distances from all vertices to the boundary are uniformly bounded. By Perron's method, we show the existence of bounded solutions. We also prove the uniqueness if $f$ is nonnegative by establishing a comparison result, and hence obtain the existence of game values for corresponding tug-of-war games introduced by Peres, Schramm, Sheffield, and Wilson (2009). As an application we show a strong Liouville property for infinity harmonic functions. By an argument of Arzela-Ascoli, we prove the convergence of solutions of $\varepsilon$-tug-of-war games as $\varepsilon$ tends to 0. Correspondingly, we obtain the existence of bounded solutions to normalized infinity Laplace equations on Euclidean domains with finite width.

Submitted 3 November, 2023; originally announced November 2023.

arXiv:2310.14142 [pdf, ps, other]

On propensity score matching with a diverging number of matches

Authors: Yihui He, Fang Han

Abstract: This paper reexamines Abadie and Imbens (2016)'s work on propensity score matching for average treatment effect estimation. We explore the asymptotic behavior of these estimators when the number of nearest neighbors, $M$, grows with the sample size. It is shown, hardly surprising but technically nontrivial, that the modified estimators can improve upon the original fixed-$M$ estimators in terms of efficiency. Additionally, we demonstrate the potential to attain the semiparametric efficiency lower bound when the propensity score achieves "sufficient" dimension reduction, echoing Hahn (1998)'s insight about the role of dimension reduction in propensity score-based causal inference.

Submitted 14 November, 2023; v1 submitted 21 October, 2023; originally announced October 2023.

Comments: This version corrects some typos

arXiv:2306.16574 [pdf, other]

On Lengths of $\mathbb{F}_2[x,y,z]/(x^{d_1}, y^{d_2},z^{d_3}, x+y+z)$

Authors: Fiona Han, Jennifer Kenkel, Daniel Li, Sridhar Venkatesh, Ashley Wiles

Abstract: In this paper, we provide a formula for the vector space dimension of the ring $\mathbb{F}_2[x,y,z]/(x^{d_1}, y^{d_2},z^{d_3}, x+y+z)$ over $\mathbb{F}_2$ when $d_1,d_2,d_3$ all lie between successive powers of $2$. For general $d_1,d_2,d_3$, we provide a simple algorithm to calculate the vector space dimension of $\mathbb{F}_2[x,y,z]/(x^{d_1}, y^{d_2},z^{d_3}, x+y+z)$ by combining our formula with certain results of Chungsim Han (1992).

Submitted 11 March, 2024; v1 submitted 28 June, 2023; originally announced June 2023.

arXiv:2306.13167 [pdf, ps, other]

Iterated residue, toric forms and Witten genus

Authors: Fei Han, Hao Li, Zhi Lü

Abstract: We introduce the notion of {\em iterated residue} to study generalized Bott manifolds. When applying the iterated residues to compute the Borisov-Gunnells toric form and the Witten genus of certain toric varieties as well as complete intersections, we obtain interesting vanishing results and some theta function identities, one of which is a twisted version of a classical Rogers-Ramanujan type formula.

Submitted 22 June, 2023; originally announced June 2023.

Comments: 19 pages

MSC Class: 55S99; 11Z99

arXiv:2306.04240 [pdf, other]

T-ADAF: Adaptive Data Augmentation Framework for Image Classification Network based on Tensor T-product Operator

Authors: Feiyang Han, Yun Miao, Zhaoyi Sun, Yimin Wei

Abstract: Image classification is one of the most fundamental tasks in Computer Vision. In practical applications, the datasets are usually not as abundant as those in the laboratory and simulation, a situation often called Data Hungry. How to extract the information of data more completely and effectively is very important. Therefore, an Adaptive Data Augmentation Framework based on the tensor T-product Operator is proposed in this paper, to triple one image data to be trained and gain the result from all these three images together with only less than 0.1% increase in the number of parameters. At the same time, this framework serves the functions of column image embedding and global feature intersection, enabling the model to obtain information in not only the spatial but also the frequency domain, and thus improving the prediction accuracy of the model. Numerical experiments have been designed for several models, and the results demonstrate the effectiveness of this adaptive framework. Numerical experiments show that our data augmentation framework can improve the performance of original neural network model by 2%, which provides competitive results to state-of-the-art methods.

Submitted 7 June, 2023; originally announced June 2023.

arXiv:2305.00250 [pdf, other]

A Direct Sampling-Based Deep Learning Approach for Inverse Medium Scattering Problems

Authors: Jianfeng Ning, Fuqun Han, Jun Zou

Abstract: In this work, we focus on the inverse medium scattering problem (IMSP), which aims to recover unknown scatterers based on measured scattered data. Motivated by the efficient direct sampling method (DSM) introduced in [23], we propose a novel direct sampling-based deep learning approach (DSM-DL) for reconstructing inhomogeneous scatterers. In particular, we use the U-Net neural network to learn the relation between the index functions and the true contrasts. Our proposed DSM-DL is computationally efficient, robust to noise, easy to implement, and able to naturally incorporate multiple measured data to achieve high-quality reconstructions. Some representative tests are carried out with varying numbers of incident waves and different noise levels to evaluate the performance of the proposed method. The results demonstrate the promising benefits of combining deep learning techniques with the DSM for IMSP.

Submitted 29 April, 2023; originally announced May 2023.

arXiv:2303.14088 [pdf, ps, other]

On the failure of the bootstrap for Chatterjee's rank correlation

Authors: Zhexiao Lin, Fang Han

Abstract: While researchers commonly use the bootstrap for statistical inference, many of us have realized that the standard bootstrap, in general, does not work for Chatterjee's rank correlation. In this paper, we provide proof of this issue under an additional independence assumption, and complement our theory with simulation evidence for general settings. Chatterjee's rank correlation thus falls into a category of statistics that are asymptotically normal but bootstrap inconsistent. Valid inferential methods in this case are Chatterjee's original proposal (for testing independence) and Lin and Han (2022)'s analytic asymptotic variance estimator (for more general purposes).

Submitted 5 April, 2023; v1 submitted 24 March, 2023; originally announced March 2023.

Comments: This revised version enhances the literature review of bootstrap inconsistency, adding, in particular, Beran's two papers that explore the connection between bootstrap inconsistency and superefficiency

arXiv:2301.00168 [pdf, other]

Blowup dynamics for equivariant critical Landau--Lifshitz flow

Authors: Fangyu Han, Zhong Tan

Abstract: The existence of finite time blowup solutions for the two-dimensional Landau--Lifshitz equation is a long-standing problem, which exists in the literature at least since 2001 (E, Mathematics Unlimited--2001 and Beyond, Springer, Berlin, P.410, 2001). A more refined description in the equivariant class is given in (van den Berg and Williams, European J. Appl. Math., 24(6), 912--948, 2013). In this paper, we consider the blowup dynamics of the Landau--Lifshitz equation $$ \partial_tu=\mathfrak{a}_1u\timesΔu-\mathfrak{a}_2u\times(u\timesΔu),\quad x\in\mathbb{R}^2, $$ where $u\in\mathbb{S}^2$, $\mathfrak{a}_1+i\mathfrak{a}_2\in\mathbb{C}$ with $\mathfrak{a}_2\geq0$ and $\mathfrak{a}_1+\mathfrak{a}_2=1$. We prove the existence of 1-equivariant Krieger--Schlag--Tataru type blowup solutions near the lowest energy steady state. More precisely, we prove that for any $ν>1$, there exists a 1-equivariant finite-time blowup solution of the form $$ u(x,t)=φ(λ(t)x)+ζ(x,t),\quad λ(t)=t^{-1/2-ν}, $$ where $φ$ is a lowest energy steady state and $ζ(t)$ is arbitrary small in $\dot{H}^1\cap\dot{H}^2$. The proof is accomplished by renormalizing the blowup profile and a perturbative analysis in the spirit of (Krieger, Schlag and Tataru, Invent. Math., 171(3), 543--615, 2008), (Perelman, Comm. Math. Phys., 330(1), 69--105, 2014) and (Ortoleva and Perelman, Algebra i Analiz, 25(2), 271--294, 2013).

Submitted 31 December, 2022; originally announced January 2023.

arXiv:2212.11775 [pdf, other]

An efficient peridynamics-based statistical multiscale method for fracture in composite structure with randomly distributed particles

Authors: Zihao Yang, Shaoqi Zheng, Shangkun Shen, Fei Han

Abstract: The fracture simulation of random particle reinforced composite structures remains a challenge. Current techniques either assumed a homogeneous model, ignoring the microstructure characteristics of composite structures, or considered a micro-mechanical model, involving intractable computational costs. This paper proposes a peridynamics-based statistical multiscale (PSM) framework to simulate the macroscopic structure fracture with high efficiency. The heterogeneities of composites, including the shape, spatial distribution and volume fraction of particles, are characterized within the representative volume elements (RVEs), and their impact on structure failure are extracted as two types of peridynamic parameters, namely, statistical critical stretch and equivalent micromodulus. At the microscale level, a bond-based peridynamic (BPD) model with energy-based micromodulus correction technique is introduced to simulate the fracture in RVEs, and then the computational model of statistical critical stretch is established through micromechanical analysis. Moreover, based on the statistical homogenization approach, the computational model of effective elastic tensor is also established. Then, the equivalent micromodulus can be derived from the effective elastic tensor, according to the energy density equivalence between classical continuum mechanics (CCM) and BPD models. At the macroscale level, a macroscale BPD model with the statistical critical stretch and the equivalent micromodulus is constructed to simulate the fracture in the macroscopic homogenized structures. The algorithm framework of the PSM method is also described. Two- and three-dimensional numerical examples illustrate the validity, accuracy and efficiency of the proposed method.

Submitted 15 November, 2022; originally announced December 2022.

arXiv:2212.05424 [pdf, ps, other]

On regression-adjusted imputation estimators of the average treatment effect

Authors: Zhexiao Lin, Fang Han

Abstract: Imputing missing potential outcomes using an estimated regression function is a natural idea for estimating causal effects. In the literature, estimators that combine imputation and regression adjustments are believed to be comparable to augmented inverse probability weighting. Accordingly, people for a long time conjectured that such estimators, while avoiding directly constructing the weights, are also doubly robust (Imbens, 2004; Stuart, 2010). Generalizing an earlier result of the authors (Lin et al., 2021), this paper formalizes this conjecture, showing that a large class of regression-adjusted imputation methods are indeed doubly robust for estimating the average treatment effect. In addition, they are provably semiparametrically efficient as long as both the density and regression models are correctly specified. Notable examples of imputation methods covered by our theory include kernel matching, (weighted) nearest neighbor matching, local linear matching, and (honest) random forests.

Submitted 19 January, 2023; v1 submitted 11 December, 2022; originally announced December 2022.

Comments: more references were added in this version

arXiv:2211.00217 [pdf, other]

Tensor Regularized Total Least Squares Methods with Applications to Image and Video Deblurring

Authors: F. Han, Y. Wei, P. Xie

Abstract: Total least squares (TLS) is an effective method for solving linear equations with the situations, when noise is not just in observation matrices but also in mapping matrices. Moreover, the Tikhonov regularization is widely used in plenty of ill-posed problems. In this paper, we extend the regularized total least squares (RTLS) method from the matrix form due to Golub, Hansen and O'Leary, to the tensor form proposing the tensor regularized total least squares (TR-TLS) method for solving ill-conditioned tensor systems of equations. Properties and algorithms about the solution of the TR-TLS problem, which might be similar to those of the RTLS, are also presented and proved. Based on this method, some applications in image and video deblurring are explored. Numerical examples illustrate the TR-TLS, compared with the existing methods.

Submitted 11 November, 2022; v1 submitted 31 October, 2022; originally announced November 2022.

arXiv:2209.11951 [pdf, ps, other]

Almost Nonnegative Ricci curvature and new vanishing theorems for genera

Authors: Xiaoyang Chen, Jian Ge, Fei Han

Abstract: We derive several vanishing theorems for genera under almost nonnegative Ricci curvature and infinite fundamental group, which includes Todd genus, $\widehat{A}$-genus, elliptic genera and Witten genus. A vanishing theorem of Euler characteristic number for almost nonnegatively curved Alexandrov spaces is also proved.

Submitted 24 September, 2022; originally announced September 2022.

arXiv:2209.11156 [pdf, ps, other]

Azadkia-Chatterjee's correlation coefficient adapts to manifold data

Authors: Fang Han, Zhihan Huang

Abstract: In their seminal work, Azadkia and Chatterjee (2021) initiated graph-based methods for measuring variable dependence strength. By appealing to nearest neighbor graphs, they gave an elegant solution to a problem of Rényi (Rényi, 1959). Their idea was later developed in Deb et al. (2020) and the authors there proved that, quite interestingly, Azadkia and Chatterjee's correlation coefficient can automatically adapt to the manifold structure of the data. This paper furthers their study in terms of calculating the statistic's limiting variance under independence and showing that it only depends on the manifold dimension.

Submitted 22 September, 2022; originally announced September 2022.

Comments: 25 pages

arXiv:2209.01003 [pdf, other]

Discrete Schwarz rearrangement in lattice graphs

Authors: Hichem Hajaiej, Fengwen Han, Bobo Hua

Abstract: In this paper, we prove a discrete version of the generalized Riesz inequality on $\mathbb{Z}^d$. As a consequence, we will derive the extended Hardy-Littlewood and Pólya-Szegö inequalities. We will also establish cases of equality in the latter. Our approach is totally novel and self-contained. In particular, we invented a definition for the discrete rearrangement in higher dimensions. Moreover, we show that the definition "suggested" by Pruss does not work. We solve a long-standing open question raised by Alexander Pruss in [Pru98, p494], Duke Math Journal, and discussed with him in several communications in 2009-2010, [Pru10]. Our method also provides a line of attack to prove other discrete rearrangement inequalities and opens the door to the establishment of optimizers of many important discrete functional inequalities in $\mathbb{Z}^d,$ $d\geq2$. We will also discuss some applications of our findings. To the best of our knowledge, our results are the first ones in the literature dealing with discrete rearrangement on $\mathbb{Z}^d,$ $d\geq2$.

Submitted 24 September, 2022; v1 submitted 2 September, 2022; originally announced September 2022.

Comments: We revised some expressions; All comments are welcome

MSC Class: 39B62; 35A15; 47J30

arXiv:2207.03134 [pdf, ps, other]

T-duality with $H$-flux for $2d$ $σ$-models

Authors: Fei Han, Varghese Mathai

Abstract: In this paper, we establish graded T-duality for $2d$ $σ$-models with $H$-flux after localization. This establishes the most general version of T-duality for Type II String Theory. The graded T-duality map, which we call {\bf graded Hori morphism}, is compatible with the Jacobi property of the graded fields, that was earlier studied in \cite{HM21}. Also included are some open problems/conjectures.

Submitted 24 February, 2023; v1 submitted 7 July, 2022; originally announced July 2022.

Comments: 22pp. Section 1.3 added, giving a simpler, equivalent construction of the line bundle on LLM

arXiv:2204.08031 [pdf, ps, other]

Limit theorems of Chatterjee's rank correlation

Authors: Zhexiao Lin, Fang Han

Abstract: Establishing the limiting distribution of Chatterjee's rank correlation for a general, possibly non-independent, pair of random variables has been eagerly awaited by many. This paper shows that (a) Chatterjee's rank correlation is asymptotically normal as long as one variable is not a measurable function of the other, (b) the corresponding asymptotic variance is uniformly bounded by 36, and (c) a consistent variance estimator exists. Similar results also hold for Azadkia-Chatterjee's graph-based correlation coefficient, a multivariate analogue of Chatterjee's original proposal. The proof is given by appealing to Hájek representation and Chatterjee's nearest-neighbor CLT.

Submitted 3 November, 2022; v1 submitted 17 April, 2022; originally announced April 2022.

Comments: Consistent variance estimators of Chatterjee's rank correlation and Azadkia-Chatterjee's graph-based correlation coefficient (applicable to any fixed and continuous distributions) were added to this version

arXiv:2203.14439 [pdf, ps, other]

Fractional structures on bundle gerbe modules and fractional classifying spaces

Authors: Fei Han, Ruizhi Huang, Varghese Mathai

Abstract: We study the homotopy aspects of the twisted Chern classes of torsion bundle gerbe modules. Using Sullivan's rational homotopy theory, we realize the twisted Chern classes at the level of classifying spaces. The construction suggests a notion, which we call fractional U-structure serving as a universal framework to study the twisted Chern classes of torsion bundle gerbe modules from the perspective of classifying spaces. Based on this, we introduce and study higher fractional structures on torsion bundle gerbe modules parallel to the higher structures on ordinary vector bundles.

Submitted 27 March, 2022; originally announced March 2022.

Comments: 54 pages; comments are very welcome

arXiv:2202.09908 [pdf, ps, other]

doi 10.1142/S0129055X22500192

T-duality, vertical holonomy line bundles and loop Hori formulae

Authors: Fei Han, Varghese Mathai

Abstract: This paper is a step towards realizing T-duality and Hori formulae for loop spaces. Here we prove T-duality and Hori formulae for winding q-loop spaces, which are infinite dimensional subspaces of loop spaces.

Submitted 20 February, 2022; originally announced February 2022.

Comments: 23 pages

Journal ref: Vol 34 no.7 (2022) 2250019, 25pp

arXiv:2112.13506 [pdf, ps, other]

Estimation based on nearest neighbor matching: from density ratio to average treatment effect

Authors: Zhexiao Lin, Peng Ding, Fang Han

Abstract: Nearest neighbor (NN) matching as a tool to align data sampled from different groups is both conceptually natural and practically well-used. In a landmark paper, Abadie and Imbens (2006) provided the first large-sample analysis of NN matching under, however, a crucial assumption that the number of NNs, $M$, is fixed. This manuscript reveals something new out of their study and shows that, once allowing $M$ to diverge with the sample size, an intrinsic statistic in their analysis actually constitutes a consistent estimator of the density ratio. Furthermore, through selecting a suitable $M$, this statistic can attain the minimax lower bound of estimation over a Lipschitz density function class. Consequently, with a diverging $M$, the NN matching provably yields a doubly robust estimator of the average treatment effect and is semiparametrically efficient if the density functions are sufficiently smooth and the outcome model is appropriately specified. It can thus be viewed as a precursor of double machine learning estimators.

Submitted 26 December, 2021; originally announced December 2021.

Comments: 73 pages

arXiv:2112.02421 [pdf, ps, other]

Nonparametric mixture MLEs under Gaussian-smoothed optimal transport distance

Authors: Fang Han, Zhen Miao, Yandi Shen

Abstract: The Gaussian-smoothed optimal transport (GOT) framework, pioneered in Goldfeld et al. (2020) and followed up by a series of subsequent papers, has quickly caught attention among researchers in statistics, machine learning, information theory, and related fields. One key observation made therein is that, by adapting to the GOT framework instead of its unsmoothed counterpart, the curse of dimensionality for using the empirical measure to approximate the true data generating distribution can be lifted. The current paper shows that a related observation applies to the estimation of nonparametric mixing distributions in discrete exponential family models, where under the GOT cost the estimation accuracy of the nonparametric MLE can be accelerated to a polynomial rate. This is in sharp contrast to the classical sub-polynomial rates based on unsmoothed metrics, which cannot be improved from an information-theoretical perspective. A key step in our analysis is the establishment of a new Jackson-type approximation bound of Gaussian-convoluted Lipschitz functions. This insight bridges existing techniques of analyzing the nonparametric MLEs and the new GOT framework.

Submitted 4 December, 2021; originally announced December 2021.

Comments: 26 pages

arXiv:2111.15567 [pdf, other]

Distribution-free tests of multivariate independence based on center-outward quadrant, Spearman, Kendall, and van der Waerden statistics

Authors: Hongjian Shi, Mathias Drton, Marc Hallin, Fang Han

Abstract: Due to the lack of a canonical ordering in ${\mathbb R}^d$ for $d>1$, defining multivariate generalizations of the classical univariate ranks has been a long-standing open problem in statistics. Optimal transport has been shown to offer a solution in which multivariate ranks are obtained by transporting data points to a grid that approximates a uniform reference measure (Chernozhukov et al., 2017; Hallin, 2017; Hallin et al., 2021), thereby inducing ranks, signs, and a data-driven ordering of ${\mathbb R}^d$. We take up this new perspective to define and study multivariate analogues of the sign covariance/quadrant statistic, Spearman's rho, Kendall's tau, and van der Waerden covariances. The resulting tests of multivariate independence are fully distribution-free, hence uniformly valid irrespective of the actual (absolutely continuous) distribution of the observations. Our results provide the asymptotic distribution theory for these new test statistics, with asymptotic approximations to critical values to be used for testing independence between random vectors, as well as a power analysis of the resulting tests in an extension of the so-called Konijn model. For the van der Waerden tests, this power analysis includes a multivariate Chernoff--Savage property guaranteeing that, under elliptical generalized Konijn models, the asymptotic relative efficiency with respect to Wilks' classical (pseudo-)Gaussian procedure of our van der Waerden tests is strictly larger than or equal to one, where equality is achieved under Gaussian distributions only. We similarly provide a lower bound for the asymptotic relative efficiency of our Spearman procedure with respect to Wilks' test, thus extending the classical result by Hodges and Lehmann on the asymptotic relative efficiency, in univariate location models, of Wilcoxon tests with respect to the Student ones.

Submitted 12 January, 2024; v1 submitted 30 November, 2021; originally announced November 2021.

Comments: 44 pages

arXiv:2110.11022 [pdf, ps, other]

doi 10.2140/pjm.2024.328.275

Elliptic genus and string cobordism at dimension $24$

Authors: Fei Han, Ruizhi Huang

Abstract: It is known that spin cobordism can be determined by Stiefel-Whitney numbers and index theory invariants, namely $KO$-theoretic Pontryagin numbers. In this paper, we show that string cobordism at dimension 24 can be determined by elliptic genus, a higher index theory invariant. We also compute the image of 24 dimensional string cobordism under elliptic genus. Using our results, we show that under certain curvature conditions, a compact 24 dimensional string manifold must bound a string manifold.

Submitted 21 October, 2021; originally announced October 2021.

Comments: 10 pages; comments are very welcome

Journal ref: Pacific J. Math. 328 (2024) 275-286

arXiv:2110.06489 [pdf, other]

Graphs with nonnegative Ricci curvature and maximum degree at most 3

Authors: Fengwen Han, Tao Wang

Abstract: In this paper, we classify graphs with nonnegative Lin-Lu-Yau-Ollivier Ricci curvature, maximum degree at most 3 and diameter at least 6.

Submitted 13 October, 2021; originally announced October 2021.

MSC Class: 05C99; 51F99; 52C99; 53A40

arXiv:2108.06828 [pdf, other]

On boosting the power of Chatterjee's rank correlation

Authors: Zhexiao Lin, Fang Han

Abstract: Chatterjee (2021)'s ingenious approach to estimating a measure of dependence first proposed by Dette et al. (2013) based on simple rank statistics has quickly caught attention. This measure of dependence has the unusual property of being between 0 and 1, and being 0 or 1 if and only if the corresponding pair of random variables is independent or one is a measurable function of the other almost surely. However, more recent studies (Cao and Bickel, 2020; Shi et al., 2021b) showed that independence tests based on Chatterjee's rank correlation are unfortunately rate-inefficient against various local alternatives and they call for variants. We answer this call by proposing revised Chatterjee's rank correlations that still consistently estimate the same dependence measure but provably achieve near-parametric efficiency in testing against Gaussian rotation alternatives. This is possible via incorporating many right nearest neighbors in constructing the correlation coefficients. We thus overcome the "only one disadvantage" of Chatterjee's rank correlation (Chatterjee, 2021, Section 7).

Submitted 15 August, 2021; originally announced August 2021.

Comments: 65 pages

arXiv:2108.06827 [pdf, ps, other]

On Azadkia-Chatterjee's conditional dependence coefficient

Authors: Hongjian Shi, Mathias Drton, Fang Han

Abstract: In recent work, Azadkia and Chatterjee (2021) laid out an ingenious approach to defining consistent measures of conditional dependence. Their fully nonparametric approach forms statistics based on ranks and nearest neighbor graphs. The appealing nonparametric consistency of the resulting conditional dependence measure and the associated empirical conditional dependence coefficient has quickly prompted follow-up work that seeks to study its statistical efficiency. In this paper, we take up the framework of conditional randomization tests (CRT) for conditional independence and conduct a power analysis that considers two types of local alternatives, namely, parametric quadratic mean differentiable alternatives and nonparametric Hölder smooth alternatives. Our local power analysis shows that conditional independence tests using the Azadkia--Chatterjee coefficient remain inefficient even when aided with the CRT framework, and serves as motivation to develop variants of the approach; cf. Lin and Han (2022b). As a byproduct, we resolve a conjecture of Azadkia and Chatterjee by proving central limit theorems for the considered conditional dependence coefficients, with explicit formulas for the asymptotic variances.

Submitted 22 September, 2022; v1 submitted 15 August, 2021; originally announced August 2021.

Comments: to appear in Bernoulli

arXiv:2103.11413 [pdf, ps, other]

doi 10.1007/s00209-021-02877-6

On characteristic numbers of $24$ dimensional String manifolds

Authors: Fei Han, Ruizhi Huang

Abstract: In this paper, we study the Pontryagin numbers of $24$ dimensional String manifolds. In particular, we find representatives of an integral basis of the String cobordism group at dimension $24$, based on the work of Mahowald-Hopkins \cite{MH02}, Borel-Hirzebruch \cite{BH58} and Wall \cite{Wall62}. This has immediate applications on the divisibility of various characteristic numbers of the manifolds. In particular, we establish the $2$-primary divisibilities of the signature and of the modified signature coupling with the integral Wu class of Hopkins-Singer \cite{HS05}, and also the $3$-primary divisibility of the twisted signature. Our results provide potential clues to understand a question of Teichner.

Submitted 23 October, 2021; v1 submitted 21 March, 2021; originally announced March 2021.

Comments: final version

arXiv:2103.10208 [pdf, ps, other]

Twisted Milnor Hypersurface I

Authors: Jingfang Lian, Fei Han, Hao Li, Zhi Lü

Abstract: In this paper, we study {\bf twisted Milnor hypersurfaces} and compute their $\hat A$-genus and Atiyah-Singer-Milnor $α$-invariant. Our tool to compute the $α$-invariant is Zhang's analytic Rokhlin congruence formula. We also give some applications about group actions and metrics of positive scalar curvature on twisted Milnor hypersurfaces.

Submitted 18 March, 2021; originally announced March 2021.

arXiv:2010.10945 [pdf, other]

A Direct Sampling Method for the Inversion of the Radon Transform

Authors: Yat Tin Chow, Fuqun Han, Jun Zou

Abstract: We propose a novel direct sampling method (DSM) for the effective and stable inversion of the Radon transform. The DSM is based on a generalization of the important almost orthogonality property in classical DSMs to fractional order Sobolev duality products and to a new family of probing functions. The fractional order duality product proves to be able to greatly enhance the robustness of the reconstructions in some practically important but severely ill-posed inverse problems associated with the Radon transform. We present a detailed analysis to better understand the performance of the new probing and index functions, which are crucial to stable and effective numerical reconstructions. The DSM can be computed in a very fast and highly parallel manner. Numerical experiments are carried out to compare the DSM with a popular existing method, and to illustrate the efficiency, stability, and accuracy of the DSM.

Submitted 21 October, 2020; originally announced October 2020.

MSC Class: 44A12; 65R32; 92C55; 94A08

arXiv:2009.12793 [pdf, other]

Uniqueness class of solutions to a class of linear evolution equations

Authors: Fengwen Han, Bobo Hua

Abstract: In this paper, we study the wave equation on infinite graphs. On one hand, in contrast to the wave equation on manifolds, we construct an example for the non-uniqueness for the Cauchy problem of the wave equation on graphs. On the other hand, we obtain a sharp uniqueness class for the solutions of the wave equation. The result follows from the time analyticity of the solutions to the wave equation in the uniqueness class. In the last part, we extend the result to a wide class of linear evolution equations.

Submitted 16 November, 2023; v1 submitted 27 September, 2020; originally announced September 2020.

MSC Class: 35R02; 35A02

arXiv:2008.11619 [pdf, ps, other]

On the power of Chatterjee rank correlation

Authors: Hongjian Shi, Mathias Drton, Fang Han

Abstract: Chatterjee (2021) introduced a simple new rank correlation coefficient that has attracted much recent attention. The coefficient has the unusual appeal that it not only estimates a population quantity first proposed by Dette et al. (2013) that is zero if and only if the underlying pair of random variables is independent, but also is asymptotically normal under independence. This paper compares Chatterjee's new correlation coefficient to three established rank correlations that also facilitate consistent tests of independence, namely, Hoeffding's $D$, Blum-Kiefer-Rosenblatt's $R$, and Bergsma-Dassios-Yanagimoto's $τ^*$. We contrast their computational efficiency in light of recent advances, and investigate their power against local rotation and mixture alternatives. Our main results show that Chatterjee's coefficient is unfortunately rate sub-optimal compared to $D$, $R$, and $τ^*$. The situation is more subtle for a related earlier estimator of Dette et al. (2013). These results favor $D$, $R$, and $τ^*$ over Chatterjee's new correlation coefficient for the purpose of testing independence.

Submitted 25 April, 2021; v1 submitted 26 August, 2020; originally announced August 2020.

Comments: to appear in Biometrika

arXiv:2007.02186 [pdf, other]

On universally consistent and fully distribution-free rank tests of vector independence

Authors: Hongjian Shi, Marc Hallin, Mathias Drton, Fang Han

Abstract: Rank correlations have found many innovative applications in the last decade. In particular, suitable rank correlations have been used for consistent tests of independence between pairs of random variables. Using ranks is especially appealing for continuous data as tests become distribution-free. However, the traditional concept of ranks relies on ordering data and is, thus, tied to univariate observations. As a result, it has long remained unclear how one may construct distribution-free yet consistent tests of independence between random vectors. This is the problem addressed in this paper, in which we lay out a general framework for designing dependence measures that give tests of multivariate independence that are not only consistent and distribution-free but which we also prove to be statistically efficient. Our framework leverages the recently introduced concept of center-outward ranks and signs, a multivariate generalization of traditional ranks, and adopts a common standard form for dependence measures that encompasses many popular examples. In a unified study, we derive a general asymptotic representation of center-outward rank-based test statistics under independence, extending to the multivariate setting the classical Hájek asymptotic representation results. This representation permits direct calculation of limiting null distributions and facilitates a local power analysis that provides strong support for the center-outward approach by establishing, for the first time, the nontrivial power of center-outward rank-based tests over root-$n$ neighborhoods within the class of quadratic mean differentiable alternatives.

Submitted 2 May, 2021; v1 submitted 4 July, 2020; originally announced July 2020.

Comments: 52 pages with title changed and more materials put in, including, particularly, a more general local power analysis covering many smooth alternatives beyond the Konijn ones

arXiv:2005.05499 [pdf, ps, other]

A Direct Sampling Method for Simultaneously Recovering Inhomogeneous Inclusions of Different Nature

Authors: Yat Tin Chow, Fuqun Han, Jun Zou

Abstract: In this work, we investigate a class of elliptic inverse problems and aim to simultaneously recover multiple inhomogeneous inclusions arising from two different physical parameters, using very limited boundary Cauchy data collected only at one or two measurement events. We propose a new fast, stable and highly parallelable direct sampling method (DSM) for the simultaneous reconstruction process. Two groups of probing and index functions are constructed, and their desired properties are analyzed. In order to identify and decouple the multiple inhomogeneous inclusions of different physical nature, we introduce a new concept of mutually almost orthogonality property that generalizes the important concept of almost orthogonality property in classical DSMs for inhomogeneous inclusions of same physical nature. With the help of this new concept, we develop a reliable strategy to distinguish two different types of inhomogeneous inclusions with noisy data collected at one or two measurement events. We further improve the decoupling effect by choosing an appropriate boundary influx. Numerical experiments are presented to illustrate the robustness and efficiency of the proposed method.

Submitted 4 June, 2020; v1 submitted 11 May, 2020; originally announced May 2020.

MSC Class: 35J67; 35R30; 65N21; 78M25

arXiv:2005.02344 [pdf, ps, other]

doi 10.1016/j.aim.2021.108023

Cubic forms, anomaly cancellation and modularity

Authors: Fei Han, Ruizhi Huang, Kefeng Liu, Weiping Zhang

Abstract: Motivated by the cubic forms and anomaly cancellation formulas of Witten-Freed-Hopkins, we give some new cubic forms on spin, spin$^c$, spin$^{w_2}$ and orientable 12-manifolds respectively. We relate them to $η$-invariants when the manifolds are with boundary, and mod 2 indices on 10 dimensional characteristic submanifolds when the manifolds are spin$^c$ or spin$^{w_2}$. Our method of producing these cubic forms is a combination of (generalized) Witten classes and the character of the basic representation of affine $E_8$.

Submitted 23 October, 2021; v1 submitted 5 May, 2020; originally announced May 2020.

Comments: final version

arXiv:2004.10922 [pdf, other]

On a phase transition in general order spline regression

Authors: Yandi Shen, Qiyang Han, Fang Han

Abstract: In the Gaussian sequence model $Y= θ_0 + \varepsilon$ in $\mathbb{R}^n$, we study the fundamental limit of approximating the signal $θ_0$ by a class $Θ(d,d_0,k)$ of (generalized) splines with free knots. Here $d$ is the degree of the spline, $d_0$ is the order of differentiability at each inner knot, and $k$ is the maximal number of pieces. We show that, given any integer $d\geq 0$ and $d_0\in\{-1,0,\ldots,d-1\}$, the minimax rate of estimation over $Θ(d,d_0,k)$ exhibits the following phase transition: \begin{equation*} \begin{aligned} \inf_{\widetilde{θ}}\sup_{θ \in Θ(d,d_0,k)}\mathbb{E}_{θ}\|\widetilde{θ} - θ\|^2 \asymp_d \begin{cases} k\log\log(16n/k), & 2\leq k\leq k_0,\\ k\log(en/k), & k \geq k_0+1. \end{cases} \end{aligned} \end{equation*} The transition boundary $k_0$, which takes the form $\lfloor (d+1)/(d-d_0) \rfloor + 1$, demonstrates the critical role of the regularity parameter $d_0$ in the separation between a faster $\log \log(16n)$ and a slower $\log(en)$ rate. We further show that, once encouraging an additional '$d$-monotonicity' shape constraint (including monotonicity for $d = 0$ and convexity for $d=1$), the above phase transition is eliminated and the faster $k\log\log(16n/k)$ rate can be achieved for all $k$. These results provide theoretical support for developing $\ell_0$-penalized (shape-constrained) spline regression procedures as useful alternatives to $\ell_1$- and $\ell_2$-penalized ones.

Submitted 6 May, 2020; v1 submitted 22 April, 2020; originally announced April 2020.

arXiv:2003.09897 [pdf, ps, other]

New Bochner type theorems

Authors: Xiaoyang Chen, Fei Han

Abstract: A classical theorem of Bochner asserts that the isometry group of a compact Riemannian manifold with negative Ricci curvature is finite. In this paper we give several extensions of Bochner's theorem by allowing "small" positive Ricci curvature.

Submitted 3 August, 2022; v1 submitted 22 March, 2020; originally announced March 2020.

Comments: 23 pages

MSC Class: 53C20

arXiv:2001.01297 [pdf, ps, other]

Exponential inequalities for dependent V-statistics via random Fourier features

Authors: Yandi Shen, Fang Han, Daniela Witten

Abstract: We establish exponential inequalities for a class of V-statistics under strong mixing conditions. Our theory is developed via a novel kernel expansion based on random Fourier features and the use of a probabilistic method. This type of expansion is new and useful for handling many notorious classes of kernels.

Submitted 5 January, 2020; originally announced January 2020.

Comments: This is the first part of the arxiv preprint (arXiv:1902.02761), and is to appear in Electronic Journal of Probability (EJP). The second part of the arxiv preprint will be submitted to a statistical journal

arXiv:2001.00322 [pdf, ps, other]

doi 10.4310/ATMP.2021.v25.n5.a3

T-Duality, Jacobi Forms and Witten Gerbe Modules

Authors: Fei Han, Varghese Mathai

Abstract: In this paper, we extend the T-duality Hori maps in [arXiv:hep-th/0306062], inducing isomorphisms of twisted cohomologies on T-dual circle bundles, to graded Hori maps and show that they induce isomorphisms of two-variable series of twisted cohomologies on the T-dual circle bundles, preserving Jacobi form properties. The composition of the graded Hori map with its dual equals the Euler operator. We also construct Witten gerbe modules arising from gerbe modules and show that their graded twisted Chern characters are Jacobi forms under an anomaly vanishing condition on gerbe modules, thereby giving interesting examples.

Submitted 20 February, 2020; v1 submitted 1 January, 2020; originally announced January 2020.

Comments: 24 pages, minor corrections, references added

MSC Class: 58J26

Journal ref: Adv.Theor.Math.Phys. 25 no.5: 1235-1266, 2021

arXiv:1911.02280 [pdf, ps, other]

Time analyticity of ancient solutions to the heat equation on graphs

Authors: Fengwen Han, Bobo Hua, Lili Wang

Abstract: We study the time analyticity of ancient solutions to heat equations on graphs. Analogous to Dong and Zhang [DZ19], we prove the time analyticity of ancient solutions on graphs under some sharp growth condition.

Submitted 6 November, 2019; originally announced November 2019.

Comments: 12 pages

MSC Class: 05C10; 31C05

arXiv:1909.10024 [pdf, other]

Distribution-free consistent independence tests via center-outward ranks and signs

Authors: Hongjian Shi, Mathias Drton, Fang Han

Abstract: This paper investigates the problem of testing independence of two random vectors of general dimensions. For this, we give for the first time a distribution-free consistent test. Our approach combines distance covariance with the center-outward ranks and signs developed in Hallin (2017). In technical terms, the proposed test is consistent and distribution-free in the family of multivariate distributions with nonvanishing (Lebesgue) probability densities. Exploiting the (degenerate) U-statistic structure of the distance covariance and the combinatorial nature of Hallin's center-outward ranks and signs, we are able to derive the limiting null distribution of our test statistic. The resulting asymptotic approximation is accurate already for moderate sample sizes and makes the test implementable without requiring permutation. The limiting distribution is derived via a more general result that gives a new type of combinatorial non-central limit theorem for double- and multiple-indexed permutation statistics.

Submitted 9 June, 2020; v1 submitted 22 September, 2019; originally announced September 2019.

Comments: to appear in JASA T&M

arXiv:1908.05255 [pdf, other]

On rank estimators in increasing dimensions

Authors: Yanqin Fan, Fang Han, Wei Li, Xiao-Hua Zhou

Abstract: The family of rank estimators, including Han's maximum rank correlation (Han, 1987) as a notable example, has been widely exploited in studying regression problems. For these estimators, although the linear index is introduced for alleviating the impact of dimensionality, the effect of large dimension on inference is rarely studied. This paper fills this gap via studying the statistical properties of a larger family of M-estimators, whose objective functions are formulated as U-processes and may be discontinuous in increasing dimension set-up where the number of parameters, $p_{n}$, in the model is allowed to increase with the sample size, $n$. First, we find that often in estimation, as $p_{n}/n\rightarrow 0$, $(p_{n}/n)^{1/2}$ rate of convergence is obtainable. Second, we establish Bahadur-type bounds and study the validity of normal approximation, which we find often requires a much stronger scaling requirement than $p_{n}^{2}/n\rightarrow 0.$ Third, we state conditions under which the numerical derivative estimator of asymptotic covariance matrix is consistent, and show that the step size in implementing the covariance estimator has to be adjusted with respect to $p_{n}$. All theoretical results are further backed up by simulation studies.

Submitted 14 August, 2019; originally announced August 2019.

Comments: to appear in Journal of Econometrics

arXiv:1907.06577 [pdf, ps, other]

Probability inequalities for high dimensional time series under a triangular array framework

Authors: Fang Han, Weibiao Wu

Abstract: Study of time series data often involves measuring the strength of temporal dependence, on which statistical properties like consistency and central limit theorem are built. Historically, various dependence measures have been proposed. In this note, we first survey some of the most well-used dependence measures as well as various probability and moment inequalities built upon them under a high-dimensional triangular array time series setting. We then argue that this triangular array setting will pose substantially new challenges to the verification of some dependence conditions. In particular, "textbook results" could now be misleading, and hence are recommended to be used with caution.

Submitted 15 July, 2019; originally announced July 2019.

Comments: This is an invited short survey paper

arXiv:1906.09639 [pdf, ps, other]

Asymptotic joint distribution of extreme eigenvalues and trace of large sample covariance matrix in a generalized spiked population model

Authors: Zeng Li, Fang Han, Jianfeng Yao

Abstract: This paper studies the joint limiting behavior of extreme eigenvalues and trace of large sample covariance matrix in a generalized spiked population model, where the asymptotic regime is such that the dimension and sample size grow proportionally. The form of the joint limiting distribution is applied to conduct Johnson-Graybill-type tests, a family of approaches testing for signals in a statistical model. For this, higher order correction is further made, helping alleviate the impact of finite-sample bias. The proof rests on determining the joint asymptotic behavior of two classes of spectral processes, corresponding to the extreme and linear spectral statistics respectively.

Submitted 23 June, 2019; originally announced June 2019.

Comments: to appear in the Annals of Statistics

arXiv:1905.02093 [pdf, ps, other]

String$\mathbf{^c}$ Structures and Modular Invariants

Authors: Haibao Duan, Fei Han, Ruizhi Huang

Abstract: In this paper, we study some algebraic topology aspects of String$^c$ structures, more precisely, from the perspective of Whitehead tower and the perspective of the loop group of $Spin^c(n)$. We also extend the generalized Witten genera constructed for the first time in \cite{CHZ11} to correspond to String$^c$ structures of various levels and give vanishing results for them.

Submitted 17 September, 2020; v1 submitted 6 May, 2019; originally announced May 2019.

Comments: 41 pages; to appear in the Transactions of the American Mathematical Society

Showing 1–50 of 91 results for author: Han, F