-
TiNO-Edit: Timestep and Noise Optimization for Robust Diffusion-Based Image Editing
Abstract: Despite many attempts to leverage pre-trained text-to-image models (T2I) like Stable Diffusion (SD) for controllable image editing, producing good predictable results remains a challenge. Previous approaches have focused on either fine-tuning pre-trained T2I models on specific datasets to generate certain kinds of images (e.g., with a specific object or person), or on optimizing the weights, text…
Submitted 17 April, 2024; originally announced April 2024.
Comments: Conference on Computer Vision and Pattern Recognition (CVPR) 2024
-
arXiv:2205.10508 [pdf, ps, other]
A Review on the Optimal Fingerprinting Approach in Climate Change Studies
Abstract: We provide a review of the "optimal fingerprinting" approach as summarized in Allen and Tett (1999) from the point of view of statistical inference in light of the recent criticism by McKitrick (2021). Our review finds that the "optimal fingerprinting" approach would survive much of McKitrick (2021)'s criticism under two conditions: (i) the null simulation of the climate model is independent of the phy…
Submitted 21 May, 2022; originally announced May 2022.
-
Web-Scale Generic Object Detection at Microsoft Bing
Abstract: In this paper, we present Generic Object Detection (GenOD), one of the largest object detection systems deployed to a web-scale general visual search engine that can detect over 900 categories for all Microsoft Bing Visual Search queries in near real-time. It acts as a fundamental visual query understanding service that provides object-centric information and shows gains in multiple production sce…
Submitted 5 July, 2021; originally announced July 2021.
Comments: In Proceedings of the 27th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD) 2021, Virtual Event, Singapore
-
arXiv:2011.02258 [pdf, ps, other]
Concentration Inequalities for Statistical Inference
Abstract: This paper gives a review of concentration inequalities which are widely employed in non-asymptotical analyses of mathematical statistics in a wide range of settings, from distribution-free to distribution-dependent, from sub-Gaussian to sub-exponential, sub-Gamma, and sub-Weibull random variables, and from the mean to the maximum concentration. This review provides results in these settings with…
Submitted 28 March, 2021; v1 submitted 4 November, 2020; originally announced November 2020.
Comments: Invited review article on constants-specified concentration inequalities published in Communications in Mathematical Research
MSC Class: 60F10; 60G50; 62E17
Journal ref: Communications in Mathematical Research. 37(1), 1-85 (2021)
-
Analyzing China's Consumer Price Index Comparatively with that of United States
Abstract: This paper provides a thorough analysis of the dynamic structures and predictability of China's Consumer Price Index (CPI-CN), with a comparison to those of the United States. Despite the differences in the two leading economies, both series can be well modeled by a class of Seasonal Autoregressive Integrated Moving Average Model with Covariates (S-ARIMAX). The CPI-CN series possess regular patter…
Submitted 29 October, 2019; originally announced October 2019.
-
arXiv:1910.13074 [pdf, ps, other]
Multi-level Thresholding Test for High Dimensional Covariance Matrices
Abstract: We consider testing the equality of two high-dimensional covariance matrices by carrying out a multi-level thresholding procedure, which is designed to detect sparse and faint differences between the covariances. A novel U-statistic composition is developed to establish the asymptotic distribution of the thresholding statistics in conjunction with the matrix blocking and the coupling techniques. W…
Submitted 28 October, 2019; originally announced October 2019.
-
arXiv:1812.07813 [pdf, ps, other]
Matrix Completion under Low-Rank Missing Mechanism
Abstract: Matrix completion is a modern missing data problem where both the missing structure and the underlying parameter are high dimensional. Although the missing structure is a key component of any missing data problem, existing matrix completion methods often assume a simple uniform missing mechanism. In this work, we study matrix completion from corrupted data under a novel low-rank missing mechanism. Th…
Submitted 19 March, 2020; v1 submitted 19 December, 2018; originally announced December 2018.
Comments: 29 pages, 0 figures
-
Distributed Statistical Inference for Massive Data
Abstract: This paper considers distributed statistical inference for general symmetric statistics, which encompass the U-statistics and the M-estimators, in the context of massive data where the data can be stored at multiple platforms in different locations. In order to facilitate effective computation and to avoid expensive communication among different platforms, we formulate distributed statistics which…
Submitted 28 May, 2018; originally announced May 2018.
-
arXiv:1805.10742 [pdf, ps, other]
High-dimensional empirical likelihood inference
Abstract: High-dimensional statistical inference with general estimating equations is challenging and remains less explored. In this paper, we study two problems in the area: confidence set estimation for multiple components of the model parameters, and model specification tests. For the first one, we propose to construct a new set of estimating equations such that the impact from estimating the high-dimens…
Submitted 6 November, 2019; v1 submitted 27 May, 2018; originally announced May 2018.
Comments: The original title of this paper is "High-dimensional statistical inferences with over-identification: confidence set estimation and specification test"
Journal ref: Biometrika 2021, Vol. 108, No. 1, 127-147
-
Two-Sample Tests for High Dimensional Means with Thresholding and Data Transformation
Abstract: We consider testing for two-sample means of high dimensional populations by thresholding. Two tests are investigated, which are designed for better power performance when the two population mean vectors differ only in sparsely populated coordinates. The first test is constructed by carrying out thresholding to remove the non-signal bearing dimensions. The second test combines data transformation v…
Submitted 10 October, 2014; originally announced October 2014.
Comments: 64 pages
-
arXiv:1402.4882 [pdf, ps, other]
Tests for High Dimensional Generalized Linear Models
Abstract: We consider testing regression coefficients in high dimensional generalized linear models. An investigation of the test of Goeman et al. (2011) is conducted, which reveals that if the inverse of the link function is unbounded, the high dimensionality in the covariates can impose adverse impacts on the power of the test. We propose a test formation which can avoid the adverse impact of the high dim…
Submitted 19 February, 2014; originally announced February 2014.
Comments: The research paper was stolen by someone last November and illegally submitted to arXiv by a person named gong zi jiang nan. We have asked arXiv to withdraw the unfinished paper [arXiv:1311.4043], and it was removed last December. We have collected enough evidence to identify the person, and Peking University has begun to investigate the plagiarizer.
-
arXiv:1312.5103 [pdf, ps, other]
Tests alternative to higher criticism for high-dimensional means under sparsity and column-wise dependence
Abstract: We consider two alternative tests to the Higher Criticism test of Donoho and Jin [Ann. Statist. 32 (2004) 962-994] for high-dimensional means under the sparsity of the nonzero means for sub-Gaussian distributed data with unknown column-wise dependence. The two alternative test statistics are constructed by first thresholding $L_1$ and $L_2$ statistics based on the sample means, respectively, follo…
Submitted 18 December, 2013; originally announced December 2013.
Comments: Published at http://dx.doi.org/10.1214/13-AOS1168 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
Report number: IMS-AOS-AOS1168
Journal ref: Annals of Statistics 2013, Vol. 41, No. 6, 2820-2851
-
arXiv:1308.5732 [pdf, ps, other]
High dimensional generalized empirical likelihood for moment restrictions with dependent data
Abstract: This paper considers the maximum generalized empirical likelihood (GEL) estimation and inference on parameters identified by high dimensional moment restrictions with weakly dependent data when the dimensions of the moment restrictions and the parameters diverge along with the sample size. The consistency with rates and the asymptotic normality of the GEL estimator are obtained by properly restric…
Submitted 27 January, 2015; v1 submitted 26 August, 2013; originally announced August 2013.
Journal ref: Journal of Econometrics 2015, Vol. 185, No. 1, 283-304
-
arXiv:1302.0122 [pdf, ps, other]
Parameter estimation and model testing for Markov processes via conditional characteristic functions
Abstract: Markov processes are used in a wide range of disciplines, including finance. The transition densities of these processes are often unknown. However, the conditional characteristic functions are more likely to be available, especially for Lévy-driven processes. We propose an empirical likelihood approach, for both parameter estimation and model specification testing, based on the conditional charac…
Submitted 1 February, 2013; originally announced February 2013.
Comments: Published at http://dx.doi.org/10.3150/11-BEJ400 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
Report number: IMS-BEJ-BEJ400
Journal ref: Bernoulli 2013, Vol. 19, No. 1, 228-251
-
arXiv:1211.2979 [pdf, ps, other]
ANOVA for longitudinal data with missing values
Abstract: We carry out ANOVA comparisons of multiple treatments for longitudinal studies with missing values. The treatment effects are modeled semiparametrically via a partially linear regression which is flexible in quantifying the time effects of treatments. The empirical likelihood is employed to formulate model-robust nonparametric ANOVA tests for treatment effects with respect to covariates, the nonpa…
Submitted 13 November, 2012; originally announced November 2012.
Comments: Published at http://dx.doi.org/10.1214/10-AOS824 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
Report number: IMS-AOS-AOS824
Journal ref: Annals of Statistics 2010, Vol. 38, No. 6, 3630-3659
-
arXiv:1208.3321 [pdf, ps, other]
Test for bandedness of high-dimensional covariance matrices and bandwidth estimation
Abstract: Motivated by the latest effort to employ banded matrices to estimate a high-dimensional covariance $Σ$, we propose a test for $Σ$ being banded with possible diverging bandwidth. The test is adaptive to the "large $p$, small $n$" situations without assuming a specific parametric distribution for the data. We also formulate a consistent estimator for the bandwidth of a banded high-dimensional covari…
Submitted 16 August, 2012; originally announced August 2012.
Comments: Published at http://dx.doi.org/10.1214/12-AOS1002 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
Report number: IMS-AOS-AOS1002
Journal ref: Annals of Statistics 2012, Vol. 40, No. 3, 1285-1314
-
arXiv:1206.0917 [pdf, ps, other]
Two sample tests for high-dimensional covariance matrices
Abstract: We propose two tests for the equality of covariance matrices between two high-dimensional populations. One test is on the whole variance--covariance matrices, and the other is on off-diagonal sub-matrices, which define the covariance between two nonoverlapping segments of the high-dimensional random vectors. The tests are applicable (i) when the data dimension is much larger than the sample sizes,…
Submitted 5 June, 2012; originally announced June 2012.
Comments: Published at http://dx.doi.org/10.1214/12-AOS993 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
Report number: IMS-AOS-AOS993
Journal ref: Annals of Statistics 2012, Vol. 40, No. 2, 908-940
-
arXiv:1203.2004 [pdf, ps, other]
On the approximate maximum likelihood estimation for diffusion processes
Abstract: The transition density of a diffusion process does not admit an explicit expression in general, which prevents the full maximum likelihood estimation (MLE) based on discretely observed sample paths. Aït-Sahalia [J. Finance 54 (1999) 1361--1395; Econometrica 70 (2002) 223--262] proposed asymptotic expansions to the transition densities of diffusion processes, which lead to an approximate maximum li…
Submitted 9 March, 2012; originally announced March 2012.
Comments: Published at http://dx.doi.org/10.1214/11-AOS922 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
Report number: IMS-AOS-AOS922
Journal ref: Annals of Statistics 2011, Vol. 39, No. 6, 2820-2851
-
arXiv:1002.4547 [pdf, ps, other]
A two-sample test for high-dimensional data with applications to gene-set testing
Abstract: We propose a two-sample test for the means of high-dimensional data when the data dimension is much larger than the sample size. Hotelling's classical $T^2$ test does not work for this "large $p$, small $n$" situation. The proposed test does not require explicit conditions in the relationship between the data dimension and sample size. This offers much flexibility in analyzing high-dimensional d…
Submitted 24 February, 2010; originally announced February 2010.
Comments: Published at http://dx.doi.org/10.1214/09-AOS716 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
Report number: IMS-AOS-AOS716
MSC Class: 62H15; 60K35 (Primary) 62G10 (Secondary)
Journal ref: Annals of Statistics 2010, Vol. 38, No. 2, 808-835
-
arXiv:1001.1667 [pdf, ps, other]
A goodness-of-fit test for parametric and semi-parametric models in multiresponse regression
Abstract: We propose an empirical likelihood test that is able to test the goodness of fit of a class of parametric and semi-parametric multiresponse regression models. The class includes as special cases fully parametric models; semi-parametric models, like the multiindex and the partially linear models; and models with shape constraints. Another feature of the test is that it allows both the response va…
Submitted 11 January, 2010; originally announced January 2010.
Comments: Published at http://dx.doi.org/10.3150/09-BEJ208 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
Report number: IMS-BEJ-BEJ208
Journal ref: Bernoulli 2009, Vol. 15, No. 4, 955-976
-
arXiv:0903.0726 [pdf, ps, other]
Empirical likelihood for estimating equations with missing values
Abstract: We consider an empirical likelihood inference for parameters defined by general estimating equations when some components of the random observations are subject to missingness. As the nature of the estimating equations is wide-ranging, we propose a nonparametric imputation of the missing values from a kernel estimator of the conditional distribution of the missing variable given the always obser…
Submitted 4 March, 2009; originally announced March 2009.
Comments: Published at http://dx.doi.org/10.1214/07-AOS585 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
Report number: IMS-AOS-AOS585
MSC Class: 62G05 (Primary) 62G20 (Secondary)
Journal ref: Annals of Statistics 2009, Vol. 37, No. 1, 490-517
-
arXiv:math/0702123 [pdf, ps, other]
A test for model specification of diffusion processes
Abstract: We propose a test for model specification of a parametric diffusion process based on a kernel estimation of the transitional density of the process. The empirical likelihood is used to formulate a statistic, for each kernel smoothing bandwidth, which is effectively a Studentized $L_2$-distance between the kernel transitional density estimator and the parametric transitional density implied by th…
Submitted 12 March, 2008; v1 submitted 6 February, 2007; originally announced February 2007.
Comments: Published at http://dx.doi.org/10.1214/009053607000000659 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
Report number: IMS-AOS-AOS0288
MSC Class: 62G05 (Primary); 62J02 (Secondary)
Journal ref: Annals of Statistics 2008, Vol. 36, No. 1, 167-198