
Combining Experimental and Historical Data for Policy Evaluation

Authors: Ting Li, Chengchun Shi, Qianglin Wen, Yang Sui, Yongli Qin, Chunbo Lai, Hongtu Zhu

Abstract: This paper studies policy evaluation with multiple data sources, especially in scenarios that involve one experimental dataset with two arms, complemented by a historical dataset generated under a single control arm. We propose novel data integration methods that linearly integrate base policy value estimators constructed based on the experimental and historical data, with weights optimized to minimize the mean square error (MSE) of the resulting combined estimator. We further apply the pessimistic principle to obtain more robust estimators, and extend these developments to sequential decision making. Theoretically, we establish non-asymptotic error bounds for the MSEs of our proposed estimators, and derive their oracle, efficiency and robustness properties across a broad spectrum of reward shift scenarios. Numerical experiments and real-data-based analyses from a ridesharing company demonstrate the superior performance of the proposed estimators.

Submitted 1 June, 2024; originally announced June 2024.

arXiv:2402.04828 [pdf, other]

What drives the European carbon market? Macroeconomic factors and forecasts

Authors: Andrea Bastianin, Elisabetta Mirto, Yan Qin, Luca Rossini

Abstract: Putting a price on carbon -- with taxes or developing carbon markets -- is a widely used policy measure to achieve the target of net-zero emissions by 2050. This paper tackles the issue of producing point, direction-of-change, and density forecasts for the monthly real price of carbon within the EU Emissions Trading Scheme (EU ETS). We aim to uncover supply- and demand-side forces that can contribute to improving the prediction accuracy of models at short- and medium-term horizons. We show that a simple Bayesian Vector Autoregressive (BVAR) model, augmented with either one or two factors capturing a set of predictors affecting the price of carbon, provides substantial accuracy gains over a wide set of benchmark forecasts, including survey expectations and forecasts made available by data providers. We extend the study to verified emissions and demonstrate that, in this case, adding stochastic volatility can further improve the forecasting performance of a single-factor BVAR model. We rely on emissions and price forecasts to build market monitoring tools that track demand and price pressure in the EU ETS market. Our results are relevant for policymakers and market practitioners interested in monitoring the carbon market dynamics.

Submitted 20 February, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

Comments: The Supplementary Material is available upon request to the authors

arXiv:2401.14142 [pdf, other]

Energy-Based Concept Bottleneck Models: Unifying Prediction, Concept Intervention, and Probabilistic Interpretations

Authors: Xinyue Xu, Yi Qin, Lu Mi, Hao Wang, Xiaomeng Li

Abstract: Existing methods, such as concept bottleneck models (CBMs), have been successful in providing concept-based interpretations for black-box deep learning models. They typically work by predicting concepts given the input and then predicting the final class label given the predicted concepts. However, (1) they often fail to capture the high-order, nonlinear interaction between concepts, e.g., correcting a predicted concept (e.g., "yellow breast") does not help correct highly correlated concepts (e.g., "yellow belly"), leading to suboptimal final accuracy; (2) they cannot naturally quantify the complex conditional dependencies between different concepts and class labels (e.g., for an image with the class label "Kentucky Warbler" and a concept "black bill", what is the probability that the model correctly predicts another concept "black crown"), therefore failing to provide deeper insight into how a black-box model works. In response to these limitations, we propose Energy-based Concept Bottleneck Models (ECBMs). Our ECBMs use a set of neural networks to define the joint energy of candidate (input, concept, class) tuples. With such a unified interface, prediction, concept correction, and conditional dependency quantification are then represented as conditional probabilities, which are generated by composing different energy functions. Our ECBMs address both limitations of existing CBMs, providing higher accuracy and richer concept interpretations. Empirical results show that our approach outperforms the state-of-the-art on real-world datasets.

Submitted 26 February, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

Comments: Accepted by ICLR 2024

arXiv:2312.12022 [pdf, other]

LightGCNet: A Lightweight Geometric Constructive Neural Network for Data-Driven Soft sensors

Authors: **g Nan, Yan Qin, Wei Dai, Chau Yuen

Abstract: Data-driven soft sensors provide a potentially cost-effective and more accurate modeling approach to measure difficult-to-measure indices in industrial processes compared to mechanistic approaches. Artificial intelligence (AI) techniques, such as deep learning, have become a popular soft sensor modeling approach in the area of machine learning and big data. However, soft sensor models based on deep learning potentially lead to complex model structures and excessive training time. In addition, industrial processes often rely on distributed control systems (DCS) characterized by resource constraints. Herein, guided by spatial geometry, a lightweight geometric constructive neural network, namely LightGCNet, is proposed, which utilizes a compact angle constraint to assign the hidden parameters from dynamic intervals. At the same time, a node pool strategy and spatial geometric relationships are used to visualize and optimize the process of assigning hidden parameters, enhancing interpretability. In addition, the universal approximation property of LightGCNet is proved by spatial geometric analysis. Two algorithmic implementations of LightGCNet are presented in this article. Simulation results concerning both benchmark datasets and the ore grinding process indicate remarkable merits of LightGCNet in terms of small network size, fast learning speed, and sound generalization.

Submitted 19 December, 2023; originally announced December 2023.

Comments: arXiv admin note: text overlap with arXiv:2307.00185

arXiv:2312.11393 [pdf, other]

Assessing Estimation Uncertainty under Model Misspecification

Authors: Rong Li, Yichen Qin, Yang Li

Abstract: Model misspecification is ubiquitous in data analysis because the data-generating process is often complex and mathematically intractable. Therefore, assessing estimation uncertainty and conducting statistical inference under a possibly misspecified working model is unavoidable. In such a case, classical methods such as bootstrap and asymptotic theory-based inference frequently fail since they rely heavily on the model assumptions. In this article, we provide a new bootstrap procedure, termed local residual bootstrap, to assess estimation uncertainty under model misspecification for generalized linear models. By resampling the residuals from the neighboring observations, we can approximate the sampling distribution of the statistic of interest accurately. Instead of relying on the score equations, the proposed method directly recreates the response variables so that we can easily conduct standard error estimation, confidence interval construction, hypothesis testing, and model evaluation and selection. It performs similarly to classical bootstrap when the model is correctly specified and provides a more accurate assessment of uncertainty under model misspecification, offering data analysts an easy way to guard against the impact of misspecified models. We establish desirable theoretical properties, such as the bootstrap validity, for the proposed method using the surrogate residuals. Numerical results and real data analysis further demonstrate the superiority of the proposed method.

Submitted 18 December, 2023; originally announced December 2023.

arXiv:2312.07727 [pdf, other]

Two-sample inference for sparse functional data

Authors: Chi Zhang, Peijun Sang, Yingli Qin

Abstract: We propose a novel test procedure for comparing mean functions across two groups within the reproducing kernel Hilbert space (RKHS) framework. Our proposed method is adept at handling sparsely and irregularly sampled functional data when observation times are random for each subject. Conventional approaches, which are built upon functional principal components analysis, usually assume a homogeneous covariance structure across groups. Nonetheless, justifying this assumption in real-world scenarios can be challenging. To eliminate the need for a homogeneous covariance structure, we first develop the functional Bahadur representation for the mean estimator under the RKHS framework; this representation naturally leads to the desirable pointwise limiting distributions. Moreover, we establish weak convergence for the mean estimator, allowing us to construct a test statistic for the mean difference. Our method is easily implementable and outperforms some conventional tests in controlling type I errors across various settings. We demonstrate the finite sample performance of our approach through extensive simulations and two real-world applications.

Submitted 29 December, 2023; v1 submitted 12 December, 2023; originally announced December 2023.

arXiv:2309.05697 [pdf, other]

21cmEMU: an emulator of 21cmFAST summary observables

Authors: Daniela Breitman, Andrei Mesinger, Steven Murray, David Prelogovic, Yuxiang Qin, Roberto Trotta

Abstract: Recent years have witnessed rapid progress in observations of the Epoch of Reionization (EoR). These have enabled high-dimensional inference of galaxy and intergalactic medium (IGM) properties during the first billion years of our Universe. However, even using efficient, semi-numerical simulations, traditional inference approaches that compute 3D lightcones on-the-fly can take $10^5$ core hours. Here we present 21cmEMU: an emulator of several summary observables from the popular 21cmFAST simulation code. 21cmEMU takes as input nine parameters characterizing EoR galaxies, and outputs the following summary statistics: (i) the IGM mean neutral fraction; (ii) the 21-cm power spectrum; (iii) the mean 21-cm spin temperature; (iv) the sky-averaged (global) 21-cm signal; (v) the ultraviolet (UV) luminosity functions (LFs); and (vi) the Thomson scattering optical depth to the cosmic microwave background (CMB). All observables are predicted with sub-percent median accuracy, with a reduction of the computational cost by a factor of over $10^4$. After validating inference results, we showcase a few applications, including: (i) quantifying the relative constraining power of different observational datasets; (ii) seeing how recent claims of a late EoR impact previous inferences; and (iii) forecasting upcoming constraints from the sixth observing season of the Hydrogen Epoch of Reionization Array (HERA) telescope. 21cmEMU is publicly available, and is included as an alternative simulator in the public 21CMMC sampler.

Submitted 11 September, 2023; originally announced September 2023.

Comments: 21 pages, 13 figures, submitted to MNRAS

arXiv:2307.07574 [pdf, other]

Sparsified Simultaneous Confidence Intervals for High-Dimensional Linear Models

Authors: Xiaorui Zhu, Yichen Qin, Peng Wang

Abstract: Statistical inference of the high-dimensional regression coefficients is challenging because the uncertainty introduced by the model selection procedure is hard to account for. A critical question remains unsettled; that is, is it possible and how to embed the inference of the model into the simultaneous inference of the coefficients? To this end, we propose a notion of simultaneous confidence intervals called the sparsified simultaneous confidence intervals. Our intervals are sparse in the sense that some of the intervals' upper and lower bounds are shrunken to zero (i.e., $[0,0]$), indicating the unimportance of the corresponding covariates. These covariates should be excluded from the final model. The rest of the intervals, either containing zero (e.g., $[-1,1]$ or $[0,1]$) or not containing zero (e.g., $[2,3]$), indicate the plausible and significant covariates, respectively. The proposed method can be coupled with various selection procedures, making it ideal for comparing their uncertainty. For the proposed method, we establish desirable asymptotic properties, develop intuitive graphical tools for visualization, and justify its superior performance through simulation and real data analysis.

Submitted 14 July, 2023; originally announced July 2023.

Comments: 26 pages, 6 figures

MSC Class: 62fxx

arXiv:2304.00691 [pdf, other]

Lithium-ion Battery Online Knee Onset Detection by Matrix Profile

Authors: Kate Qi Zhou, Yan Qin, Chau Yuen

Abstract: Lithium-ion batteries (LiBs) degrade slightly until the knee onset, after which the deterioration accelerates to end of life (EOL). The knee onset, which marks the initiation of the accelerated degradation rate, is crucial in providing an early warning of the battery's performance changes. However, there is only limited literature on online knee onset identification. Furthermore, it is good to perform such identification using easily collected measurements. To solve these challenges, an online knee onset identification method is developed by exploiting the temporal information within the discharge data. First, the temporal dynamics embedded in the discharge voltage cycles from the slight degradation stage are extracted by the dynamic time warping. Second, the anomaly is exposed by Matrix Profile during subsequence similarity search. The knee onset is detected when the temporal dynamics of the new cycle exceed the control limit and the profile index indicates a change in regime. Finally, the identified knee onset is utilized to categorize the battery into long-range or short-range categories by its strong correlation with the battery's EOL cycles. With the support of the battery categorization and the training data acquired under the same statistic distribution, the proposed SOH estimation model achieves enhanced estimation results with a root mean squared error as low as 0.22%.

Submitted 2 April, 2023; originally announced April 2023.

Journal ref: IEEE Transactions on Transportation Electrification, 2023

arXiv:2302.00814 [pdf, other]

Stochastic Contextual Bandits with Long Horizon Rewards

Authors: Yuzhen Qin, Yingcong Li, Fabio Pasqualetti, Maryam Fazel, Samet Oymak

Abstract: The growing interest in complex decision-making and language modeling problems highlights the importance of sample-efficient learning over very long horizons. This work takes a step in this direction by investigating contextual linear bandits where the current reward depends on at most $s$ prior actions and contexts (not necessarily consecutive), up to a time horizon of $h$. In order to avoid polynomial dependence on $h$, we propose new algorithms that leverage sparsity to discover the dependence pattern and arm parameters jointly. We consider both the data-poor ($T<h$) and data-rich ($T\ge h$) regimes, and derive respective regret upper bounds $\tilde O(d\sqrt{sT} +\min\{ q, T\})$ and $\tilde O(\sqrt{sdT})$, with sparsity $s$, feature dimension $d$, total time horizon $T$, and $q$ that is adaptive to the reward dependence pattern. Complementing upper bounds, we also show that learning over a single trajectory brings inherent challenges: While the dependence pattern and arm parameters form a rank-1 matrix, circulant matrices are not isometric over rank-1 manifolds and sample complexity indeed benefits from the sparse reward dependence structure. Our results necessitate a new analysis to address long-range temporal dependencies across data and avoid polynomial dependence on the reward horizon $h$. Specifically, we utilize connections to the restricted isometry property of circulant matrices formed by dependent sub-Gaussian vectors and establish new guarantees that are also of independent interest.

Submitted 3 February, 2023; v1 submitted 1 February, 2023; originally announced February 2023.

Comments: 47 pages, to appear at AAAI 2023

arXiv:2208.11204 [pdf, other]

Transfer Learning-based State of Health Estimation for Lithium-ion Battery with Cycle Synchronization

Authors: Kate Qi Zhou, Yan Qin, Chau Yuen

Abstract: Accurately estimating a battery's state of health (SOH) helps prevent battery-powered applications from failing unexpectedly. With the superiority of reducing the data requirement of model training for new batteries, transfer learning (TL) emerges as a promising machine learning approach that applies knowledge learned from a source battery, which has a large amount of data. However, the determination of whether the source battery model is reasonable and which part of information can be transferred for SOH estimation are rarely discussed, despite these being critical components of a successful TL. To address these challenges, this paper proposes an interpretable TL-based SOH estimation method by exploiting the temporal dynamic to assist transfer learning, which consists of three parts. First, with the help of dynamic time warping, the temporal data from the discharge time series are synchronized, yielding the warping path of the cycle-synchronized time series responsible for capacity degradation over cycles. Second, the canonical variates retrieved from the spatial path of the cycle-synchronized time series are used for distribution similarity analysis between the source and target batteries. Third, when the distribution similarity is within the predefined threshold, a comprehensive target SOH estimation model is constructed by transferring the common temporal dynamics from the source SOH estimation model and compensating the errors with a residual model from the target battery. Through a widely-used open-source benchmark dataset, the estimation error of the proposed method evaluated by the root mean squared error is as low as 0.0034, resulting in a 77% accuracy improvement compared with existing methods.

Submitted 23 August, 2022; originally announced August 2022.

arXiv:2108.02905 [pdf, other]

Optimal integrating learning for split questionnaire design type data

Authors: Cunjie Lin, **gfu Peng, Yichen Qin, Yang Li, Yuhong Yang

Abstract: In the era of data science, it is common to encounter data with different subsets of variables obtained for different cases. An example is the split questionnaire design (SQD), which is adopted to reduce respondent fatigue and improve response rates by assigning different subsets of the questionnaire to different sampled respondents. A general question then is how to estimate the regression function based on such block-wise observed data. Currently, this is often carried out with the aid of missing data methods, which may unfortunately suffer intensive computational cost, high variability, and possible large modeling biases in real applications. In this article, we develop a novel approach for estimating the regression function for SQD-type data. We first construct a list of candidate models using available data-blocks separately, and then combine the estimates properly to make an efficient use of all the information. We show the resulting averaged model is asymptotically optimal in the sense that the squared loss and risk are asymptotically equivalent to those of the best but infeasible averaged estimator. Both simulated examples and an application to the SQD dataset from the European Social Survey show the promise of the proposed method.

Submitted 5 August, 2021; originally announced August 2021.

arXiv:2106.12991 [pdf, other]

Relationship between pulmonary nodule malignancy and surrounding pleurae, airways and vessels: a quantitative study using the public LIDC-IDRI dataset

Authors: Yulei Qin, Yun Gu, Hanxiao Zhang, Jie Yang, Lihui Wang, Zhexin Wang, Feng Yao, Yue-Min Zhu

Abstract: To investigate whether the pleurae, airways and vessels surrounding a nodule on non-contrast computed tomography (CT) can discriminate benign and malignant pulmonary nodules. The LIDC-IDRI dataset, one of the largest publicly available CT databases, was exploited for study. A total of 1556 nodules from 694 patients were involved in statistical analysis, where nodules with average scorings <3 and >3 were respectively denoted as benign and malignant. Besides, 339 nodules from 113 patients with diagnosis ground-truth were independently evaluated. Computer algorithms were developed to segment pulmonary structures and quantify the distances to pleural surface, airways and vessels, as well as the counting number and normalized volume of airways and vessels near a nodule. Odds ratio (OR) and Chi-square (χ^2) testing were performed to demonstrate the correlation between features of surrounding structures and nodule malignancy. A non-parametric receiver operating characteristic (ROC) analysis was conducted in logistic regression to evaluate discrimination ability of each structure. For benign and malignant groups, the average distances from nodules to pleural surface, airways and vessels are respectively (6.56, 5.19), (37.08, 26.43) and (1.42, 1.07) mm. The correlation between nodules and the counting number of airways and vessels that contact or project towards nodules are respectively (OR=22.96, χ^2=105.04) and (OR=7.06, χ^2=290.11). The correlation between nodules and the volume of airways and vessels are (OR=9.19, χ^2=159.02) and (OR=2.29, χ^2=55.89). The areas-under-curves (AUCs) for pleurae, airways and vessels are respectively 0.5202, 0.6943 and 0.6529. Our results show that malignant nodules are often surrounded by more pulmonary structures compared with benign ones, suggesting that features of these structures could be viewed as lung cancer biomarkers.

Submitted 12 December, 2021; v1 submitted 24 June, 2021; originally announced June 2021.

Comments: 33 pages, 3 figures, Submitted for review

arXiv:2009.04136 [pdf, other]

doi 10.1177/09622802211008206

Testing for Treatment Effect in Covariate-Adaptive Randomized Clinical Trials with Generalized Linear Models and Omitted Covariates

Authors: Li Yang, Wei Ma, Yichen Qin, Feifang Hu

Abstract: Concerns have been expressed over the validity of statistical inference under covariate-adaptive randomization despite the extensive use in clinical trials. In the literature, the inferential properties under covariate-adaptive randomization have been mainly studied for continuous responses; in particular, it is well known that the usual two sample t-test for treatment effect is typically conservative, in the sense that the actual test size is smaller than the nominal level. This phenomenon of invalid tests has also been found for generalized linear models without adjusting for the covariates and are sometimes more worrisome due to inflated Type I error. The purpose of this study is to examine the unadjusted test for treatment effect under generalized linear models and covariate-adaptive randomization. For a large class of covariate-adaptive randomization methods, we obtain the asymptotic distribution of the test statistic under the null hypothesis and derive the conditions under which the test is conservative, valid, or anti-conservative. Several commonly used generalized linear models, such as logistic regression and Poisson regression, are discussed in detail. An adjustment method is also proposed to achieve a valid size based on the asymptotic results. Numerical studies confirm the theoretical findings and demonstrate the effectiveness of the proposed adjustment method.

Submitted 2 May, 2021; v1 submitted 9 September, 2020; originally announced September 2020.

Comments: Updated to the published version

Journal ref: Statistical Methods in Medical Research 30, no. 9 (2021): 2148-2164

arXiv:2007.08129 [pdf, other]

doi 10.1109/LSP.2020.3036348

Layer-Wise Adaptive Updating for Few-Shot Image Classification

Authors: Yunxiao Qin, Weiguo Zhang, Zezheng Wang, Chenxu Zhao, **g** Shi

Abstract: Few-shot image classification (FSIC), which requires a model to recognize new categories via learning from few images of these categories, has attracted lots of attention. Recently, meta-learning based methods have been shown as a promising direction for FSIC. Commonly, they train a meta-learner (meta-learning model) to learn easy fine-tuning weight, and when solving an FSIC task, the meta-learner efficiently fine-tunes itself to a task-specific model by updating itself on few images of the task. In this paper, we propose a novel meta-learning based layer-wise adaptive updating (LWAU) method for FSIC. LWAU is inspired by an interesting finding that compared with common deep models, the meta-learner pays much more attention to update its top layer when learning from few images. According to this finding, we assume that the meta-learner may greatly prefer updating its top layer to updating its bottom layers for better FSIC performance. Therefore, in LWAU, the meta-learner is trained to learn not only the easy fine-tuning model but also its favorite layer-wise adaptive updating rule to improve its learning efficiency. Extensive experiments show that with the layer-wise adaptive updating rule, the proposed LWAU: 1) outperforms existing few-shot classification methods with a clear margin; 2) learns from few images more efficiently by at least 5 times than existing meta-learners when solving FSIC.

Submitted 16 July, 2020; originally announced July 2020.

arXiv:2006.16375 [pdf, other]

Improving Calibration through the Relationship with Adversarial Robustness

Authors: Yao Qin, Xuezhi Wang, Alex Beutel, Ed H. Chi

Abstract: Neural networks lack adversarial robustness, i.e., they are vulnerable to adversarial examples that through small perturbations to inputs cause incorrect predictions. Further, trust is undermined when models give miscalibrated predictions, i.e., the predicted probability is not a good indicator of how much we should trust our model. In this paper, we study the connection between adversarial robustness and calibration and find that the inputs for which the model is sensitive to small perturbations (are easily attacked) are more likely to have poorly calibrated predictions. Based on this insight, we examine if calibration can be improved by addressing those adversarially unrobust inputs. To this end, we propose Adversarial Robustness based Adaptive Label Smoothing (AR-AdaLS) that integrates the correlations of adversarial robustness and calibration into training by adaptively softening labels for an example based on how easily it can be attacked by an adversary. We find that our method, taking the adversarial robustness of the in-distribution data into consideration, leads to better calibration over the model even under distributional shifts. In addition, AR-AdaLS can also be applied to an ensemble model to further improve model calibration.

Submitted 14 December, 2021; v1 submitted 29 June, 2020; originally announced June 2020.

Comments: Published at NeurIPS-2021

arXiv:2002.07405 [pdf, other]

Deflecting Adversarial Attacks

Authors: Yao Qin, Nicholas Frosst, Colin Raffel, Garrison Cottrell, Geoffrey Hinton

Abstract: There has been an ongoing cycle where stronger defenses against adversarial attacks are subsequently broken by a more advanced defense-aware attack. We present a new approach towards ending this cycle where we "deflect" adversarial attacks by causing the attacker to produce an input that semantically resembles the attack's target class. To this end, we first propose a stronger defense based on Capsule Networks that combines three detection mechanisms to achieve state-of-the-art detection performance on both standard and defense-aware attacks. We then show that undetected attacks against our defense often perceptually resemble the adversarial target class by performing a human study where participants are asked to label images produced by the attack. These attack images can no longer be called "adversarial" because our network classifies them the same way as humans do.

Submitted 18 February, 2020; originally announced February 2020.

arXiv:1911.11922 [pdf, other]

LqRT: Robust Hypothesis Testing of Location Parameters using Lq-Likelihood-Ratio-Type Test in Python

Authors: Anton Alyakin, Yichen Qin, Carey E. Priebe

Abstract: A t-test is considered a standard procedure for inference on population means and is widely used in scientific discovery. However, as a special case of a likelihood-ratio test, t-test often shows drastic performance degradation due to the deviations from its hard-to-verify distributional assumptions. Alternatively, in this article, we propose a new two-sample Lq-likelihood-ratio-type test (LqRT) along with an easy-to-use Python package for implementation. LqRT preserves high power when the distributional assumption is violated, and maintains the satisfactory performance when the assumption is valid. As numerical studies suggest, LqRT dominates many other robust tests in power, such as Wilcoxon test and sign test, while maintaining a valid size. To the extent that the robustness of the Wilcoxon test (minimum asymptotic relative efficiency (ARE) of the Wilcoxon test vs the t-test is 0.864) suggests that the Wilcoxon test should be the default test of choice (rather than "use Wilcoxon if there is evidence of non-normality", the default position should be "use Wilcoxon unless there is good reason to believe the normality assumption"), the results in this article suggest that the LqRT is potentially the new default go-to test for practitioners.

Submitted 26 November, 2019; originally announced November 2019.

arXiv:1907.02957 [pdf, other]

Detecting and Diagnosing Adversarial Images with Class-Conditional Capsule Reconstructions

Authors: Yao Qin, Nicholas Frosst, Sara Sabour, Colin Raffel, Garrison Cottrell, Geoffrey Hinton

Abstract: Adversarial examples raise questions about whether neural network models are sensitive to the same visual features as humans. In this paper, we first detect adversarial examples or otherwise corrupted images based on a class-conditional reconstruction of the input. To specifically attack our detection mechanism, we propose the Reconstructive Attack which seeks both to cause a misclassification and a low reconstruction error. This reconstructive attack produces undetected adversarial examples but with much smaller success rate. Among all these attacks, we find that CapsNets always perform better than convolutional networks. Then, we diagnose the adversarial examples for CapsNets and find that the success of the reconstructive attack is highly related to the visual similarity between the source and target class. Additionally, the resulting perturbations can cause the input image to appear visually more like the target class and hence become non-adversarial. This suggests that CapsNets use features that are more aligned with human perception and have the potential to address the central issue raised by adversarial examples.

Submitted 18 February, 2020; v1 submitted 5 July, 2019; originally announced July 2019.

Journal ref: ICLR 2020

arXiv:1905.10681 [pdf, other]

Composing Task-Agnostic Policies with Deep Reinforcement Learning

Authors: Ahmed H. Qureshi, Jacob J. Johnson, Yuzhe Qin, Taylor Henderson, Byron Boots, Michael C. Yip

Abstract: The composition of elementary behaviors to solve challenging transfer learning problems is one of the key elements in building intelligent machines. To date, there has been plenty of work on learning task-specific policies or skills but almost no focus on composing necessary, task-agnostic skills to find a solution to new problems. In this paper, we propose a novel deep reinforcement learning-based skill transfer and composition method that takes the agent's primitive policies to solve unseen tasks. We evaluate our method in difficult cases where training policy through standard reinforcement learning (RL) or even hierarchical RL is either not feasible or exhibits high sample complexity. We show that our method not only transfers skills to new problem settings but also solves the challenging environments requiring both task planning and motion control with high data efficiency.

Submitted 30 December, 2019; v1 submitted 25 May, 2019; originally announced May 2019.

Comments: ICLR 2020

arXiv:1905.06010 [pdf, other]

Automatic Model Selection for Neural Networks

Authors: David Laredo, Yulin Qin, Oliver Schütze, Jian-Qiao Sun

Abstract: Neural networks and deep learning are changing the way that artificial intelligence is being done. Efficiently choosing a suitable network architecture and fine-tune its hyper-parameters for a specific dataset is a time-consuming task given the staggering number of possible alternatives. In this paper, we address the problem of model selection by means of a fully automated framework for efficiently selecting a neural network model for a given task: classification or regression. The algorithm, named Automatic Model Selection, is a modified micro-genetic algorithm that automatically and efficiently finds the most suitable neural network model for a given dataset. The main contributions of this method are a simple list based encoding for neural networks as genotypes in an evolutionary algorithm, new crossover, and mutation operators, the introduction of a fitness function that considers both, the accuracy of the model and its complexity and a method to measure the similarity between two neural networks. AMS is evaluated on two different datasets. By comparing some models obtained with AMS to state-of-the-art models for each dataset we show that AMS can automatically find efficient neural network models. Furthermore, AMS is computationally efficient and can make use of distributed computing paradigms to further boost its performance.

Submitted 15 May, 2019; originally announced May 2019.

Comments: 31 pages, 6 figures. Preprint Submitted to Elsevier Neural Networks

arXiv:1905.03611 [pdf, other]

Effect of E-cigarette Use and Social Network on Smoking Behavior Change: An agent-based model of E-cigarette and Cigarette Interaction

Authors: Yang Qin, Rojiemiahd Edjoc, Nathaniel D Osgood

Abstract: Despite a general reduction in smoking in many areas of the developed world, it remains one of the biggest public health threats. As an alternative to tobacco, the use of electronic cigarettes (ECig) has been increased dramatically over the last decade. ECig use is hypothesized to impact smoking behavior through several pathways, not only as a means of quitting cigarettes and lowering risk of relapse, but also as both an alternative nicotine delivery device to cigarettes, as a visible use of nicotine that can lead to imitative behavior in the form of smoking, and as a gateway nicotine delivery technology that can build high levels of nicotine tolerance and pave the way for initiation of smoking. Evidence regarding the effect of ECig use on smoking behavior change remains inconclusive. To address these challenges, we built an agent-based model (ABM) of smoking and ECig use to examine the effects of ECig use on smoking behavior change. The impact of social network (SN) on the initiation of smoking and ECig use were also explored. Findings from the simulation suggest that the use of ECig generates substantially lower prevalence of current smoker (PCS), which demonstrates the potential for reducing smoking and lowering the risk of relapse. The effects of proximity-based influences within SN increases the prevalence of current ECig user (PCEU).
The model also suggests the importance of improved understanding of drivers in cessation and relapse in ECig use, in light of findings that such aspects of behavior change may notably influence smoking behavior change and burden.

Submitted 2 May, 2019; originally announced May 2019.

Comments: 10 pages, SBP-BRiMS 2019

arXiv:1903.10346 [pdf, other]

Imperceptible, Robust, and Targeted Adversarial Examples for Automatic Speech Recognition

Authors: Yao Qin, Nicholas Carlini, Ian Goodfellow, Garrison Cottrell, Colin Raffel

Abstract: Adversarial examples are inputs to machine learning models designed by an adversary to cause an incorrect output. So far, adversarial examples have been studied most extensively in the image domain. In this domain, adversarial examples can be constructed by imperceptibly modifying images to cause misclassification, and are practical in the physical world. In contrast, current targeted adversarial examples applied to speech recognition systems have neither of these properties: humans can easily identify the adversarial perturbations, and they are not effective when played over-the-air. This paper makes advances on both of these fronts. First, we develop effectively imperceptible audio adversarial examples (verified through a human study) by leveraging the psychoacoustic principle of auditory masking, while retaining 100% targeted success rate on arbitrary full-sentence targets. Next, we make progress towards physical-world over-the-air audio adversarial examples by constructing perturbations which remain effective even after applying realistic simulated environmental distortions.

Submitted 7 June, 2019; v1 submitted 22 March, 2019; originally announced March 2019.

Comments: International Conference on Machine Learning (ICML), 2019

arXiv:1808.08793 [pdf, ps, other]

Empirical likelihood for linear models with spatial errors

Authors: Yongsong Qin

Abstract: For linear models with spatial errors, the empirical likelihood ratio statistics are constructed for the parameters of the models. It is shown that the limiting distributions of the empirical likelihood ratio statistics are chi-squared distributions, which are used to construct confidence regions for the parameters of the models.

Submitted 27 August, 2018; originally announced August 2018.

arXiv:1802.06048 [pdf, other]

High-dimensional covariance matrix estimation using a low-rank and diagonal decomposition

Authors: Yilei Wu, Yingli Qin, Mu Zhu

Abstract: We study high-dimensional covariance/precision matrix estimation under the assumption that the covariance/precision matrix can be decomposed into a low-rank component L and a diagonal component D. The rank of L can either be chosen to be small or controlled by a penalty function. Under moderate conditions on the population covariance/precision matrix itself and on the penalty function, we prove some consistency results for our estimators. A blockwise coordinate descent algorithm, which iteratively updates L and D, is then proposed to obtain the estimator in practice. Finally, various numerical experiments are presented: using simulated data, we show that our estimator performs quite well in terms of the Kullback-Leibler loss; using stock return data, we show that our method can be applied to obtain enhanced solutions to the Markowitz portfolio selection problem.

Submitted 16 February, 2018; originally announced February 2018.

arXiv:1709.05454 [pdf, other]

Statistical inference on random dot product graphs: a survey

Authors: Avanti Athreya, Donniell E. Fishkind, Keith Levin, Vince Lyzinski, Youngser Park, Yichen Qin, Daniel L. Sussman, Minh Tang, Joshua T. Vogelstein, Carey E. Priebe

Abstract: The random dot product graph (RDPG) is an independent-edge random graph that is analytically tractable and, simultaneously, either encompasses or can successfully approximate a wide range of random graphs, from relatively simple stochastic block models to complex latent position graphs. In this survey paper, we describe a comprehensive paradigm for statistical inference on random dot product graphs, a paradigm centered on spectral embeddings of adjacency and Laplacian matrices. We examine the analogues, in graph inference, of several canonical tenets of classical Euclidean inference: in particular, we summarize a body of existing results on the consistency and asymptotic normality of the adjacency and Laplacian spectral embeddings, and the role these spectral embeddings can play in the construction of single- and multi-sample hypothesis tests for graph data. We investigate several real-world applications, including community detection and classification in large social networks and the determination of functional and biologically relevant network properties from an exploratory data analysis of the Drosophila connectome. We outline requisite background and current open problems in spectral graph inference.

Submitted 16 September, 2017; originally announced September 2017.

Comments: An expository survey paper on a comprehensive paradigm for inference for random dot product graphs, centered on graph adjacency and Laplacian spectral embeddings. Paper outlines requisite background; summarizes theory, methodology, and applications from previous and ongoing work; and closes with a discussion of several open problems

MSC Class: 62FXX; 62GXX; 62HXX; 05CXX

Journal ref: Journal of Machine Learning Research, 2018

arXiv:1708.05439 [pdf, other]

Penalized Maximum Tangent Likelihood Estimation and Robust Variable Selection

Authors: Yichen Qin, Shaobo Li, Yang Li, Yan Yu

Abstract: We introduce a new class of mean regression estimators -- penalized maximum tangent likelihood estimation -- for high-dimensional regression estimation and variable selection. We first explain the motivations for the key ingredient, maximum tangent likelihood estimation (MTE), and establish its asymptotic properties. We further propose a penalized MTE for variable selection and show that it is $\sqrt{n}$-consistent, enjoys the oracle property. The proposed class of estimators consists penalized $\ell_2$ distance, penalized exponential squared loss, penalized least trimmed square and penalized least square as special cases and can be regarded as a mixture of minimum Kullback-Leibler distance estimation and minimum $\ell_2$ distance estimation. Furthermore, we consider the proposed class of estimators under the high-dimensional setting when the number of variables $d$ can grow exponentially with the sample size $n$, and show that the entire class of estimators (including the aforementioned special cases) can achieve the optimal rate of convergence in the order of $\sqrt{\ln(d)/n}$. Finally, simulation studies and real data analysis demonstrate the advantages of the penalized MTE.

Submitted 21 August, 2017; v1 submitted 17 August, 2017; originally announced August 2017.

Comments: 30 pages, 3 figures

arXiv:1705.03297 [pdf, other]

Semiparametric spectral modeling of the Drosophila connectome

Authors: Carey E. Priebe, Youngser Park, Minh Tang, Avanti Athreya, Vince Lyzinski, Joshua T. Vogelstein, Yichen Qin, Ben Cocanougher, Katharina Eichler, Marta Zlatic, Albert Cardona

Abstract: We present semiparametric spectral modeling of the complete larval Drosophila mushroom body connectome. Motivated by a thorough exploratory data analysis of the network via Gaussian mixture modeling (GMM) in the adjacency spectral embedding (ASE) representation space, we introduce the latent structure model (LSM) for network modeling and inference. LSM is a generalization of the stochastic block model (SBM) and a special case of the random dot product graph (RDPG) latent position model, and is amenable to semiparametric GMM in the ASE representation space. The resulting connectome code derived via semiparametric GMM composed with ASE captures latent connectome structure and elucidates biologically relevant neuronal properties.

Submitted 9 May, 2017; originally announced May 2017.

arXiv:1704.02971 [pdf, other]

A Dual-Stage Attention-Based Recurrent Neural Network for Time Series Prediction

Authors: Yao Qin, Dongjin Song, Haifeng Chen, Wei Cheng, Guofei Jiang, Garrison Cottrell

Abstract: The Nonlinear autoregressive exogenous (NARX) model, which predicts the current value of a time series based upon its previous values as well as the current and past values of multiple driving (exogenous) series, has been studied for decades. Despite the fact that various NARX models have been developed, few of them can capture the long-term temporal dependencies appropriately and select the relevant driving series to make predictions. In this paper, we propose a dual-stage attention-based recurrent neural network (DA-RNN) to address these two issues. In the first stage, we introduce an input attention mechanism to adaptively extract relevant driving series (a.k.a., input features) at each time step by referring to the previous encoder hidden state. In the second stage, we use a temporal attention mechanism to select relevant encoder hidden states across all time steps. With this dual-stage attention scheme, our model can not only make predictions effectively, but can also be easily interpreted. Thorough empirical studies based upon the SML 2010 dataset and the NASDAQ 100 Stock dataset demonstrate that the DA-RNN can outperform state-of-the-art methods for time series prediction.

Submitted 14 August, 2017; v1 submitted 7 April, 2017; originally announced April 2017.

Comments: International Joint Conference on Artificial Intelligence (IJCAI), 2017

arXiv:1612.06968 [pdf, other]

doi 10.4310/SII.2020.v13.n1.a9

Copula Modeling for Data with Ties

Authors: Yan Li, Yang Li, Yichen Qin, Jun Yan

Abstract: Copula modeling has gained much attention in many fields recently with the advantage of separating dependence structure from marginal distributions. In real data, however, serious ties are often present in one or multiple margins, which cause problems to many rank-based statistical methods developed under the assumption of continuous data with no ties. Simple methods such as breaking the ties at random or using average rank introduce independence into the data and, hence, lead to biased estimation. We propose an estimation method that treats the ranks of tied data as being interval censored and maximizes a pseudo-likelihood based on interval censored pseudo-observations. A parametric bootstrap procedure that preserves the observed tied ranks in the data is adapted to assess the estimation uncertainty and perform goodness-of-fit tests. The proposed approach is shown to be very competitive in comparison to the simple treatments in a large scale simulation study. Application to a bivariate insurance data illustrates the methodology.

Submitted 20 December, 2016; originally announced December 2016.

Journal ref: Statistics and Its Interfaces: 2020

arXiv:1612.01801 [pdf, other]

Variable Selection with Scalable Bootstrap in Generalized Linear Model for Massive Data

Authors: Zhibing He, Yichen Qin, Ben-Chang Shia, Yang Li

Abstract: Bootstrap is commonly used as a tool for non-parametric statistical inference to estimate meaningful parameters in Variable Selection Models. However, for massive dataset that has exponential growth rate, the computation of Bootstrap Variable Selection (BootVS) can be a crucial issue. In this paper, we propose the method of Variable Selection with Bag of Little Bootstraps (BLBVS) on General Linear Regression and extend it to Generalized Linear Model for selecting important parameters and assessing the quality of estimators' computation efficiency by analyzing results of multiple bootstrap sub-samples. The introduced method best suits large datasets which have parallel and distributed computing structures. To test the performance of BLBVS, we compare it with BootVS from different aspects via empirical studies. The results of simulations show our method has excellent performance. A real data analysis, Risk Forecast of Credit Cards, is also presented to illustrate the computational superiority of BLBVS on large scale datasets, and the result demonstrates the usefulness and validity of our proposed method.

Submitted 23 December, 2016; v1 submitted 6 December, 2016; originally announced December 2016.

arXiv:1611.09509 [pdf, other]

Model Confidence Bounds for Variable Selection

Authors: Yang Li, Yuetian Luo, Davide Ferrari, Xiaonan Hu, Yichen Qin

Abstract: In this article, we introduce the concept of model confidence bounds (MCB) for variable selection in the context of nested models. Similarly to the endpoints in the familiar confidence interval for parameter estimation, the MCB identifies two nested models (upper and lower confidence bound models) containing the true model at a given level of confidence. Instead of trusting a single selected model obtained from a given model selection method, the MCB proposes a group of nested models as candidates and the MCB's width and composition enable the practitioner to assess the overall model selection uncertainty. A new graphical tool --- the model uncertainty curve (MUC) --- is introduced to visualize the variability of model selection and to compare different model selection procedures. The MCB methodology is implemented by a fast bootstrap algorithm that is shown to yield the correct asymptotic coverage under rather general conditions. Our Monte Carlo simulations and real data examples confirm the validity and illustrate the advantages of the proposed method.

Submitted 26 July, 2018; v1 submitted 29 November, 2016; originally announced November 2016.

arXiv:1611.02802 [pdf, other]

Pairwise Sequential Randomization and Its Properties

Authors: Yichen Qin, Yang Li, Wei Ma, Feifang Hu

Abstract: In comparative studies, such as in causal inference and clinical trials, balancing important covariates is often one of the most important concerns for both efficient and credible comparison. However, chance imbalance still exists in many randomized experiments. This phenomenon of covariate imbalance becomes much more serious as the number of covariates $p$ increases. To address this issue, we introduce a new randomization procedure, called pairwise sequential randomization (PSR). The proposed method allocates the units sequentially and adaptively, using information on the current level of imbalance and the incoming unit's covariate. With a large number of covariates or a large number of units, the proposed method shows substantial advantages over the traditional methods in terms of the covariate balance, estimation accuracy, and computational time, making it an ideal technique in the era of big data. The proposed method attains the optimal covariate balance, in the sense that the estimated treatment effect under the proposed method attains its minimum variance asymptotically. Also the proposed method is widely applicable in both causal inference and clinical trials. Numerical studies and real data analysis provide further evidence of the advantages of the proposed method.

Submitted 26 July, 2018; v1 submitted 8 November, 2016; originally announced November 2016.

arXiv:1310.7278 [pdf, other]

Robust Hypothesis Testing via Lq-Likelihood

Authors: Yichen Qin, Carey E. Priebe

Abstract: This article introduces a robust hypothesis testing procedure: the Lq-likelihood-ratio-type test (LqRT). By deriving the asymptotic distribution of this test statistic, the authors demonstrate its robustness both analytically and numerically, and they investigate the properties of both its influence function and its breakdown point. A proposed method to select the tuning parameter q offers a good efficiency/robustness trade-off, compared with the traditional likelihood ratio test (LRT) and other robust tests. A simulation and real data analysis provides further evidence of the advantages of the proposed LqRT method. In particular, for the special case of testing the location parameter in the presence of gross error contamination, the LqRT dominates the Wilcoxon-Mann-Whitney test and the sign test at various levels of contamination.

Submitted 24 September, 2016; v1 submitted 27 October, 2013; originally announced October 2013.

Comments: 32 pages, 11 figures

arXiv:1302.0355 [pdf, other]

Estimation of the population spectral distribution from a large dimensional sample covariance matrix

Authors: Weiming Li, Jiaqi Chen, Yingli Qin, Jianfeng Yao, Zhidong Bai

Abstract: This paper introduces a new method to estimate the spectral distribution of a population covariance matrix from high-dimensional data. The method is founded on a meaningful generalization of the seminal Marcenko-Pastur equation, originally defined in the complex plane, to the real line. Beyond its easy implementation and the established asymptotic consistency, the new estimator outperforms two existing estimators from the literature in almost all the situations tested in a simulation experiment. An application to the analysis of the correlation matrix of S&P stocks data is also given.

Submitted 2 February, 2013; originally announced February 2013.

Comments: 16 pages, 4 figures

Showing 1–35 of 35 results for author: Qin, Y