
Showing 1–35 of 35 results for author: Qin, Y

  1. arXiv:2406.00317

    stat.ML cs.LG stat.ME

    Combining Experimental and Historical Data for Policy Evaluation

    Authors: Ting Li, Chengchun Shi, Qianglin Wen, Yang Sui, Yongli Qin, Chunbo Lai, Hongtu Zhu

    Abstract: This paper studies policy evaluation with multiple data sources, especially in scenarios that involve one experimental dataset with two arms, complemented by a historical dataset generated under a single control arm. We propose novel data integration methods that linearly integrate base policy value estimators constructed based on the experimental and historical data, with weights optimized to min…

    Submitted 1 June, 2024; originally announced June 2024.

  2. arXiv:2402.04828

    econ.EM stat.AP

    What drives the European carbon market? Macroeconomic factors and forecasts

    Authors: Andrea Bastianin, Elisabetta Mirto, Yan Qin, Luca Rossini

    Abstract: Putting a price on carbon -- with taxes or developing carbon markets -- is a widely used policy measure to achieve the target of net-zero emissions by 2050. This paper tackles the issue of producing point, direction-of-change, and density forecasts for the monthly real price of carbon within the EU Emissions Trading Scheme (EU ETS). We aim to uncover supply- and demand-side forces that can contrib…

    Submitted 20 February, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: The Supplementary Material is available upon request to the authors

  3. arXiv:2401.14142

    cs.CV cs.AI cs.LG stat.ML

    Energy-Based Concept Bottleneck Models: Unifying Prediction, Concept Intervention, and Probabilistic Interpretations

    Authors: Xinyue Xu, Yi Qin, Lu Mi, Hao Wang, Xiaomeng Li

    Abstract: Existing methods, such as concept bottleneck models (CBMs), have been successful in providing concept-based interpretations for black-box deep learning models. They typically work by predicting concepts given the input and then predicting the final class label given the predicted concepts. However, (1) they often fail to capture the high-order, nonlinear interaction between concepts, e.g., correct…

    Submitted 26 February, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: Accepted by ICLR 2024

  4. arXiv:2312.12022

    stat.ML cs.LG

    LightGCNet: A Lightweight Geometric Constructive Neural Network for Data-Driven Soft sensors

    Authors: Jing Nan, Yan Qin, Wei Dai, Chau Yuen

    Abstract: Data-driven soft sensors provide a potentially cost-effective and more accurate modeling approach to measure difficult-to-measure indices in industrial processes compared to mechanistic approaches. Artificial intelligence (AI) techniques, such as deep learning, have become a popular soft sensors modeling approach in the area of machine learning and big data. However, soft sensors models based deep…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: arXiv admin note: text overlap with arXiv:2307.00185

  5. arXiv:2312.11393

    stat.ME

    Assessing Estimation Uncertainty under Model Misspecification

    Authors: Rong Li, Yichen Qin, Yang Li

    Abstract: Model misspecification is ubiquitous in data analysis because the data-generating process is often complex and mathematically intractable. Therefore, assessing estimation uncertainty and conducting statistical inference under a possibly misspecified working model is unavoidable. In such a case, classical methods such as bootstrap and asymptotic theory-based inference frequently fail since they rel…

    Submitted 18 December, 2023; originally announced December 2023.

  6. arXiv:2312.07727

    stat.ME math.ST

    Two-sample inference for sparse functional data

    Authors: Chi Zhang, Peijun Sang, Yingli Qin

    Abstract: We propose a novel test procedure for comparing mean functions across two groups within the reproducing kernel Hilbert space (RKHS) framework. Our proposed method is adept at handling sparsely and irregularly sampled functional data when observation times are random for each subject. Conventional approaches, which are built upon functional principal components analysis, usually assume a homogeneou…

    Submitted 29 December, 2023; v1 submitted 12 December, 2023; originally announced December 2023.

  7. arXiv:2309.05697

    astro-ph.CO astro-ph.GA stat.ML

    21cmEMU: an emulator of 21cmFAST summary observables

    Authors: Daniela Breitman, Andrei Mesinger, Steven Murray, David Prelogovic, Yuxiang Qin, Roberto Trotta

    Abstract: Recent years have witnessed rapid progress in observations of the Epoch of Reionization (EoR). These have enabled high-dimensional inference of galaxy and intergalactic medium (IGM) properties during the first billion years of our Universe. However, even using efficient, semi-numerical simulations, traditional inference approaches that compute 3D lightcones on-the-fly can take $10^5$ core hours. H…

    Submitted 11 September, 2023; originally announced September 2023.

    Comments: 21 pages, 13 figures, submitted to MNRAS

  8. arXiv:2307.07574

    stat.ME econ.EM stat.ML

    Sparsified Simultaneous Confidence Intervals for High-Dimensional Linear Models

    Authors: Xiaorui Zhu, Yichen Qin, Peng Wang

    Abstract: Statistical inference of the high-dimensional regression coefficients is challenging because the uncertainty introduced by the model selection procedure is hard to account for. A critical question remains unsettled; that is, is it possible and how to embed the inference of the model into the simultaneous inference of the coefficients? To this end, we propose a notion of simultaneous confidence int…

    Submitted 14 July, 2023; originally announced July 2023.

    Comments: 26 pages, 6 figures

    MSC Class: 62fxx

  9. arXiv:2304.00691

    stat.ML cs.LG

    Lithium-ion Battery Online Knee Onset Detection by Matrix Profile

    Authors: Kate Qi Zhou, Yan Qin, Chau Yuen

    Abstract: Lithium-ion batteries (LiBs) degrade slightly until the knee onset, after which the deterioration accelerates to end of life (EOL). The knee onset, which marks the initiation of the accelerated degradation rate, is crucial in providing an early warning of the battery's performance changes. However, there is only limited literature on online knee onset identification. Furthermore, it is good to per…

    Submitted 2 April, 2023; originally announced April 2023.

    Journal ref: IEEE Transactions on Transportation Electrification, 2023

  10. arXiv:2302.00814

    cs.LG cs.AI stat.ML

    Stochastic Contextual Bandits with Long Horizon Rewards

    Authors: Yuzhen Qin, Yingcong Li, Fabio Pasqualetti, Maryam Fazel, Samet Oymak

    Abstract: The growing interest in complex decision-making and language modeling problems highlights the importance of sample-efficient learning over very long horizons. This work takes a step in this direction by investigating contextual linear bandits where the current reward depends on at most $s$ prior actions and contexts (not necessarily consecutive), up to a time horizon of $h$. In order to avoid poly…

    Submitted 3 February, 2023; v1 submitted 1 February, 2023; originally announced February 2023.

    Comments: 47 pages, to appear at AAAI 2023

  11. arXiv:2208.11204

    cs.LG stat.ML

    Transfer Learning-based State of Health Estimation for Lithium-ion Battery with Cycle Synchronization

    Authors: Kate Qi Zhou, Yan Qin, Chau Yuen

    Abstract: Accurately estimating a battery's state of health (SOH) helps prevent battery-powered applications from failing unexpectedly. With the superiority of reducing the data requirement of model training for new batteries, transfer learning (TL) emerges as a promising machine learning approach that applies knowledge learned from a source battery, which has a large amount of data. However, the determinat…

    Submitted 23 August, 2022; originally announced August 2022.

  12. arXiv:2108.02905

    stat.ME stat.AP

    Optimal integrating learning for split questionnaire design type data

    Authors: Cunjie Lin, Jingfu Peng, Yichen Qin, Yang Li, Yuhong Yang

    Abstract: In the era of data science, it is common to encounter data with different subsets of variables obtained for different cases. An example is the split questionnaire design (SQD), which is adopted to reduce respondent fatigue and improve response rates by assigning different subsets of the questionnaire to different sampled respondents. A general question then is how to estimate the regression functi…

    Submitted 5 August, 2021; originally announced August 2021.

  13. arXiv:2106.12991

    cs.CV eess.IV physics.med-ph stat.AP

    Relationship between pulmonary nodule malignancy and surrounding pleurae, airways and vessels: a quantitative study using the public LIDC-IDRI dataset

    Authors: Yulei Qin, Yun Gu, Hanxiao Zhang, Jie Yang, Lihui Wang, Zhexin Wang, Feng Yao, Yue-Min Zhu

    Abstract: To investigate whether the pleurae, airways and vessels surrounding a nodule on non-contrast computed tomography (CT) can discriminate benign and malignant pulmonary nodules. The LIDC-IDRI dataset, one of the largest publicly available CT databases, was exploited for study. A total of 1556 nodules from 694 patients were involved in statistical analysis, where nodules with average scorings <3 and >3…

    Submitted 12 December, 2021; v1 submitted 24 June, 2021; originally announced June 2021.

    Comments: 33 pages, 3 figures, Submitted for review

  14. Testing for Treatment Effect in Covariate-Adaptive Randomized Clinical Trials with Generalized Linear Models and Omitted Covariates

    Authors: Li Yang, Wei Ma, Yichen Qin, Feifang Hu

    Abstract: Concerns have been expressed over the validity of statistical inference under covariate-adaptive randomization despite the extensive use in clinical trials. In the literature, the inferential properties under covariate-adaptive randomization have been mainly studied for continuous responses; in particular, it is well known that the usual two sample t-test for treatment effect is typically conserva…

    Submitted 2 May, 2021; v1 submitted 9 September, 2020; originally announced September 2020.

    Comments: Updated to the published version

    Journal ref: Statistical Methods in Medical Research 30, no. 9 (2021): 2148-2164

  15. arXiv:2007.08129

    cs.CV cs.LG eess.IV stat.ML

    Layer-Wise Adaptive Updating for Few-Shot Image Classification

    Authors: Yunxiao Qin, Weiguo Zhang, Zezheng Wang, Chenxu Zhao, Jingping Shi

    Abstract: Few-shot image classification (FSIC), which requires a model to recognize new categories via learning from few images of these categories, has attracted lots of attention. Recently, meta-learning based methods have been shown as a promising direction for FSIC. Commonly, they train a meta-learner (meta-learning model) to learn easy fine-tuning weight, and when solving an FSIC task, the meta-learner…

    Submitted 16 July, 2020; originally announced July 2020.

  16. arXiv:2006.16375

    cs.LG stat.ML

    Improving Calibration through the Relationship with Adversarial Robustness

    Authors: Yao Qin, Xuezhi Wang, Alex Beutel, Ed H. Chi

    Abstract: Neural networks lack adversarial robustness, i.e., they are vulnerable to adversarial examples that through small perturbations to inputs cause incorrect predictions. Further, trust is undermined when models give miscalibrated predictions, i.e., the predicted probability is not a good indicator of how much we should trust our model. In this paper, we study the connection between adversarial robust…

    Submitted 14 December, 2021; v1 submitted 29 June, 2020; originally announced June 2020.

    Comments: Published at NeurIPS-2021

  17. arXiv:2002.07405

    cs.LG cs.CV stat.ML

    Deflecting Adversarial Attacks

    Authors: Yao Qin, Nicholas Frosst, Colin Raffel, Garrison Cottrell, Geoffrey Hinton

    Abstract: There has been an ongoing cycle where stronger defenses against adversarial attacks are subsequently broken by a more advanced defense-aware attack. We present a new approach towards ending this cycle where we "deflect" adversarial attacks by causing the attacker to produce an input that semantically resembles the attack's target class. To this end, we first propose a stronger defense based on Ca…

    Submitted 18 February, 2020; originally announced February 2020.

  18. arXiv:1911.11922

    stat.ME

    LqRT: Robust Hypothesis Testing of Location Parameters using Lq-Likelihood-Ratio-Type Test in Python

    Authors: Anton Alyakin, Yichen Qin, Carey E. Priebe

    Abstract: A t-test is considered a standard procedure for inference on population means and is widely used in scientific discovery. However, as a special case of a likelihood-ratio test, t-test often shows drastic performance degradation due to the deviations from its hard-to-verify distributional assumptions. Alternatively, in this article, we propose a new two-sample Lq-likelihood-ratio-type test (LqRT) a…

    Submitted 26 November, 2019; originally announced November 2019.

  19. arXiv:1907.02957

    cs.LG cs.CR cs.CV stat.ML

    Detecting and Diagnosing Adversarial Images with Class-Conditional Capsule Reconstructions

    Authors: Yao Qin, Nicholas Frosst, Sara Sabour, Colin Raffel, Garrison Cottrell, Geoffrey Hinton

    Abstract: Adversarial examples raise questions about whether neural network models are sensitive to the same visual features as humans. In this paper, we first detect adversarial examples or otherwise corrupted images based on a class-conditional reconstruction of the input. To specifically attack our detection mechanism, we propose the Reconstructive Attack which seeks both to cause a misclassification and…

    Submitted 18 February, 2020; v1 submitted 5 July, 2019; originally announced July 2019.

    Journal ref: ICLR 2020

  20. arXiv:1905.10681

    cs.LG cs.AI cs.RO stat.ML

    Composing Task-Agnostic Policies with Deep Reinforcement Learning

    Authors: Ahmed H. Qureshi, Jacob J. Johnson, Yuzhe Qin, Taylor Henderson, Byron Boots, Michael C. Yip

    Abstract: The composition of elementary behaviors to solve challenging transfer learning problems is one of the key elements in building intelligent machines. To date, there has been plenty of work on learning task-specific policies or skills but almost no focus on composing necessary, task-agnostic skills to find a solution to new problems. In this paper, we propose a novel deep reinforcement learning-base…

    Submitted 30 December, 2019; v1 submitted 25 May, 2019; originally announced May 2019.

    Comments: ICLR 2020

  21. arXiv:1905.06010

    cs.LG stat.ML

    Automatic Model Selection for Neural Networks

    Authors: David Laredo, Yulin Qin, Oliver Schütze, Jian-Qiao Sun

    Abstract: Neural networks and deep learning are changing the way that artificial intelligence is being done. Efficiently choosing a suitable network architecture and fine-tuning its hyper-parameters for a specific dataset is a time-consuming task given the staggering number of possible alternatives. In this paper, we address the problem of model selection by means of a fully automated framework for efficientl…

    Submitted 15 May, 2019; originally announced May 2019.

    Comments: 31 pages, 6 figures. Preprint Submitted to Elsevier Neural Networks

  22. arXiv:1905.03611

    q-bio.OT stat.AP

    Effect of E-cigarette Use and Social Network on Smoking Behavior Change: An agent-based model of E-cigarette and Cigarette Interaction

    Authors: Yang Qin, Rojiemiahd Edjoc, Nathaniel D Osgood

    Abstract: Despite a general reduction in smoking in many areas of the developed world, it remains one of the biggest public health threats. As an alternative to tobacco, the use of electronic cigarettes (ECig) has increased dramatically over the last decade. ECig use is hypothesized to impact smoking behavior through several pathways, not only as a means of quitting cigarettes and lowering risk of rela…

    Submitted 2 May, 2019; originally announced May 2019.

    Comments: 10 pages, SBP-BRiMS 2019

  23. arXiv:1903.10346

    eess.AS cs.LG cs.SD stat.ML

    Imperceptible, Robust, and Targeted Adversarial Examples for Automatic Speech Recognition

    Authors: Yao Qin, Nicholas Carlini, Ian Goodfellow, Garrison Cottrell, Colin Raffel

    Abstract: Adversarial examples are inputs to machine learning models designed by an adversary to cause an incorrect output. So far, adversarial examples have been studied most extensively in the image domain. In this domain, adversarial examples can be constructed by imperceptibly modifying images to cause misclassification, and are practical in the physical world. In contrast, current targeted adversarial…

    Submitted 7 June, 2019; v1 submitted 22 March, 2019; originally announced March 2019.

    Comments: International Conference on Machine Learning (ICML), 2019

  24. arXiv:1808.08793

    stat.ME

    Empirical likelihood for linear models with spatial errors

    Authors: Yongsong Qin

    Abstract: For linear models with spatial errors, the empirical likelihood ratio statistics are constructed for the parameters of the models. It is shown that the limiting distributions of the empirical likelihood ratio statistics are chi-squared distributions, which are used to construct confidence regions for the parameters of the models.

    Submitted 27 August, 2018; originally announced August 2018.

  25. arXiv:1802.06048

    stat.ME

    High-dimensional covariance matrix estimation using a low-rank and diagonal decomposition

    Authors: Yilei Wu, Yingli Qin, Mu Zhu

    Abstract: We study high-dimensional covariance/precision matrix estimation under the assumption that the covariance/precision matrix can be decomposed into a low-rank component L and a diagonal component D. The rank of L can either be chosen to be small or controlled by a penalty function. Under moderate conditions on the population covariance/precision matrix itself and on the penalty function, we prove so…

    Submitted 16 February, 2018; originally announced February 2018.

  26. arXiv:1709.05454

    stat.ME math.ST stat.ML

    Statistical inference on random dot product graphs: a survey

    Authors: Avanti Athreya, Donniell E. Fishkind, Keith Levin, Vince Lyzinski, Youngser Park, Yichen Qin, Daniel L. Sussman, Minh Tang, Joshua T. Vogelstein, Carey E. Priebe

    Abstract: The random dot product graph (RDPG) is an independent-edge random graph that is analytically tractable and, simultaneously, either encompasses or can successfully approximate a wide range of random graphs, from relatively simple stochastic block models to complex latent position graphs. In this survey paper, we describe a comprehensive paradigm for statistical inference on random dot product graph…

    Submitted 16 September, 2017; originally announced September 2017.

    Comments: An expository survey paper on a comprehensive paradigm for inference for random dot product graphs, centered on graph adjacency and Laplacian spectral embeddings. Paper outlines requisite background; summarizes theory, methodology, and applications from previous and ongoing work; and closes with a discussion of several open problems

    MSC Class: 62FXX; 62GXX; 62HXX; 05CXX

    Journal ref: Journal of Machine Learning Research, 2018

  27. arXiv:1708.05439

    stat.ME

    Penalized Maximum Tangent Likelihood Estimation and Robust Variable Selection

    Authors: Yichen Qin, Shaobo Li, Yang Li, Yan Yu

    Abstract: We introduce a new class of mean regression estimators -- penalized maximum tangent likelihood estimation -- for high-dimensional regression estimation and variable selection. We first explain the motivations for the key ingredient, maximum tangent likelihood estimation (MTE), and establish its asymptotic properties. We further propose a penalized MTE for variable selection and show that it is…

    Submitted 21 August, 2017; v1 submitted 17 August, 2017; originally announced August 2017.

    Comments: 30 pages, 3 figures

  28. arXiv:1705.03297

    stat.ML

    Semiparametric spectral modeling of the Drosophila connectome

    Authors: Carey E. Priebe, Youngser Park, Minh Tang, Avanti Athreya, Vince Lyzinski, Joshua T. Vogelstein, Yichen Qin, Ben Cocanougher, Katharina Eichler, Marta Zlatic, Albert Cardona

    Abstract: We present semiparametric spectral modeling of the complete larval Drosophila mushroom body connectome. Motivated by a thorough exploratory data analysis of the network via Gaussian mixture modeling (GMM) in the adjacency spectral embedding (ASE) representation space, we introduce the latent structure model (LSM) for network modeling and inference. LSM is a generalization of the stochastic block m…

    Submitted 9 May, 2017; originally announced May 2017.

  29. arXiv:1704.02971

    cs.LG stat.ML

    A Dual-Stage Attention-Based Recurrent Neural Network for Time Series Prediction

    Authors: Yao Qin, Dongjin Song, Haifeng Chen, Wei Cheng, Guofei Jiang, Garrison Cottrell

    Abstract: The Nonlinear autoregressive exogenous (NARX) model, which predicts the current value of a time series based upon its previous values as well as the current and past values of multiple driving (exogenous) series, has been studied for decades. Despite the fact that various NARX models have been developed, few of them can capture the long-term temporal dependencies appropriately and select the relev…

    Submitted 14 August, 2017; v1 submitted 7 April, 2017; originally announced April 2017.

    Comments: International Joint Conference on Artificial Intelligence (IJCAI), 2017

  30. Copula Modeling for Data with Ties

    Authors: Yan Li, Yang Li, Yichen Qin, Jun Yan

    Abstract: Copula modeling has gained much attention in many fields recently with the advantage of separating dependence structure from marginal distributions. In real data, however, serious ties are often present in one or multiple margins, which cause problems to many rank-based statistical methods developed under the assumption of continuous data with no ties. Simple methods such as breaking the ties at r…

    Submitted 20 December, 2016; originally announced December 2016.

    Journal ref: Statistics and Its Interfaces: 2020

  31. arXiv:1612.01801

    stat.CO stat.ME

    Variable Selection with Scalable Bootstrap in Generalized Linear Model for Massive Data

    Authors: Zhibing He, Yichen Qin, Ben-Chang Shia, Yang Li

    Abstract: Bootstrap is commonly used as a tool for non-parametric statistical inference to estimate meaningful parameters in Variable Selection Models. However, for massive dataset that has exponential growth rate, the computation of Bootstrap Variable Selection (BootVS) can be a crucial issue. In this paper, we propose the method of Variable Selection with Bag of Little Bootstraps (BLBVS) on General Linear…

    Submitted 23 December, 2016; v1 submitted 6 December, 2016; originally announced December 2016.

  32. arXiv:1611.09509

    stat.ME

    Model Confidence Bounds for Variable Selection

    Authors: Yang Li, Yuetian Luo, Davide Ferrari, Xiaonan Hu, Yichen Qin

    Abstract: In this article, we introduce the concept of model confidence bounds (MCB) for variable selection in the context of nested models. Similarly to the endpoints in the familiar confidence interval for parameter estimation, the MCB identifies two nested models (upper and lower confidence bound models) containing the true model at a given level of confidence. Instead of trusting a single selected model…

    Submitted 26 July, 2018; v1 submitted 29 November, 2016; originally announced November 2016.

  33. arXiv:1611.02802

    stat.ME

    Pairwise Sequential Randomization and Its Properties

    Authors: Yichen Qin, Yang Li, Wei Ma, Feifang Hu

    Abstract: In comparative studies, such as in causal inference and clinical trials, balancing important covariates is often one of the most important concerns for both efficient and credible comparison. However, chance imbalance still exists in many randomized experiments. This phenomenon of covariate imbalance becomes much more serious as the number of covariates $p$ increases. To address this issue, we int…

    Submitted 26 July, 2018; v1 submitted 8 November, 2016; originally announced November 2016.

  34. arXiv:1310.7278

    stat.AP

    Robust Hypothesis Testing via Lq-Likelihood

    Authors: Yichen Qin, Carey E. Priebe

    Abstract: This article introduces a robust hypothesis testing procedure: the Lq-likelihood-ratio-type test (LqRT). By deriving the asymptotic distribution of this test statistic, the authors demonstrate its robustness both analytically and numerically, and they investigate the properties of both its influence function and its breakdown point. A proposed method to select the tuning parameter q offers a good…

    Submitted 24 September, 2016; v1 submitted 27 October, 2013; originally announced October 2013.

    Comments: 32 pages, 11 figures

  35. arXiv:1302.0355

    stat.ME

    Estimation of the population spectral distribution from a large dimensional sample covariance matrix

    Authors: Weiming Li, Jiaqi Chen, Yingli Qin, Jianfeng Yao, Zhidong Bai

    Abstract: This paper introduces a new method to estimate the spectral distribution of a population covariance matrix from high-dimensional data. The method is founded on a meaningful generalization of the seminal Marcenko-Pastur equation, originally defined in the complex plane, to the real line. Beyond its easy implementation and the established asymptotic consistency, the new estimator outperforms two exis…

    Submitted 2 February, 2013; originally announced February 2013.

    Comments: 16 pages, 4 figures