Showing 1–50 of 161 results for author: Wang, Q

Searching in archive stat.
  1. arXiv:2407.00655  [pdf, other]

    stat.ME

    Markov Switching Multiple-equation Tensor Regressions

    Authors: Roberto Casarin, Radu Craiu, Qing Wang

    Abstract: We propose a new flexible tensor model for multiple-equation regression that accounts for latent regime changes. The model allows for dynamic coefficients and multi-dimensional covariates that vary across equations. We assume the coefficients are driven by a common hidden Markov process that addresses structural breaks to enhance the model flexibility and preserve parsimony. We introduce a new Sof…

    Submitted 30 June, 2024; originally announced July 2024.

  2. arXiv:2406.02234  [pdf, other]

    cs.LG cs.AI math.DS stat.ML

    On the Limitations of Fractal Dimension as a Measure of Generalization

    Authors: Charlie Tan, Inés García-Redondo, Qiquan Wang, Michael M. Bronstein, Anthea Monod

    Abstract: Bounding and predicting the generalization gap of overparameterized neural networks remains a central open problem in theoretical machine learning. Neural network optimization trajectories have been proposed to possess fractal structure, leading to bounds and generalization measures based on notions of fractal dimension on these trajectories. Prominently, both the Hausdorff dimension and the persi…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 17 pages, 6 figures

  3. arXiv:2405.00188  [pdf, other]

    stat.AP econ.TH

    A Revisit of the Optimal Excess-of-Loss Contract

    Authors: Ernest Aboagye, Vali Asimit, Tsz Chai Fung, Liang Peng, Qiuqi Wang

    Abstract: It is well-known that Excess-of-Loss reinsurance has more marketability than Stop-Loss reinsurance, though Stop-Loss reinsurance is the most prominent setting discussed in the optimal (re)insurance design literature. We point out that optimal reinsurance policy under Stop-Loss leads to a zero insolvency probability, which motivates our paper. We provide a remedy to this peculiar property of the op…

    Submitted 30 April, 2024; originally announced May 2024.

  4. arXiv:2403.11497  [pdf, other]

    cs.CV cs.LG stat.ML

    Do CLIPs Always Generalize Better than ImageNet Models?

    Authors: Qizhou Wang, Yong Lin, Yongqiang Chen, Ludwig Schmidt, Bo Han, Tong Zhang

    Abstract: Large vision language models, such as CLIPs, have revolutionized modern machine learning. CLIPs have demonstrated great generalizability under distribution shifts, supported by an increasing body of literature. However, the evaluation datasets for CLIPs are variations primarily designed for ImageNet benchmarks, which may not fully reflect the extent to which CLIPs, e.g., pre-trained on LAION, robu…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: Qizhou Wang, Yong Lin, and Yongqiang Chen contributed equally. Project page: https://counteranimal.github.io

  5. arXiv:2402.13678  [pdf, other]

    stat.CO math.PR stat.ME

    Weak Poincaré inequality comparisons for ideal and hybrid slice sampling

    Authors: Sam Power, Daniel Rudolf, Björn Sprungk, Andi Q. Wang

    Abstract: Using the framework of weak Poincaré inequalities, we provide a general comparison between the Hybrid and Ideal Slice Sampling Markov chains in terms of their Dirichlet forms. In particular, under suitable assumptions Hybrid Slice Sampling will inherit fast convergence from Ideal Slice Sampling and conversely. We apply our results to analyse the convergence of the Independent Metropolis--Hasting…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: 35 pages, 2 figures

    MSC Class: 65C05; 60J22

  6. arXiv:2401.12836  [pdf, other]

    stat.ME stat.CO

    Empirical Likelihood Inference over Decentralized Networks

    Authors: Jinye Du, Qihua Wang

    Abstract: As a nonparametric statistical inference approach, empirical likelihood has been found very useful in numerous occasions. However, it encounters serious computational challenges when applied directly to the modern massive dataset. This article studies empirical likelihood inference over decentralized distributed networks, where the data are locally collected and stored by different nodes. To fully…

    Submitted 23 January, 2024; originally announced January 2024.

  7. arXiv:2401.12827  [pdf, other]

    stat.ME

    Distributed Empirical Likelihood Inference With or Without Byzantine Failures

    Authors: Qihua Wang, Jinye Du, Ying Sheng

    Abstract: Empirical likelihood is a very important nonparametric approach which is of wide application. However, it is hard and even infeasible to calculate the empirical log-likelihood ratio statistic with massive data. The main challenge is the calculation of the Lagrange multiplier. This motivates us to develop a distributed empirical likelihood method by calculating the Lagrange multiplier in a multi-ro…

    Submitted 23 January, 2024; originally announced January 2024.

  8. arXiv:2312.11689  [pdf, ps, other]

    math.PR stat.CO

    Weak Poincaré Inequalities for Markov chains: theory and applications

    Authors: Christophe Andrieu, Anthony Lee, Sam Power, Andi Q. Wang

    Abstract: We investigate the application of Weak Poincaré Inequalities (WPI) to Markov chains to study their rates of convergence and to derive complexity bounds. At a theoretical level we investigate the necessity of the existence of WPIs to ensure $\mathrm{L}^{2}$-convergence, in particular by establishing equivalence with the Resolvent Uniform Positivity-Improving (RUPI) condition and providing a counterex…

    Submitted 18 December, 2023; originally announced December 2023.

  9. arXiv:2312.10596  [pdf, other]

    stat.ME

    A maximin optimal approach for sampling designs in two-phase studies

    Authors: Ruoyu Wang, Qihua Wang, Wang Miao

    Abstract: Data collection costs can vary widely across variables in data science tasks. Two-phase designs are often employed to save data collection costs. In two-phase studies, inexpensive variables are collected for all subjects in the first phase, and expensive variables are measured for a subset of subjects in the second phase based on a predetermined sampling rule. The estimation efficiency under two-p…

    Submitted 25 May, 2024; v1 submitted 16 December, 2023; originally announced December 2023.

  10. arXiv:2311.10848  [pdf, ps, other]

    stat.ME stat.AP

    Addressing Population Heterogeneity for HIV Incidence Estimation Based on Recency Test

    Authors: Qi Wang, Ann Duerr, Fei Gao

    Abstract: Cross-sectional HIV incidence estimation leverages recency test results to determine the HIV incidence of a population of interest, where recency test uses biomarker profiles to infer whether an HIV-positive individual was "recently" infected. This approach possesses an obvious advantage over the conventional cohort follow-up method since it avoids longitudinal follow-up and repeated HIV testing.…

    Submitted 17 November, 2023; originally announced November 2023.

  11. arXiv:2311.03115  [pdf, other]

    cs.CY cs.LG stat.AP

    RELand: Risk Estimation of Landmines via Interpretable Invariant Risk Minimization

    Authors: Mateo Dulce Rubio, Siqi Zeng, Qi Wang, Didier Alvarado, Francisco Moreno, Hoda Heidari, Fei Fang

    Abstract: Landmines remain a threat to war-affected communities for years after conflicts have ended, partly due to the laborious nature of demining tasks. Humanitarian demining operations begin by collecting relevant information from the sites to be cleared, which is then analyzed by human experts to determine the potential risk of remaining landmines. In this paper, we propose RELand system to support the…

    Submitted 6 November, 2023; originally announced November 2023.

  12. arXiv:2310.03164  [pdf, other]

    stat.ME stat.AP

    A Hierarchical Random Effects State-space Model for Modeling Brain Activities from Electroencephalogram Data

    Authors: Xingche Guo, Bin Yang, Ji Meng Loh, Qinxia Wang, Yuanjia Wang

    Abstract: Mental disorders present challenges in diagnosis and treatment due to their complex and heterogeneous nature. Electroencephalogram (EEG) has shown promise as a potential biomarker for these disorders. However, existing methods for analyzing EEG signals have limitations in addressing heterogeneity and capturing complex brain activity patterns between regions. This paper proposes a novel random effe…

    Submitted 27 January, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

  13. arXiv:2309.09872  [pdf, other]

    stat.ME

    Moment-assisted Subsampling based Maximum Likelihood Estimator

    Authors: Miaomiao Su, Qihua Wang, Ruoyu Wang

    Abstract: This paper proposes a moment-assisted subsampling method which can improve the estimation efficiency of existing subsampling estimators. The motivation behind this approach stems from the fact that sample moments can be efficiently computed even if the sample size of the whole data set is huge. Through the generalized method of moments, this method incorporates informative sample moments of the wh…

    Submitted 18 September, 2023; originally announced September 2023.

  14. arXiv:2309.08489  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Towards Word-Level End-to-End Neural Speaker Diarization with Auxiliary Network

    Authors: Yiling Huang, Weiran Wang, Guanlong Zhao, Hank Liao, Wei Xia, Quan Wang

    Abstract: While standard speaker diarization attempts to answer the question "who spoke when", most of the relevant applications in reality are more interested in determining "who spoke what". Whether it is the conventional modularized approach or the more recent end-to-end neural diarization (EEND), an additional automatic speech recognition (ASR) model and an orchestration algorithm are required to associat…

    Submitted 15 September, 2023; originally announced September 2023.

  15. arXiv:2308.10091  [pdf, other]

    stat.AP

    Incorporating Connectivity among Internet Search Data for Enhanced Influenza-like Illness Tracking

    Authors: Shaoyang Ning, Ahmed Hussain, Qing Wang

    Abstract: Big data collected from the Internet possess great potential to reveal the ever-changing trends in society. In particular, accurate infectious disease tracking with Internet data has grown in popularity, providing invaluable information for public health decision makers and the general public. However, much of the complex connectivity among the Internet search data is not effectively addressed amo…

    Submitted 19 August, 2023; originally announced August 2023.

  16. arXiv:2307.02904  [pdf, other]

    math.AT stat.ML

    Computable Stability for Persistence Rank Function Machine Learning

    Authors: Qiquan Wang, Inés García-Redondo, Pierre Faugère, Anthea Monod, Gregory Henselman-Petrusek

    Abstract: Persistent homology barcodes and diagrams are a cornerstone of topological data analysis. Widely used in many real data settings, they relate variation in topological information (as measured by cellular homology) with variation in data, however, they are challenging to use in statistical settings due to their complex geometric structure. In this paper, we revisit the persistent homology rank func…

    Submitted 6 July, 2023; originally announced July 2023.

  17. arXiv:2305.18779  [pdf, other]

    cs.LG math.AP math.OC stat.ML

    It begins with a boundary: A geometric view on probabilistically robust learning

    Authors: Leon Bungert, Nicolás García Trillos, Matt Jacobs, Daniel McKenzie, Đorđe Nikolić, Qingsong Wang

    Abstract: Although deep neural networks have achieved super-human performance on many classification tasks, they often exhibit a worrying lack of robustness towards adversarially generated examples. Thus, considerable effort has been invested into reformulating Empirical Risk Minimization (ERM) into an adversarially robust framework. Recently, attention has shifted towards approaches which interpolate betwe…

    Submitted 30 May, 2023; originally announced May 2023.

  18. arXiv:2305.13852  [pdf, ps, other]

    stat.AP

    Learning Optimal Biomarker-Guided Treatment Policy for Chronic Disorders

    Authors: Bin Yang, Xingche Guo, Ji Meng Loh, Qinxia Wang, Yuanjia Wang

    Abstract: Electroencephalogram (EEG) provides noninvasive measures of brain activity and is found to be valuable for diagnosis of some chronic disorders. Specifically, pre-treatment EEG signals in alpha and theta frequency bands have demonstrated some association with anti-depressant response, which is well-known to have low response rate. We aim to design an integrated pipeline that improves the response r…

    Submitted 23 May, 2023; originally announced May 2023.

  19. arXiv:2303.04040  [pdf, other]

    cs.LG stat.AP stat.ML

    Uncertainty Quantification of Spatiotemporal Travel Demand with Probabilistic Graph Neural Networks

    Authors: Qingyi Wang, Shenhao Wang, Dingyi Zhuang, Haris Koutsopoulos, Jinhua Zhao

    Abstract: Recent studies have significantly improved the prediction accuracy of travel demand using graph neural networks. However, these studies largely ignored uncertainty that inevitably exists in travel demand prediction. To fill this gap, this study proposes a framework of probabilistic graph neural networks (Prob-GNN) to quantify the spatiotemporal uncertainty of travel demand. This Prob-GNN framework…

    Submitted 22 February, 2024; v1 submitted 7 March, 2023; originally announced March 2023.

  20. arXiv:2211.16919  [pdf, other]

    cs.HC stat.AP

    The Mood of the Sunlight: Visualization of the Sunlight Data for Public Art

    Authors: Yifan Wang, Nan Li, Suxuan Jiang, Jinlong Xu, Qi Wang, Shaomin Shen, Ning Ding

    Abstract: The application of data visualization in public art attracts increasing attention. In this paper, we present the design and implementation of a visualization method for sunlight data collected over a long period of time with an industrial camera. The proposed method makes use of the saturation and value information of collected sunlight image data in Hue Saturation Value color model to show the va…

    Submitted 3 January, 2024; v1 submitted 30 November, 2022; originally announced November 2022.

  21. arXiv:2211.08959  [pdf, other]

    math.PR stat.CO

    Explicit convergence bounds for Metropolis Markov chains: isoperimetry, spectral gaps and profiles

    Authors: Christophe Andrieu, Anthony Lee, Sam Power, Andi Q. Wang

    Abstract: We derive the first explicit bounds for the spectral gap of a random walk Metropolis algorithm on $R^d$ for any value of the proposal variance, which when scaled appropriately recovers the correct $d^{-1}$ dependence on dimension for suitably regular invariant distributions. We also obtain explicit bounds on the ${\rm L}^2$-mixing time for a broad class of models. In obtaining these results, we re…

    Submitted 31 October, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

  22. arXiv:2210.09901  [pdf, other]

    stat.CO

    Sampling using Adaptive Regenerative Processes

    Authors: Hector McKimm, Andi Q Wang, Murray Pollock, Christian P Robert, Gareth O Roberts

    Abstract: Enriching Brownian motion with regenerations from a fixed regeneration distribution $μ$ at a particular regeneration rate $κ$ results in a Markov process that has a target distribution $π$ as its invariant distribution. For the purpose of Monte Carlo inference, implementing such a scheme requires firstly selection of regeneration distribution $μ$, and secondly computation of a specific constant…

    Submitted 20 February, 2024; v1 submitted 18 October, 2022; originally announced October 2022.

    Comments: 43 pages, 10 figures

  23. arXiv:2209.00991  [pdf, ps, other]

    q-fin.RM math.ST stat.ME

    E-backtesting

    Authors: Qiuqi Wang, Ruodu Wang, Johanna Ziegel

    Abstract: In the recent Basel Accords, the Expected Shortfall (ES) replaces the Value-at-Risk (VaR) as the standard risk measure for market risk in the banking sector, making it the most important risk measure in financial regulation. One of the most challenging tasks in risk modeling practice is to backtest ES forecasts provided by financial institutions. To design a model-free backtesting procedure for ES…

    Submitted 23 May, 2023; v1 submitted 26 August, 2022; originally announced September 2022.

  24. arXiv:2208.05239  [pdf, ps, other]

    math.PR stat.CO

    Poincaré inequalities for Markov chains: a meeting with Cheeger, Lyapunov and Metropolis

    Authors: Christophe Andrieu, Anthony Lee, Sam Power, Andi Q. Wang

    Abstract: We develop a theory of weak Poincaré inequalities to characterize convergence rates of ergodic Markov chains. Motivated by the application of Markov chains in the context of algorithms, we develop a relevant set of tools which enable the practical study of convergence rates in the setting of Markov chain Monte Carlo methods, but also well beyond.

    Submitted 10 August, 2022; originally announced August 2022.

    Comments: 80 pages

    MSC Class: 60J22; 65C05

  25. arXiv:2207.00924  [pdf, other]

    stat.ME

    Stability Approach to Regularization Selection for Reduced-Rank Regression

    Authors: Canhong Wen, Qin Wang, Yuan Jiang

    Abstract: The reduced-rank regression model is a popular model to deal with multivariate response and multiple predictors, and is widely used in biology, chemometrics, econometrics, engineering, and other fields. In the reduced-rank regression modelling, a central objective is to estimate the rank of the coefficient matrix that represents the number of effective latent factors in predicting the multivariate…

    Submitted 2 July, 2022; originally announced July 2022.

  26. arXiv:2206.12882  [pdf, other]

    stat.ME cs.AI cs.LG

    fETSmcs: Feature-based ETS model component selection

    Authors: Lingzhi Qi, Xixi Li, Qiang Wang, Suling Jia

    Abstract: The well-developed ETS (ExponenTial Smoothing or Error, Trend, Seasonality) method incorporating a family of exponential smoothing models in state space representation has been widely used for automatic forecasting. The existing ETS method uses information criteria for model selection by choosing an optimal model with the smallest information criterion among all models fitted to a given time serie…

    Submitted 26 June, 2022; originally announced June 2022.

  27. arXiv:2205.13814  [pdf, ps, other]

    cs.LG stat.ML

    Global Convergence of Over-parameterized Deep Equilibrium Models

    Authors: Zenan Ling, Xingyu Xie, Qiuhao Wang, Zongpeng Zhang, Zhouchen Lin

    Abstract: A deep equilibrium model (DEQ) is implicitly defined through an equilibrium point of an infinite-depth weight-tied model with an input-injection. Instead of infinite computations, it solves an equilibrium point directly with root-finding and computes gradients with implicit differentiation. The training dynamics of over-parameterized DEQs are investigated in this study. By supposing a condition on…

    Submitted 28 March, 2023; v1 submitted 27 May, 2022; originally announced May 2022.

    Comments: Accepted by AISTATS 2023

  28. arXiv:2204.04567  [pdf, other]

    cs.CV cs.LG stat.ML

    Joint Distribution Matters: Deep Brownian Distance Covariance for Few-Shot Classification

    Authors: Jiangtao Xie, Fei Long, Jiaming Lv, Qilong Wang, Peihua Li

    Abstract: Few-shot classification is a challenging problem as only very few training examples are given for each new task. One of the effective research lines to address this challenge focuses on learning deep representations driven by a similarity measure between a query image and few support images of some class. Statistically, this amounts to measure the dependency of image features, viewed as random vec…

    Submitted 9 April, 2022; originally announced April 2022.

    Comments: Accepted to CVPR 2022 as an oral presentation. Equal contribution from first two authors

  29. arXiv:2202.12169  [pdf, other]

    eess.AS cs.LG stat.ML

    Closing the Gap between Single-User and Multi-User VoiceFilter-Lite

    Authors: Rajeev Rikhye, Quan Wang, Qiao Liang, Yanzhang He, Ian McGraw

    Abstract: VoiceFilter-Lite is a speaker-conditioned voice separation model that plays a crucial role in improving speech recognition and speaker verification by suppressing overlapping speech from non-target speakers. However, one limitation of VoiceFilter-Lite, and other speaker-conditioned speech models in general, is that these models are usually limited to a single target speaker. This is undesirable as…

    Submitted 26 April, 2022; v1 submitted 24 February, 2022; originally announced February 2022.

  30. arXiv:2202.12163  [pdf, other]

    eess.AS cs.CL cs.LG stat.ML

    Attentive Temporal Pooling for Conformer-based Streaming Language Identification in Long-form Speech

    Authors: Quan Wang, Yang Yu, Jason Pelecanos, Yiling Huang, Ignacio Lopez Moreno

    Abstract: In this paper, we introduce a novel language identification system based on conformer layers. We propose an attentive temporal pooling mechanism to allow the model to carry information in long-form audio via a recurrent form, such that the inference can be performed in a streaming fashion. Additionally, we investigate two domain adaptation approaches to allow adapting an existing language identifi…

    Submitted 1 May, 2022; v1 submitted 24 February, 2022; originally announced February 2022.

  31. arXiv:2202.07939  [pdf, other]

    cs.LG eess.SP stat.AP

    Clustering Enabled Few-Shot Load Forecasting

    Authors: Qiyuan Wang, Zhihui Chen, Chenye Wu

    Abstract: While the advanced machine learning algorithms are effective in load forecasting, they often suffer from low data utilization, and hence their superior performance relies on massive datasets. This motivates us to design machine learning algorithms with improved data utilization. Specifically, we consider the load forecasting for a new user in the system by observing only few shots (data points) of…

    Submitted 16 February, 2022; originally announced February 2022.

    Comments: *The first two authors contributed equally to this work, and hence are co-first authors of this work. C. Wu is the corresponding author. This work was supported in part by the Shenzhen Institute of Artificial Intelligence and Robotics for Society

  32. arXiv:2112.13651  [pdf, other]

    stat.ME

    Factor modelling for high-dimensional functional time series

    Authors: Shaojun Guo, Xinghao Qiao, Qingsong Wang

    Abstract: Many economic and scientific problems involve the analysis of high-dimensional functional time series, where the number of functional variables ($p$) diverges as the number of serially dependent observations ($n$) increases. In this paper, we present a novel functional factor model for high-dimensional functional time series that maintains and makes use of the functional and dynamic structure to a…

    Submitted 19 March, 2022; v1 submitted 27 December, 2021; originally announced December 2021.

  33. arXiv:2112.05605  [pdf, other]

    stat.CO cs.LG

    Comparison of Markov chains via weak Poincaré inequalities with application to pseudo-marginal MCMC

    Authors: Christophe Andrieu, Anthony Lee, Sam Power, Andi Q. Wang

    Abstract: We investigate the use of a certain class of functional inequalities known as weak Poincaré inequalities to bound convergence of Markov chains to equilibrium. We show that this enables the straightforward and transparent derivation of subgeometric convergence bounds for methods such as the Independent Metropolis--Hastings sampler and pseudo-marginal methods for intractable likelihoods, the latter…

    Submitted 9 August, 2022; v1 submitted 10 December, 2021; originally announced December 2021.

    Comments: Revised manuscript; includes additional results

    MSC Class: 65C40; 65C05; 62J10

  34. arXiv:2111.05859  [pdf, other]

    math.ST math.PR stat.CO stat.ME

    PDMP Monte Carlo methods for piecewise-smooth densities

    Authors: Augustin Chevallier, Sam Power, Andi Q. Wang, Paul Fearnhead

    Abstract: There has been substantial interest in developing Markov chain Monte Carlo algorithms based on piecewise-deterministic Markov processes. However, existing algorithms can only be used if the target distribution of interest is differentiable everywhere. The key to adapting these algorithms so that they can sample from densities with discontinuities is defining appropriate dynamics for the process…

    Submitted 10 November, 2021; originally announced November 2021.

  35. arXiv:2110.11856  [pdf, other]

    stat.ME cs.SI math.ST

    L-2 Regularized maximum likelihood for $β$-model in large and sparse networks

    Authors: Yu Zhang, Qiuping Wang, Yuan Zhang, Ting Yan, Jing Luo

    Abstract: The $β$-model is a powerful tool for modeling network generation driven by node degree heterogeneity. Its simple yet expressive nature particularly well-suits large and sparse networks, where many network models become infeasible due to computational challenge and observation scarcity. However, existing estimation algorithms for $β$-model do not scale up; and theoretical understandings remain limi…

    Submitted 24 October, 2021; v1 submitted 22 October, 2021; originally announced October 2021.

    Report number: 000000

    MSC Class: 62F12; 62F10; 91D30

  36. arXiv:2110.02554  [pdf, other]

    cs.LG stat.ML

    A Regularized Wasserstein Framework for Graph Kernels

    Authors: Asiri Wijesinghe, Qing Wang, Stephen Gould

    Abstract: We propose a learning framework for graph kernels, which is theoretically grounded on regularizing optimal transport. This framework provides a novel optimal transport distance metric, namely Regularized Wasserstein (RW) discrepancy, which can preserve both features and structure of graphs via Wasserstein distances on features and their local variations, local barycenters and global connectivity.…

    Submitted 8 October, 2021; v1 submitted 6 October, 2021; originally announced October 2021.

    Comments: 21st IEEE International Conference on Data Mining (ICDM 2021)

  37. arXiv:2109.13819  [pdf, other]

    math.PR stat.ME

    Perturbation theory for killed Markov processes and quasi-stationary distributions

    Authors: Daniel Rudolf, Andi Q. Wang

    Abstract: We investigate the stability of quasi-stationary distributions of killed Markov processes to perturbations of the generator. In the first setting, we consider a general bounded self-adjoint perturbation operator, and after that, study a particular unbounded perturbation corresponding to the truncation of the killing rate. In both scenarios, we quantify the difference between eigenfunctions of the…

    Submitted 28 September, 2021; originally announced September 2021.

    Comments: 34 pages, 3 figures

    MSC Class: 60J70; 47A55; 65C05; 60J22

  38. One-Hot Graph Encoder Embedding

    Authors: Cencheng Shen, Qizhe Wang, Carey E. Priebe

    Abstract: In this paper we propose a lightning fast graph embedding method called one-hot graph encoder embedding. It has a linear computational complexity and the capacity to process billions of edges within minutes on standard PC -- making it an ideal candidate for huge graph processing. It is applicable to either adjacency matrix or graph Laplacian, and can be viewed as a transformation of the spectral e…

    Submitted 1 December, 2022; v1 submitted 27 September, 2021; originally announced September 2021.

    Comments: 7 pages main + 7 pages appendix

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence 45(6), 7933 - 7938, 2023

  39. arXiv:2109.10957  [pdf, other]

    cs.RO stat.AP

    Real Robot Challenge: A Robotics Competition in the Cloud

    Authors: Stefan Bauer, Felix Widmaier, Manuel Wüthrich, Annika Buchholz, Sebastian Stark, Anirudh Goyal, Thomas Steinbrenner, Joel Akpo, Shruti Joshi, Vincent Berenz, Vaibhav Agrawal, Niklas Funk, Julen Urain De Jesus, Jan Peters, Joe Watson, Claire Chen, Krishnan Srinivasan, Junwu Zhang, Jeffrey Zhang, Matthew R. Walter, Rishabh Madan, Charles Schaff, Takahiro Maeda, Takuma Yoneda, Denis Yarats, et al. (17 additional authors not shown)

    Abstract: Dexterous manipulation remains an open problem in robotics. To coordinate efforts of the research community towards tackling this problem, we propose a shared benchmark. We designed and built robotic platforms that are hosted at MPI for Intelligent Systems and can be accessed remotely. Each platform consists of three robotic fingers that are capable of dexterous object manipulation. Users are able…

    Submitted 10 June, 2022; v1 submitted 22 September, 2021; originally announced September 2021.

  40. arXiv:2108.12600  [pdf, other]

    stat.ME

    A robust fusion-extraction procedure with summary statistics in the presence of biased sources

    Authors: Ruoyu Wang, Qihua Wang, Wang Miao

    Abstract: Information from various data sources is increasingly available nowadays. However, some of the data sources may produce biased estimation due to commonly encountered biased sampling, population heterogeneity, or model misspecification. This calls for statistical methods to combine information in the presence of biased sources. In this paper, a robust data fusion-extraction method is proposed. The…

    Submitted 5 February, 2023; v1 submitted 28 August, 2021; originally announced August 2021.

  41. arXiv:2108.00259  [pdf, other]

    stat.ML cs.AI cs.LG math.OC

    How much pre-training is enough to discover a good subnetwork?

    Authors: Cameron R. Wolfe, Fangshuo Liao, Qihan Wang, Junhyung Lyle Kim, Anastasios Kyrillidis

    Abstract: Neural network pruning is useful for discovering efficient, high-performing subnetworks within pre-trained, dense network architectures. More often than not, it involves a three-step process -- pre-training, pruning, and re-training -- that is computationally expensive, as the dense model must be fully pre-trained. While previous work has revealed through experiments the relationship between the a…

    Submitted 22 August, 2023; v1 submitted 31 July, 2021; originally announced August 2021.

    Comments: 29 pages

    MSC Class: 68T07

    ACM Class: I.2.6; I.2.10; I.4.0

  42. Evaluating Effectiveness of Public Health Intervention Strategies for Mitigating COVID-19 Pandemic

    Authors: Shanghong Xie, Wenbo Wang, Qinxia Wang, Yuanjia Wang, Donglin Zeng

    Abstract: Coronavirus disease 2019 (COVID-19) pandemic is an unprecedented global public health challenge. In the United States (US), state governments have implemented various non-pharmaceutical interventions (NPIs), such as physical distance closure (lockdown), stay-at-home order, mandatory facial mask in public in response to the rapid spread of COVID-19. To evaluate the effectiveness of these NPIs, we p…

    Submitted 20 July, 2021; originally announced July 2021.

    Journal ref: Statistics in Medicine 41 (9) (2022) 3820-3836

  43. arXiv:2107.05328  [pdf, other]

    cs.LG stat.ML

    Structured Directional Pruning via Perturbation Orthogonal Projection

    Authors: Yinchuan Li, Xiaofeng Liu, Yunfeng Shao, Qing Wang, Yanhui Geng

    Abstract: Structured pruning is an effective compression technique to reduce the computation of neural networks, which is usually achieved by adding perturbations to reduce network parameters at the cost of slightly increasing training loss. A more reasonable approach is to find a sparse minimizer along the flat minimum valley found by optimizers, i.e. stochastic gradient descent, which keeps the training l…

    Submitted 21 October, 2021; v1 submitted 12 July, 2021; originally announced July 2021.

  44. arXiv:2106.02475  [pdf, ps, other]

    stat.ME

    Distributed nonparametric regression imputation for missing response problems with large-scale data

    Authors: Ruoyu Wang, Miaomiao Su, Qihua Wang

    Abstract: Nonparametric regression imputation is commonly used in missing data analysis. However, it suffers from the "curse of dimension". The problem can be alleviated by the explosive sample size in the era of big data, while the large-scale data size presents some challenges on the storage of data and the calculation of estimators. These challenges make the classical nonparametric regression impu…

    Submitted 8 January, 2023; v1 submitted 4 June, 2021; originally announced June 2021.

    Journal ref: Journal of Machine Learning Research, 2023

  45. arXiv:2104.02125  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    SpeakerStew: Scaling to Many Languages with a Triaged Multilingual Text-Dependent and Text-Independent Speaker Verification System

    Authors: Roza Chojnacka, Jason Pelecanos, Quan Wang, Ignacio Lopez Moreno

    Abstract: In this paper, we describe SpeakerStew - a hybrid system to perform speaker verification on 46 languages. Two core ideas were explored in this system: (1) Pooling training data of different languages together for multilingual generalization and reducing development cycles; (2) A novel triage mechanism between text-dependent and text-independent models to reduce runtime cost and expected latency. T…

    Submitted 15 June, 2021; v1 submitted 5 April, 2021; originally announced April 2021.

  46. arXiv:2103.16810  [pdf, other]

    stat.ME

    An Expectation-Maximization Algorithm for Continuous-time Hidden Markov Models

    Authors: Qingcan Wang, Weinan E

    Abstract: We propose a unified framework that extends the inference methods for classical hidden Markov models to continuous settings, where both the hidden states and observations occur in continuous time. Two different settings are analyzed: hidden jump process with a finite state space, and hidden diffusion process with a continuous state space. For each setting, we first estimate the hidden states given…

    Submitted 17 June, 2021; v1 submitted 31 March, 2021; originally announced March 2021.

    MSC Class: 60J25; 65C40

  47. arXiv:2103.13879  [pdf, other]

    cs.SI physics.soc-ph stat.AP

    Examining mobility data justice during 2017 Hurricane Harvey

    Authors: Hengfang Deng, Qi Wang

    Abstract: Natural disasters can significantly disrupt human mobility in urban areas. Studies have attempted to understand and quantify such disruptions using crowdsourced mobility data sets. However, limited research has studied the justice issues of mobility data in the context of natural disasters. The lack of research leaves us without an empirical foundation to quantify and control the possible biases i…

    Submitted 20 March, 2021; originally announced March 2021.

  48. arXiv:2103.09527  [pdf, other]

    stat.ML cs.LG

    Implicit Normalizing Flows

    Authors: Cheng Lu, Jianfei Chen, Chongxuan Li, Qiuhao Wang, Jun Zhu

    Abstract: Normalizing flows define a probability distribution by an explicit invertible transformation $\boldsymbol{\mathbf{z}}=f(\boldsymbol{\mathbf{x}})$. In this work, we present implicit normalizing flows (ImpFlows), which generalize normalizing flows by allowing the mapping to be implicitly defined by the roots of an equation…

    Submitted 17 March, 2021; originally announced March 2021.

  49. arXiv:2102.05829  [pdf, other]

    cs.LG stat.ML

    Causal Inference for Time series Analysis: Problems, Methods and Evaluation

    Authors: Raha Moraffah, Paras Sheth, Mansooreh Karami, Anchit Bhattacharya, Qianru Wang, Anique Tahir, Adrienne Raglin, Huan Liu

    Abstract: Time series data is a collection of chronological observations which is generated by several domains such as medical and financial fields. Over the years, different tasks such as classification, forecasting, and clustering have been proposed to analyze this type of data. Time series data has been also used to study the effect of interventions over time. Moreover, in many fields of science, learnin…

    Submitted 10 February, 2021; originally announced February 2021.

  50. arXiv:2102.04843  [pdf, other]

    stat.AP

    A Vector Autoregression Prediction Model for COVID-19 Outbreak

    Authors: Qinan Wang, Yaomu Zhou, Xiaofei Chen

    Abstract: Since two people came down a county of north Seattle with positive COVID-19 (coronavirus-19) in 2019, the current total cases in the United States (U.S.) are over 12 million. Predicting the pandemic trend under effective variables is crucial to help find a way to control the epidemic. Based on available literature, we propose a validated Vector Autoregression (VAR) time series model to predict the…

    Submitted 6 February, 2021; originally announced February 2021.

    Comments: 14 pages, 4 figures