Showing 1–50 of 349 results for author: Xu, Y

Searching in archive stat.
  1. arXiv:2406.16507  [pdf, other]

    stat.ME stat.ML

    Statistical ranking with dynamic covariates

    Authors: Pinjun Dong, Ruijian Han, Binyan Jiang, Yiming Xu

    Abstract: We consider a covariate-assisted ranking model grounded in the Plackett--Luce framework. Unlike existing works focusing on pure covariates or individual effects with fixed covariates, our approach integrates individual effects with dynamic covariates. This added flexibility enhances realistic ranking yet poses significant challenges for analyzing the associated estimation procedures. This paper ma…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 40 pages; 8 figures

  2. arXiv:2406.08209  [pdf, other]

    stat.ML cs.LG math.OC

    Forward-Euler time-discretization for Wasserstein gradient flows can be wrong

    Authors: Yewei Xu, Qin Li

    Abstract: In this note, we examine the forward-Euler discretization for simulating Wasserstein gradient flows. We provide two counter-examples showcasing the failure of this discretization even for a simple case where the energy functional is defined as the KL divergence against some nicely structured probability densities. A simple explanation of this failure is also discussed.

    Submitted 12 June, 2024; originally announced June 2024.

    MSC Class: 65M12

  3. arXiv:2406.05193  [pdf, ps, other]

    stat.ME stat.CO

    Probabilistic Clustering using Shared Latent Variable Model for Assessing Alzheimers Disease Biomarkers

    Authors: Yizhen Xu, Scott Zeger, Zheyu Wang

    Abstract: The preclinical stage of many neurodegenerative diseases can span decades before symptoms become apparent. Understanding the sequence of preclinical biomarker changes provides a critical opportunity for early diagnosis and effective intervention prior to significant loss of patients' brain functions. The main challenge to early detection lies in the absence of direct observation of the disease sta…

    Submitted 7 June, 2024; originally announced June 2024.

  4. arXiv:2406.03628  [pdf, other]

    stat.ML cs.LG

    Synthetic Oversampling: Theory and A Practical Approach Using LLMs to Address Data Imbalance

    Authors: Ryumei Nakada, Yichen Xu, Lexin Li, Linjun Zhang

    Abstract: Imbalanced data and spurious correlations are common challenges in machine learning and data science. Oversampling, which artificially increases the number of instances in the underrepresented classes, has been widely adopted to tackle these challenges. In this article, we introduce OPAL (OversamPling with Artificial LLM-generated data), a systematic oversamplin…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 59 pages, 7 figures

  5. arXiv:2406.00827  [pdf, other]

    econ.EM stat.ME

    LaLonde (1986) after Nearly Four Decades: Lessons Learned

    Authors: Guido Imbens, Yiqing Xu

    Abstract: In 1986, Robert LaLonde published an article that compared nonexperimental estimates to experimental benchmarks (LaLonde 1986). He concluded that the nonexperimental methods at the time could not systematically replicate experimental benchmarks, casting doubt on the credibility of these methods. Following LaLonde's critical assessment, there have been significant methodological advances and practi…

    Submitted 8 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

  6. arXiv:2405.20451  [pdf, other]

    stat.ML cs.LG math.OC

    Statistical Properties of Robust Satisficing

    Authors: Zhiyi Li, Yunbei Xu, Ruohan Zhan

    Abstract: The Robust Satisficing (RS) model is an emerging approach to robust optimization, offering streamlined procedures and robust generalization across various applications. However, the statistical theory of RS remains unexplored in the literature. This paper fills in the gap by comprehensively analyzing the theoretical properties of the RS model. Notably, the RS structure offers a more straightforwar…

    Submitted 30 May, 2024; originally announced May 2024.

  7. arXiv:2405.17535  [pdf, other]

    cs.LG cs.AI stat.ML

    Calibrated Dataset Condensation for Faster Hyperparameter Search

    Authors: Mucong Ding, Yuancheng Xu, Tahseen Rabbani, Xiaoyu Liu, Brian Gravelle, Teresa Ranadive, Tai-Ching Tuan, Furong Huang

    Abstract: Dataset condensation can be used to reduce the computational cost of training multiple models on a large dataset by condensing the training dataset into a small synthetic set. State-of-the-art approaches rely on matching the model gradients between the real and synthetic data. However, there is no theoretical guarantee of the generalizability of the condensed data: data condensation often generali…

    Submitted 27 May, 2024; originally announced May 2024.

  8. arXiv:2405.12838  [pdf, ps, other]

    quant-ph stat.CO

    Quantum Non-Identical Mean Estimation: Efficient Algorithms and Fundamental Limits

    Authors: Jiachen Hu, Tongyang Li, Xinzhao Wang, Yecheng Xue, Chenyi Zhang, Han Zhong

    Abstract: We systematically investigate quantum algorithms and lower bounds for mean estimation given query access to non-identically distributed samples. On the one hand, we give quantum mean estimators with quadratic quantum speed-up given samples from different bounded or sub-Gaussian random variables. On the other hand, we prove that, in general, it is impossible for any quantum algorithm to achieve qua…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 31 pages, 0 figure. To appear in the 19th Theory of Quantum Computation, Communication and Cryptography (TQC 2024)

  9. arXiv:2405.08886  [pdf, other]

    cs.LG stat.ML

    The Pitfalls and Promise of Conformal Inference Under Adversarial Attacks

    Authors: Ziquan Liu, Yufei Cui, Yan Yan, Yi Xu, Xiangyang Ji, Xue Liu, Antoni B. Chan

    Abstract: In safety-critical applications such as medical imaging and autonomous driving, where decisions have profound implications for patient health and road safety, it is imperative to maintain both high adversarial robustness to protect against potential adversarial attacks and reliable uncertainty quantification in decision-making. With extensive research focused on enhancing adversarial robustness th…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: ICML2024

  10. arXiv:2405.00417  [pdf, other]

    cs.LG stat.ME stat.ML

    Conformal Risk Control for Ordinal Classification

    Authors: Yunpeng Xu, Wenge Guo, Zhi Wei

    Abstract: As a natural extension to the standard conformal prediction method, several conformal risk control methods have been recently developed and applied to various learning problems. In this work, we seek to control the conformal risk in expectation for ordinal classification tasks, which have broad applications to many real problems. For this purpose, we firstly formulated the ordinal classification t…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: 17 pages, 8 figures, 2 table; 1 supplementary page

    Journal ref: In UAI 2023: The 39th Conference on Uncertainty in Artificial Intelligence

  11. arXiv:2404.17769  [pdf, other]

    cs.IR stat.ME stat.ML

    Conformal Ranked Retrieval

    Authors: Yunpeng Xu, Wenge Guo, Zhi Wei

    Abstract: Given the wide adoption of ranked retrieval techniques in various information systems that significantly impact our daily lives, there is an increasing need to assess and address the uncertainty inherent in their predictions. This paper introduces a novel method using the conformal risk control framework to quantitatively measure and manage risks in the context of ranked retrieval problems. Our re…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: 14 pages, 6 figures, 1 table; 7 supplementary pages, 12 supplementary figures, 2 supplementary tables

  12. arXiv:2404.13204  [pdf, other]

    stat.AP stat.CO

    Scalable Bayesian Image-on-Scalar Regression for Population-Scale Neuroimaging Data Analysis

    Authors: Yuliang Xu, Timothy D. Johnson, Thomas E. Nichols, Jian Kang

    Abstract: Bayesian Image-on-Scalar Regression (ISR) offers significant advantages for neuroimaging data analysis, including flexibility and the ability to quantify uncertainty. However, its application to large-scale imaging datasets, such as found in the UK Biobank, is hindered by the computational demands of traditional posterior computation methods, as well as the challenge of individual-specific brain m…

    Submitted 15 June, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

  13. arXiv:2404.03804  [pdf, other]

    stat.ML cs.LG stat.AP stat.ME

    TransformerLSR: Attentive Joint Model of Longitudinal Data, Survival, and Recurrent Events with Concurrent Latent Structure

    Authors: Zhiyue Zhang, Yao Zhao, Yanxun Xu

    Abstract: In applications such as biomedical studies, epidemiology, and social sciences, recurrent events often co-occur with longitudinal measurements and a terminal event, such as death. Therefore, jointly modeling longitudinal measurements, recurrent events, and survival data while accounting for their dependencies is critical. While joint models for the three components exist in statistical literature,…

    Submitted 4 April, 2024; originally announced April 2024.

  14. arXiv:2404.01467  [pdf]

    cs.SI physics.soc-ph stat.AP

    Transnational Network Dynamics of Problematic Information Diffusion

    Authors: Esteban Villa-Turek, Rod Abhari, Erik C. Nisbet, Yu Xu, Ayse Deniz Lokmanoglu

    Abstract: This study maps the spread of two cases of COVID-19 conspiracy theories and misinformation in Spanish and French in Latin American and French-speaking communities on Facebook, and thus contributes to understanding the dynamics, reach and consequences of emerging transnational misinformation networks. The findings show that co-sharing behavior of public Facebook groups created transnational network…

    Submitted 1 April, 2024; originally announced April 2024.

  15. arXiv:2404.01273  [pdf, other]

    cs.LG cs.CL stat.ME

    TWIN-GPT: Digital Twins for Clinical Trials via Large Language Model

    Authors: Yue Wang, Tianfan Fu, Yinlong Xu, Zihan Ma, Hongxia Xu, Yingzhou Lu, Bang Du, Honghao Gao, Jian Wu

    Abstract: Clinical trials are indispensable for medical research and the development of new treatments. However, clinical trials often involve thousands of participants and can span several years to complete, with a high probability of failure during the process. Recently, there has been a burgeoning interest in virtual clinical trials, which simulate real-world scenarios and hold the potential to significa…

    Submitted 28 June, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

  16. arXiv:2403.03353  [pdf, ps, other]

    stat.ML cs.LG math.FA

    Hypothesis Spaces for Deep Learning

    Authors: Rui Wang, Yuesheng Xu, Mingsong Yan

    Abstract: This paper introduces a hypothesis space for deep learning that employs deep neural networks (DNNs). By treating a DNN as a function of two variables, the physical variable and parameter variable, we consider the primitive set of the DNNs for the parameter variable located in a set of the weight matrices and biases determined by a prescribed depth and widths of the DNNs. We then complete the linea…

    Submitted 11 March, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

  17. arXiv:2401.15248  [pdf, other]

    cs.LG stat.ML

    Better Representations via Adversarial Training in Pre-Training: A Theoretical Perspective

    Authors: Yue Xing, Xiaofeng Lin, Qifan Song, Yi Xu, Belinda Zeng, Guang Cheng

    Abstract: Pre-training is known to generate universal representations for downstream tasks in large-scale deep learning such as large language models. Existing literature, e.g., \cite{kim2020adversarial}, empirically observe that the downstream tasks can inherit the adversarial robustness of the pre-trained model. We provide theoretical justifications for this robustness inheritance phenomenon. Our theoreti…

    Submitted 26 January, 2024; originally announced January 2024.

    Comments: To appear in AISTATS2024

  18. arXiv:2401.11380  [pdf, other]

    cs.LG math.ST stat.ME stat.ML

    MoMA: Model-based Mirror Ascent for Offline Reinforcement Learning

    Authors: Mao Hong, Zhiyue Zhang, Yue Wu, Yanxun Xu

    Abstract: Model-based offline reinforcement learning methods (RL) have achieved state-of-the-art performance in many decision-making problems thanks to their sample efficiency and generalizability. Despite these advancements, existing model-based offline RL approaches either focus on theoretical studies without developing practical algorithms or rely on a restricted parametric policy space, thus not fully l…

    Submitted 20 January, 2024; originally announced January 2024.

  19. arXiv:2401.08463  [pdf, other]

    math.ST stat.ML

    Statistical inference for pairwise comparison models

    Authors: Ruijian Han, Wenlu Tang, Yiming Xu

    Abstract: Pairwise comparison models have been widely used for utility evaluation and ranking across various fields. The increasing scale of problems today underscores the need to understand statistical inference in these models when the number of subjects diverges, a topic currently lacking in the literature except in a few special instances. To partially address this gap, this paper establishes a near-opt…

    Submitted 2 April, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: 28 pages; include additional results

  20. arXiv:2312.07271  [pdf]

    cs.LG cs.AI stat.ML

    Analyze the Robustness of Classifiers under Label Noise

    Authors: Cheng Zeng, Yixuan Xu, Jiaqi Tian

    Abstract: This study explores the robustness of label noise classifiers, aiming to enhance model resilience against noisy data in complex real-world scenarios. Label noise in supervised learning, characterized by erroneous or imprecise labels, significantly impairs model performance. This research focuses on the increasingly pertinent issue of label noise's impact on practical applications. Addressing the p…

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: 21 pages, 11 figures

  21. arXiv:2312.03218  [pdf, other]

    cs.LG cs.CC stat.ML

    Accelerated Gradient Algorithms with Adaptive Subspace Search for Instance-Faster Optimization

    Authors: Yuanshi Liu, Hanzhen Zhao, Yang Xu, Pengyun Yue, Cong Fang

    Abstract: Gradient-based minimax optimal algorithms have greatly promoted the development of continuous optimization and machine learning. One seminal work due to Yurii Nesterov [Nes83a] established $\tilde{\mathcal{O}}(\sqrt{L/μ})$ gradient complexity for minimizing an $L$-smooth $μ$-strongly convex objective. However, an ideal algorithm would adapt to the explicit complexity of a particular objective func…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: Optimization for Machine Learning

  22. arXiv:2311.03382  [pdf, other]

    cs.IR cs.AI cs.LG stat.ME

    Causal Structure Representation Learning of Confounders in Latent Space for Recommendation

    Authors: Hangtong Xu, Yuanbo Xu, Yongjian Yang

    Abstract: Inferring user preferences from the historical feedback of users is a valuable problem in recommender systems. Conventional approaches often rely on the assumption that user preferences in the feedback data are equivalent to the real user preferences without additional noise, which simplifies the problem modeling. However, there are various confounders during user-item interactions, such as weathe…

    Submitted 2 November, 2023; originally announced November 2023.

  23. arXiv:2311.03381  [pdf, other]

    cs.IR cs.AI cs.LG stat.ME

    Separating and Learning Latent Confounders to Enhancing User Preferences Modeling

    Authors: Hangtong Xu, Yuanbo Xu, Yongjian Yang

    Abstract: Recommender models aim to capture user preferences from historical feedback and then predict user-specific feedback on candidate items. However, the presence of various unmeasured confounders causes deviations between the user preferences in the historical feedback and the true preferences, resulting in models not meeting their expected performance. Existing debias models either (1) specific to so…

    Submitted 2 April, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

    Comments: Accepted by DASFAA 2024

  24. arXiv:2310.16336  [pdf, other]

    cs.LG stat.ML

    SMURF-THP: Score Matching-based UnceRtainty quantiFication for Transformer Hawkes Process

    Authors: Zichong Li, Yanbo Xu, Simiao Zuo, Haoming Jiang, Chao Zhang, Tuo Zhao, Hongyuan Zha

    Abstract: Transformer Hawkes process models have shown to be successful in modeling event sequence data. However, most of the existing training methods rely on maximizing the likelihood of event sequences, which involves calculating some intractable integral. Moreover, the existing methods fail to provide uncertainty quantification for model predictions, e.g., confidence intervals for the predicted event's…

    Submitted 24 October, 2023; originally announced October 2023.

  25. arXiv:2310.16284  [pdf, other]

    stat.ME math.ST stat.CO

    Bayesian Image Mediation Analysis

    Authors: Yuliang Xu, Jian Kang

    Abstract: Mediation analysis aims to separate the indirect effect through mediators from the direct effect of the exposure on the outcome. It is challenging to perform mediation analysis with neuroimaging data which involves high dimensionality, complex spatial correlations, sparse activation patterns and relatively low signal-to-noise ratio. To address these issues, we develop a new spatially varying coeff…

    Submitted 24 October, 2023; originally announced October 2023.

  26. arXiv:2309.15983  [pdf, other]

    stat.ME econ.EM stat.AP

    What To Do (and Not to Do) with Causal Panel Analysis under Parallel Trends: Lessons from A Large Reanalysis Study

    Authors: Albert Chiu, Xingchen Lan, Ziyi Liu, Yiqing Xu

    Abstract: Two-way fixed effects (TWFE) models are ubiquitous in causal panel analysis in political science. However, recent methodological discussions challenge their validity in the presence of heterogeneous treatment effects (HTE) and violations of the parallel trends assumption (PTA). This burgeoning literature has introduced multiple estimators and diagnostics, leading to confusion among empirical resea…

    Submitted 14 June, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

  27. Deep Generative Imputation Model for Missing Not At Random Data

    Authors: Jialei Chen, Yuanbo Xu, Pengyang Wang, Yongjian Yang

    Abstract: Data analysis usually suffers from the Missing Not At Random (MNAR) problem, where the cause of the value missing is not fully observed. Compared to the naive Missing Completely At Random (MCAR) problem, it is more in line with the realistic scenario whereas more complex and challenging. Existing statistical methods model the MNAR mechanism by different decomposition of the joint distribution of t…

    Submitted 16 August, 2023; originally announced August 2023.

  28. arXiv:2307.07675  [pdf, other]

    cs.LG cs.IR stat.ML

    On the Robustness of Epoch-Greedy in Multi-Agent Contextual Bandit Mechanisms

    Authors: Yinglun Xu, Bhuvesh Kumar, Jacob Abernethy

    Abstract: Efficient learning in multi-armed bandit mechanisms such as pay-per-click (PPC) auctions typically involves three challenges: 1) inducing truthful bidding behavior (incentives), 2) using personalization in the users (context), and 3) circumventing manipulations in click patterns (corruptions). Each of these challenges has been studied orthogonally in the literature; incentives have been addressed…

    Submitted 14 July, 2023; originally announced July 2023.

  29. arXiv:2307.03821  [pdf, other]

    stat.ME

    Mediation Analysis with Graph Mediator

    Authors: Yixi Xu, Yi Zhao

    Abstract: This study introduces a mediation analysis framework when the mediator is a graph. A Gaussian covariance graph model is assumed for graph representation. Causal estimands and assumptions are discussed under this representation. With a covariance matrix as the mediator, parametric mediation models are imposed based on matrix decomposition. Assuming Gaussian random errors, likelihood-based estimator…

    Submitted 7 July, 2023; originally announced July 2023.

  30. arXiv:2306.14878  [pdf, other]

    cs.LG cs.CV stat.CO stat.ML

    Restart Sampling for Improving Generative Processes

    Authors: Yilun Xu, Mingyang Deng, Xiang Cheng, Yonglong Tian, Ziming Liu, Tommi Jaakkola

    Abstract: Generative processes that involve solving differential equations, such as diffusion models, frequently necessitate balancing speed and quality. ODE-based samplers are fast but plateau in performance while SDE-based samplers deliver higher sample quality at the cost of increased sampling time. We attribute this difference to sampling errors: ODE-samplers involve smaller discretization errors while…

    Submitted 1 November, 2023; v1 submitted 26 June, 2023; originally announced June 2023.

    Comments: Code is available at https://github.com/Newbeeer/diffusion_restart_sampling

  31. arXiv:2306.02826  [pdf, ps, other]

    quant-ph cs.AI cs.DS cs.LG stat.ML

    Near-Optimal Quantum Coreset Construction Algorithms for Clustering

    Authors: Yecheng Xue, Xiaoyu Chen, Tongyang Li, Shaofeng H. -C. Jiang

    Abstract: $k$-Clustering in $\mathbb{R}^d$ (e.g., $k$-median and $k$-means) is a fundamental machine learning problem. While near-linear time approximation algorithms were known in the classical setting for a dataset with cardinality $n$, it remains open to find sublinear-time quantum algorithms. We give quantum algorithms that find coresets for $k$-clustering in $\mathbb{R}^d$ with $\tilde{O}(\sqrt{nk}d^{3…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: 32 pages, 0 figures, 1 table. To appear in the Fortieth International Conference on Machine Learning (ICML 2023)

  32. arXiv:2306.02821  [pdf, other]

    math.ST stat.ME

    A unified analysis of likelihood-based estimators in the Plackett--Luce model

    Authors: Ruijian Han, Yiming Xu

    Abstract: The Plackett--Luce model is a popular approach for ranking data analysis, where a utility vector is employed to determine the probability of each outcome based on Luce's choice axiom. In this paper, we investigate the asymptotic theory of utility vector estimation by maximizing different types of likelihood, such as the full-, marginal-, and quasi-likelihood. We provide a rank-matching interpretat…

    Submitted 20 June, 2023; v1 submitted 5 June, 2023; originally announced June 2023.

    Comments: 42 pages, corrected typos, added the supplementary file containing all remaining proofs

  33. arXiv:2306.01675  [pdf, other]

    stat.ME physics.soc-ph

    Bayesian Segmentation Modeling of Epidemic Growth

    Authors: Tejasv Bedi, Yanxun Xu, Qiwei Li

    Abstract: Tracking the spread of infectious disease during a pandemic has posed a great challenge to the governments and health sectors on a global scale. To facilitate informed public health decision-making, the concerned parties usually rely on short-term daily and weekly projections generated via predictive modeling. Several deterministic and stochastic epidemiological models, including growth and compar…

    Submitted 2 June, 2023; originally announced June 2023.

  34. arXiv:2305.17083  [pdf, other]

    stat.ML cs.LG econ.EM math.ST stat.ME

    A Policy Gradient Method for Confounded POMDPs

    Authors: Mao Hong, Zhengling Qi, Yanxun Xu

    Abstract: In this paper, we propose a policy gradient method for confounded partially observable Markov decision processes (POMDPs) with continuous state and observation spaces in the offline setting. We first establish a novel identification result to non-parametrically estimate any history-dependent policy gradient under POMDPs using the offline data. The identification enables us to solve a sequence of c…

    Submitted 30 November, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: 95 pages, 3 figures

  35. arXiv:2305.16536  [pdf, ps, other]

    cs.LG stat.ML

    Which Features are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression

    Authors: Yihao Xue, Siddharth Joshi, Eric Gan, Pin-Yu Chen, Baharan Mirzasoleiman

    Abstract: Contrastive learning (CL) has emerged as a powerful technique for representation learning, with or without label supervision. However, supervised CL is prone to collapsing representations of subclasses within a class by not capturing all their features, and unsupervised CL may suppress harder class-relevant features by focusing on learning easy class-irrelevant features; both significantly comprom…

    Submitted 28 May, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: to appear at ICML 2023

  36. arXiv:2304.14549  [pdf, other]

    stat.AP

    Evaluating Racialized Economic Segregation in the Presence of Spatial Autocorrelation

    Authors: Yang Xu, Loni Philip Tabb

    Abstract: Research on residential segregation has been active since the 1950s and originated in a desire to quantify the level of racial/ethnic segregation in the United States. The Index of Concentration at the Extremes (ICE), an operationalization of racialized economic segregation that simultaneously captures spatial, racial, and income polarization, has been a popular topic in public health research, wi…

    Submitted 27 April, 2023; originally announced April 2023.

  37. arXiv:2303.13700  [pdf]

    physics.med-ph stat.AP

    Bayesian Reconstruction of Magnetic Resonance Images using Gaussian Processes

    Authors: Yihong Xu, Chad W. Farris, Stephan W. Anderson, Xin Zhang, Keith A. Brown

    Abstract: A central goal of modern magnetic resonance imaging (MRI) is to reduce the time required to produce high-quality images. Efforts have included hardware and software innovations such as parallel imaging, compressed sensing, and deep learning-based reconstruction. Here, we propose and demonstrate a Bayesian method to build statistical libraries of magnetic resonance (MR) images in k-space and use th…

    Submitted 23 March, 2023; originally announced March 2023.

  38. arXiv:2303.11399  [pdf, other]

    econ.EM stat.ME

    How Much Should We Trust Instrumental Variable Estimates in Political Science? Practical Advice Based on Over 60 Replicated Studies

    Authors: Apoorva Lal, Mac Lockhart, Yiqing Xu, Ziwen Zu

    Abstract: Instrumental variable (IV) strategies are widely used in political science to establish causal relationships. However, the identifying assumptions required by an IV design are demanding, and it remains challenging for researchers to assess their validity. In this paper, we replicate 67 papers published in three top journals in political science during 2010-2022 and identify several troubling patte…

    Submitted 7 November, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

    Comments: Forthcoming in Political Analysis. Appendix (supp.pdf) in archived zip

  39. arXiv:2303.10908  [pdf, other]

    stat.AP

    A Two-stage Bayesian Model for Assessing the Geography of Racialized Economic Segregation and Premature Mortality Across US Counties

    Authors: Yang Xu, Loni Philip Tabb

    Abstract: Racialized economic segregation, a key metric that simultaneously accounts for spatial, social and income polarization, has been linked to adverse health outcomes, including morbidity and mortality; however, statistical methods for measuring the association between racialized economic segregation and health outcomes are not well-developed and are usually studied at the individual level. In this pa…

    Submitted 20 March, 2023; originally announced March 2023.

    Comments: 30 pages, 5 figures

  40. arXiv:2303.06422  [pdf, other]

    stat.CO math.NA stat.AP

    An approximate control variates approach to multifidelity distribution estimation

    Authors: Ruijian Han, Boris Kramer, Dongjin Lee, Akil Narayan, Yiming Xu

    Abstract: Forward simulation-based uncertainty quantification that studies the distribution of quantities of interest (QoI) is a crucial component for computationally robust engineering design and prediction. There is a large body of literature devoted to accurately assessing statistics of QoIs, and in particular, multilevel or multifidelity approaches are known to be effective, leveraging cost-accuracy tra…

    Submitted 5 July, 2023; v1 submitted 11 March, 2023; originally announced March 2023.

    Comments: 41 pages, added additional numerical experiments

  41. arXiv:2303.02201  [pdf, other]

    stat.ME

    Causal Inference using Multivariate Generalized Linear Mixed-Effects Models with Longitudinal Data

    Authors: Yizhen Xu, Jisoo Kim, Laura K. Hummers, Ami A. Shah, Scott Zeger

    Abstract: Dynamic prediction of causal effects under different treatment regimes conditional on an individual's characteristics and longitudinal history is an essential problem in precision medicine. This is challenging in practice because outcomes and treatment assignment mechanisms are unknown in observational studies, an individual's treatment efficacy is a counterfactual, and the existence of selection…

    Submitted 3 March, 2023; originally announced March 2023.

    Comments: 19 pages, 5 figures

  42. arXiv:2302.14427  [pdf, other]

    stat.ML cs.LG

    Federated Covariate Shift Adaptation for Missing Target Output Values

    Authors: Yaqian Xu, Wenquan Cui, Jianjun Xu, Haoyang Cheng

    Abstract: The most recent multi-source covariate shift algorithm is an efficient hyperparameter optimization algorithm for missing target output. In this paper, we extend this algorithm to the framework of federated learning. For data islands in federated learning and covariate shift adaptation, we propose the federated domain adaptation estimate of the target risk which is asymptotically unbiased with a de…

    Submitted 28 February, 2023; originally announced February 2023.

  43. arXiv:2302.11647  [pdf, other]

    stat.ME stat.AP

    Patient stratification in multi-arm trials: a two-stage procedure with Bayesian profile regression

    Authors: Yuejia Xu, Angela M. Wood, Brian D. M. Tom

    Abstract: Precision medicine is an emerging field that takes into account individual heterogeneity to inform better clinical practice. In clinical trials, the evaluation of treatment effect heterogeneity is an important component, and recently, many statistical methods have been proposed for stratifying patients into different subgroups based on such heterogeneity. However, the majority of existing methods…

    Submitted 22 February, 2023; originally announced February 2023.

  44. arXiv:2302.11638  [pdf, other]

    stat.ME stat.AP

    Sequential Re-estimation Learning of Optimal Individualized Treatment Rules Among Ordinal Treatments with Application to Recommended Intervals Between Blood Donations

    Authors: Yuejia Xu, Angela M. Wood, David J. Roberts, Brian D. M. Tom

    Abstract: Personalized medicine has gained much popularity recently as a way of providing better healthcare by tailoring treatments to suit individuals. Our research, motivated by the UK INTERVAL blood donation trial, focuses on estimating the optimal individualized treatment rule (ITR) in the ordinal treatment-arms setting. Restrictions on minimum lengths between whole blood donations exist to safeguard do…

    Submitted 22 February, 2023; originally announced February 2023.

  45. arXiv:2302.10796  [pdf, ps, other]

    quant-ph cs.AI cs.LG stat.ML

    Provably Efficient Exploration in Quantum Reinforcement Learning with Logarithmic Worst-Case Regret

    Authors: Han Zhong, Jiachen Hu, Yecheng Xue, Tongyang Li, Liwei Wang

    Abstract: While quantum reinforcement learning (RL) has attracted a surge of attention recently, its theoretical understanding is limited. In particular, it remains elusive how to design provably efficient quantum RL algorithms that can address the exploration-exploitation trade-off. To this end, we propose a novel UCRL-style algorithm that takes advantage of quantum computing for tabular Markov decision pr…

    Submitted 13 June, 2024; v1 submitted 21 February, 2023; originally announced February 2023.

    Comments: ICML 2024

  46. arXiv:2302.00816  [pdf, other]

    stat.AP cs.CV

    Dynamic Atomic Column Detection in Transmission Electron Microscopy Videos via Ridge Estimation

    Authors: Yuchen Xu, Andrew M. Thomas, Peter A. Crozier, David S. Matteson

    Abstract: Ridge detection is a classical tool to extract curvilinear features in image processing. As such, it has great promise in applications to material science problems; specifically, for trend filtering relatively stable atom-shaped objects in image sequences, such as Transmission Electron Microscopy (TEM) videos. Standard analysis of TEM videos is limited to frame-by-frame object recognition. We inst…

    Submitted 1 February, 2023; originally announced February 2023.

    Comments: 27 pages, 11 figures

  47. arXiv:2301.09300  [pdf, other]

    stat.ML cs.LG

    A Tale of Two Latent Flows: Learning Latent Space Normalizing Flow with Short-run Langevin Flow for Approximate Inference

    Authors: Jianwen Xie, Yaxuan Zhu, Yifei Xu, Dingcheng Li, ** Li

    Abstract: We study a normalizing flow in the latent space of a top-down generator model, in which the normalizing flow model plays the role of the informative prior model of the generator. We propose to jointly learn the latent space normalizing flow prior model and the top-down generator model by a Markov chain Monte Carlo (MCMC)-based maximum likelihood algorithm, where a short-run Langevin sampling from…

    Submitted 23 January, 2023; originally announced January 2023.

    Comments: The Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI) 2023

  48. arXiv:2301.09231  [pdf, other]

    cs.LG stat.AP stat.ML

    GP-NAS-ensemble: a model for NAS Performance Prediction

    Authors: Kunlong Chen, Liu Yang, Yitian Chen, Kun** Chen, Yidan Xu, Lujun Li

    Abstract: It is of great significance to estimate the performance of a given model architecture without training in the application of Neural Architecture Search (NAS) as it may take a lot of time to evaluate the performance of an architecture. In this paper, a novel NAS framework called GP-NAS-ensemble is proposed to predict the performance of a neural network architecture with a small training dataset. We…

    Submitted 22 January, 2023; originally announced January 2023.

  49. arXiv:2301.02651  [pdf, other]

    eess.SY stat.ML

    A Robust Data-driven Process Modeling Applied to Time-series Stochastic Power Flow

    Authors: Pooja Algikar, Yijun Xu, Somayeh Yarahmadi, Lamine Mili

    Abstract: In this paper, we propose a robust data-driven process model whose hyperparameters are robustly estimated using the Schweppe-type generalized maximum likelihood estimator. The proposed model is trained on recorded time-series data of voltage phasors and power injections to perform a time-series stochastic power flow calculation. Power system data are often corrupted with outliers caused by large e…

    Submitted 6 January, 2023; originally announced January 2023.

    Comments: Submitted to the IEEE Transactions on Power Systems

  50. arXiv:2212.14468  [pdf, other]

    stat.ML cs.LG stat.ME

    An Instrumental Variable Approach to Confounded Off-Policy Evaluation

    Authors: Yang Xu, ** Zhu, Chengchun Shi, Shikai Luo, Rui Song

    Abstract: Off-policy evaluation (OPE) is a method for estimating the return of a target policy using some pre-collected observational data generated by a potentially different behavior policy. In some cases, there may be unmeasured variables that can confound the action-reward or action-next-state relationships, rendering many existing OPE approaches ineffective. This paper develops an instrumental variable…

    Submitted 2 February, 2023; v1 submitted 29 December, 2022; originally announced December 2022.