Showing 1–50 of 208 results for author: Xu, H

Searching in archive stat.
  1. arXiv:2406.15762  [pdf, other]

    cs.LG stat.ML

    Rethinking the Diffusion Models for Numerical Tabular Data Imputation from the Perspective of Wasserstein Gradient Flow

    Authors: Zhichao Chen, Haoxuan Li, Fangyikang Wang, Odin Zhang, Hu Xu, Xiaoyu Jiang, Zhihuan Song, Eric H. Wang

    Abstract: Diffusion models (DMs) have gained attention in Missing Data Imputation (MDI), but there remain two long-neglected issues to be addressed: (1). Inaccurate Imputation, which arises from inherently sample-diversification-pursuing generative process of DMs. (2). Difficult Training, which stems from intricate design required for the mask matrix in model training stage. To address these concerns within…

    Submitted 22 June, 2024; originally announced June 2024.

  2. arXiv:2406.00574  [pdf, other]

    math.OC math.ST stat.ML

    On the Convergence Rates of Set Membership Estimation of Linear Systems with Disturbances Bounded by General Convex Sets

    Authors: Haonan Xu, Yingying Li

    Abstract: This paper studies the uncertainty set estimation of system parameters of linear dynamical systems with bounded disturbances, which is motivated by robust (adaptive) constrained control. Departing from the confidence bounds of least square estimation from the machine-learning literature, this paper focuses on a method commonly used in (robust constrained) control literature: set membership estimat…

    Submitted 1 June, 2024; originally announced June 2024.

  3. arXiv:2405.15505  [pdf, other]

    cs.LG cs.AI stat.ML

    Revisiting Counterfactual Regression through the Lens of Gromov-Wasserstein Information Bottleneck

    Authors: Hao Yang, Zexu Sun, Hongteng Xu, Xu Chen

    Abstract: As a promising individualized treatment effect (ITE) estimation method, counterfactual regression (CFR) maps individuals' covariates to a latent space and predicts their counterfactual outcomes. However, the selection bias between control and treatment groups often imbalances the two groups' latent distributions and negatively impacts this method's performance. In this study, we revisit counterfac…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 19 pages

  4. arXiv:2405.11204  [pdf, other]

    cs.LG stat.ML

    Learning from Imperfect Human Feedback: a Tale from Corruption-Robust Dueling

    Authors: Yuwei Cheng, Fan Yao, Xuefeng Liu, Haifeng Xu

    Abstract: This paper studies Learning from Imperfect Human Feedback (LIHF), motivated by humans' potential irrationality or imperfect perception of true preference. We revisit the classic dueling bandit problem as a model of learning from comparative human feedback, and enrich it by casting the imperfection in human feedback as agnostic corruption to user utilities. We start by identifying the fundamental l…

    Submitted 18 May, 2024; originally announced May 2024.

  5. arXiv:2405.05730  [pdf, other]

    stat.ME

    Change point localisation and inference in fragmented functional data

    Authors: Gengyu Xue, Haotian Xu, Yi Yu

    Abstract: We study the problem of change point localisation and inference for sequentially collected fragmented functional data, where each curve is observed only over discrete grids randomly sampled over a short fragment. The sequence of underlying covariance functions is assumed to be piecewise constant, with changes happening at unknown time points. To localise the change points, we propose a computation…

    Submitted 9 May, 2024; originally announced May 2024.

  6. arXiv:2405.05459  [pdf, other]

    stat.ME math.ST

    Estimation and Inference for Change Points in Functional Regression Time Series

    Authors: Shivam Kumar, Haotian Xu, Haeran Cho, Daren Wang

    Abstract: In this paper, we study the estimation and inference of change points under a functional linear regression model with changes in the slope function. We present a novel Functional Regression Binary Segmentation (FRBS) algorithm which is computationally efficient as well as achieving consistency in multiple change point detection. This algorithm utilizes the predictive power of piece-wise constant f…

    Submitted 8 May, 2024; originally announced May 2024.

  7. arXiv:2404.04992  [pdf, other]

    cs.CV stat.AP

    Efficient Surgical Tool Recognition via HMM-Stabilized Deep Learning

    Authors: Haifeng Wang, Hao Xu, Jun Wang, Jian Zhou, Ke Deng

    Abstract: Recognizing various surgical tools, actions and phases from surgery videos is an important problem in computer vision with exciting clinical applications. Existing deep-learning-based methods for this problem either process each surgical video as a series of independent images without considering their dependence, or rely on complicated deep learning models to count for dependence of video frames.…

    Submitted 7 April, 2024; originally announced April 2024.

  8. arXiv:2404.03878  [pdf, other]

    stat.ME stat.ML

    Wasserstein F-tests for Fréchet regression on Bures-Wasserstein manifolds

    Authors: Haoshu Xu, Hongzhe Li

    Abstract: This paper considers the problem of regression analysis with random covariance matrix as outcome and Euclidean covariates in the framework of Fréchet regression on the Bures-Wasserstein manifold. Such regression problems have many applications in single cell genomics and neuroscience, where we have covariance matrix measured over a large set of samples. Fréchet regression on the Bures-Wasserstein…

    Submitted 5 April, 2024; originally announced April 2024.

  9. arXiv:2404.01273  [pdf, other]

    cs.LG cs.CL stat.ME

    TWIN-GPT: Digital Twins for Clinical Trials via Large Language Model

    Authors: Yue Wang, Tianfan Fu, Yinlong Xu, Zihan Ma, Hongxia Xu, Yingzhou Lu, Bang Du, Honghao Gao, Jian Wu

    Abstract: Clinical trials are indispensable for medical research and the development of new treatments. However, clinical trials often involve thousands of participants and can span several years to complete, with a high probability of failure during the process. Recently, there has been a burgeoning interest in virtual clinical trials, which simulate real-world scenarios and hold the potential to significa…

    Submitted 28 June, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

  10. arXiv:2404.00220  [pdf, other]

    stat.ML cs.LG

    Partially-Observable Sequential Change-Point Detection for Autocorrelated Data via Upper Confidence Region

    Authors: Haijie Xu, Xiaochen Xian, Chen Zhang, Kaibo Liu

    Abstract: Sequential change point detection for multivariate autocorrelated data is a very common problem in practice. However, when the sensing resources are limited, only a subset of variables from the multivariate system can be observed at each sensing time point. This raises the problem of partially observable multi-sensor sequential change point detection. For it, we propose a detection scheme called a…

    Submitted 29 March, 2024; originally announced April 2024.

  11. arXiv:2404.00218  [pdf, other]

    stat.ML cs.LG

    Functional-Edged Network Modeling

    Authors: Haijie Xu, Chen Zhang

    Abstract: In contrast with existing works, which all consider nodes as functions and use edges to represent the relationships between different functions, we target network modeling whose edges are functional data and transform the adjacency matrix into a functional adjacency tensor, introducing an additional dimension dedicated to function representation. Tucker functional decomposition is used for the fun…

    Submitted 29 March, 2024; originally announced April 2024.

  12. arXiv:2403.16665  [pdf, other]

    cs.DS cs.DM eess.SP stat.CO

    Adaptive Frequency Bin Interval in FFT via Dense Sampling Factor $α$

    Authors: Haichao Xu

    Abstract: The Fast Fourier Transform (FFT) is a fundamental tool for signal analysis, widely used across various fields. However, traditional FFT methods encounter challenges in adjusting the frequency bin interval, which may impede accurate spectral analysis. In this study, we propose a method for adjusting the frequency bin interval in FFT by introducing a parameter $α$. We elucidate the underlying princi…

    Submitted 26 March, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

  13. arXiv:2311.03382  [pdf, other]

    cs.IR cs.AI cs.LG stat.ME

    Causal Structure Representation Learning of Confounders in Latent Space for Recommendation

    Authors: Hangtong Xu, Yuanbo Xu, Yongjian Yang

    Abstract: Inferring user preferences from the historical feedback of users is a valuable problem in recommender systems. Conventional approaches often rely on the assumption that user preferences in the feedback data are equivalent to the real user preferences without additional noise, which simplifies the problem modeling. However, there are various confounders during user-item interactions, such as weathe…

    Submitted 2 November, 2023; originally announced November 2023.

  14. arXiv:2311.03381  [pdf, other]

    cs.IR cs.AI cs.LG stat.ME

    Separating and Learning Latent Confounders to Enhancing User Preferences Modeling

    Authors: Hangtong Xu, Yuanbo Xu, Yongjian Yang

    Abstract: Recommender models aim to capture user preferences from historical feedback and then predict user-specific feedback on candidate items. However, the presence of various unmeasured confounders causes deviations between the user preferences in the historical feedback and the true preferences, resulting in models not meeting their expected performance. Existing debias models either (1) specific to so…

    Submitted 2 April, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

    Comments: Accepted by DASFAA 2024

  15. arXiv:2309.13896  [pdf, other]

    cs.LG stat.ML

    Follow-ups Also Matter: Improving Contextual Bandits via Post-serving Contexts

    Authors: Chaoqi Wang, Ziyu Ye, Zhe Feng, Ashwinkumar Badanidiyuru, Haifeng Xu

    Abstract: Standard contextual bandit problem assumes that all the relevant contexts are observed before the algorithm chooses an arm. This modeling paradigm, while useful, often falls short when dealing with problems in which valuable additional context can be observed after arm selection. For example, content recommendation platforms like YouTube, Instagram, TikTok also observe valuable follow-up informati…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: NeurIPS 2023 (Spotlight)

  16. arXiv:2309.11120  [pdf, other]

    stat.ML cs.LG

    Ano-SuPs: Multi-size anomaly detection for manufactured products by identifying suspected patches

    Authors: Hao Xu, Juan Du, Andi Wang

    Abstract: Image-based systems have gained popularity owing to their capacity to provide rich manufacturing status information, low implementation costs and high acquisition rates. However, the complexity of the image background and various anomaly patterns pose new challenges to existing matrix decomposition methods, which are inadequate for modeling requirements. Moreover, the uncertainty of the anomaly ca…

    Submitted 20 September, 2023; originally announced September 2023.

    Comments: accepted oral presentation at the 18th INFORMS DMDA Workshop

  17. arXiv:2309.07435  [pdf, other]

    stat.ME

    Uncertainty Intervals for Prediction Errors in Time Series Forecasting

    Authors: Hui Xu, Song Mei, Stephen Bates, Jonathan Taylor, Robert Tibshirani

    Abstract: Inference for prediction errors is critical in time series forecasting pipelines. However, providing statistically meaningful uncertainty intervals for prediction errors remains relatively under-explored. Practitioners often resort to forward cross-validation (FCV) for obtaining point estimators and constructing confidence intervals based on the Central Limit Theorem (CLT). The naive version assum…

    Submitted 14 September, 2023; originally announced September 2023.

    Comments: 35 pages, 17 figures

  18. arXiv:2307.00712  [pdf]

    cs.LG cs.AI cs.IT stat.ML

    Worth of knowledge in deep learning

    Authors: Hao Xu, Yuntian Chen, Dongxiao Zhang

    Abstract: Knowledge constitutes the accumulated understanding and experience that humans use to gain insight into the world. In deep learning, prior knowledge is essential for mitigating shortcomings of data-driven models, such as data dependence, generalization ability, and compliance with constraints. To enable efficient evaluation of the worth of knowledge, we present a framework inspired by interpretabl…

    Submitted 2 July, 2023; originally announced July 2023.

  19. arXiv:2306.11154  [pdf, other]

    cs.GT econ.TH stat.AP

    A Truth Serum for Eliciting Self-Evaluations in Scientific Reviews

    Authors: Jibang Wu, Haifeng Xu, Yifan Guo, Weijie Su

    Abstract: This paper designs a simple, efficient and truthful mechanism to elicit self-evaluations about items jointly owned by owners. A key application of this mechanism is to improve the peer review of large scientific conferences where a paper often has multiple authors and many authors have multiple papers. Our mechanism is designed to generate an entirely new source of review data truthfully elicit…

    Submitted 13 February, 2024; v1 submitted 19 June, 2023; originally announced June 2023.

  20. arXiv:2304.01345  [pdf, other]

    q-bio.NC stat.ME

    Establishing group-level brain structural connectivity incorporating anatomical knowledge under latent space modeling

    Authors: Selena Wang, Yiting Wang, Frederick H. Xu, Li Shen, Yize Zhao

    Abstract: Brain structural connectivity, capturing the white matter fiber tracts among brain regions inferred by diffusion MRI (dMRI), provides a unique characterization of brain anatomical organization. One fundamental question to address with structural connectivity is how to properly summarize and perform statistical inference for a group-level connectivity architecture, for instance, under different sex…

    Submitted 21 February, 2023; originally announced April 2023.

  21. arXiv:2301.12065  [pdf, other]

    cs.LG stat.ML

    Decentralized Entropic Optimal Transport for Privacy-preserving Distributed Distribution Comparison

    Authors: Xiangfeng Wang, Hongteng Xu, Moyi Yang

    Abstract: Privacy-preserving distributed distribution comparison measures the distance between the distributions whose data are scattered across different agents in a distributed system and cannot be shared among the agents. In this study, we propose a novel decentralized entropic optimal transport (EOT) method, which provides a privacy-preserving and communication-efficient solution to this problem with th…

    Submitted 27 January, 2023; originally announced January 2023.

  22. arXiv:2301.11491  [pdf, other]

    math.ST stat.ME

    Change point detection and inference in multivariable nonparametric models under mixing conditions

    Authors: Carlos Misael Madrid Padilla, Haotian Xu, Daren Wang, Oscar Hernan Madrid Padilla, Yi Yu

    Abstract: This paper studies multivariate nonparametric change point localization and inference problems. The data consists of a multivariate time series with potentially short range dependence. The distribution of this data is assumed to be piecewise constant with densities in a Hölder class. The change points, or times at which the distribution changes, are unknown. We derive the limiting distributions of…

    Submitted 26 January, 2023; originally announced January 2023.

  23. arXiv:2212.06339  [pdf, other]

    cs.LG cs.CV stat.ML

    Regularized Optimal Transport Layers for Generalized Global Pooling Operations

    Authors: Hongteng Xu, Minjie Cheng

    Abstract: Global pooling is one of the most significant operations in many machine learning models and tasks, which works for information fusion and structured data (like sets and graphs) representation. However, without solid mathematical fundamentals, its practical implementations often depend on empirical mechanisms and thus lead to sub-optimal, even unsatisfactory performance. In this work, we develop a…

    Submitted 12 December, 2022; originally announced December 2022.

  24. arXiv:2211.14752  [pdf, other]

    cs.LG cs.NE stat.ML

    Differentiable Meta Multigraph Search with Partial Message Propagation on Heterogeneous Information Networks

    Authors: Chao Li, Hao Xu, Kun He

    Abstract: Heterogeneous information networks (HINs) are widely employed for describing real-world data with intricate entities and relationships. To automatically utilize their semantic information, graph neural architecture search has recently been developed on various tasks of HINs. Existing works, on the other hand, show weaknesses in instability and inflexibility. To address these issues, we propose a n…

    Submitted 27 November, 2022; originally announced November 2022.

    Comments: 12 pages, 7 figures, 8 tables, accepted by AAAI 2023 conference

  25. arXiv:2211.07817  [pdf, other]

    cs.LG stat.ML

    Multi-Player Bandits Robust to Adversarial Collisions

    Authors: Shivakumar Mahesh, Anshuka Rangi, Haifeng Xu, Long Tran-Thanh

    Abstract: Motivated by cognitive radios, stochastic Multi-Player Multi-Armed Bandits has been extensively studied in recent years. In this setting, each player pulls an arm, and receives a reward corresponding to the arm if there is no collision, namely the arm was selected by one single player. Otherwise, the player receives no reward if collision occurs. In this paper, we consider the presence of maliciou…

    Submitted 14 November, 2022; originally announced November 2022.

  26. arXiv:2210.12470  [pdf, ps, other]

    cs.LG stat.ML

    Learning Correlated Stackelberg Equilibrium in General-Sum Multi-Leader-Single-Follower Games

    Authors: Yaolong Yu, Haifeng Xu, Haipeng Chen

    Abstract: Many real-world strategic games involve interactions between multiple players. We study a hierarchical multi-player game structure, where players with asymmetric roles can be separated into leaders and followers, a setting often referred to as Stackelberg game or leader-follower game. In particular, we focus on a Stackelberg game scenario where there are multiple leaders and a single follower, cal…

    Submitted 22 October, 2022; originally announced October 2022.

  27. arXiv:2210.08964  [pdf, other]

    stat.ME cs.AI cs.CL cs.LG math.ST

    PromptCast: A New Prompt-based Learning Paradigm for Time Series Forecasting

    Authors: Hao Xue, Flora D. Salim

    Abstract: This paper presents a new perspective on time series forecasting. In existing time series forecasting methods, the models take a sequence of numerical values as input and yield numerical values as output. The existing SOTA models are largely based on the Transformer architecture, modified with multiple encoding mechanisms to incorporate the context and semantics around the historical data. Inspire…

    Submitted 10 December, 2023; v1 submitted 20 September, 2022; originally announced October 2022.

    Comments: TKDE Accepted Version

  28. arXiv:2209.08889  [pdf, other]

    stat.ME stat.AP stat.CO

    Inference of nonlinear causal effects with GWAS summary data

    Authors: Ben Dai, Chunlin Li, Haoran Xue, Wei Pan, Xiaotong Shen

    Abstract: Large-scale genome-wide association studies (GWAS) have offered an exciting opportunity to discover putative causal genes or risk factors associated with diseases by using SNPs as instrumental variables (IVs). However, conventional approaches assume linear causal relations partly for simplicity and partly for the availability of GWAS summary data. In this work, we propose a novel model for transc…

    Submitted 26 October, 2023; v1 submitted 19 September, 2022; originally announced September 2022.

    Comments: 33 pages, 11 figures

  29. arXiv:2209.05788  [pdf, other]

    stat.ME

    Empirical Bayes Multistage Testing for Large-Scale Experiments

    Authors: Hui Xu, Weinan Wang

    Abstract: Modern application of A/B tests is challenging due to its large scale in various dimensions, which demands flexibility to deal with multiple testing sequentially. The state-of-the-art practice first reduces the observed data stream to always-valid p-values, and then chooses a cut-off as in conventional multiple testing schemes. Here we propose an alternative method called AMSET (adaptive multistag…

    Submitted 13 September, 2022; originally announced September 2022.

  30. arXiv:2208.13663  [pdf, ps, other]

    cs.LG cs.CR stat.ML

    Understanding the Limits of Poisoning Attacks in Episodic Reinforcement Learning

    Authors: Anshuka Rangi, Haifeng Xu, Long Tran-Thanh, Massimo Franceschetti

    Abstract: To understand the security threats to reinforcement learning (RL) algorithms, this paper studies poisoning attacks to manipulate any order-optimal learning algorithm towards a targeted policy in episodic RL and examines the potential damage of two natural types of poisoning attacks, i.e., the manipulation of reward and action. We discover that the effect of attacks crucially d…

    Submitted 29 August, 2022; originally announced August 2022.

    Comments: Accepted at International Joint Conferences on Artificial Intelligence (IJCAI) 2022

  31. arXiv:2207.12453  [pdf, other]

    math.ST stat.ME

    Change point inference in high-dimensional regression models under temporal dependence

    Authors: Haotian Xu, Daren Wang, Zifeng Zhao, Yi Yu

    Abstract: This paper concerns about the limiting distributions of change point estimators, in a high-dimensional linear regression time series context, where a regression object $(y_t, X_t) \in \mathbb{R} \times \mathbb{R}^p$ is observed at every time point $t \in \{1, \ldots, n\}$. At unknown time points, called change points, the regression coefficients change, with the jump sizes measured in $\ell_2$-nor…

    Submitted 1 October, 2023; v1 submitted 25 July, 2022; originally announced July 2022.

  32. arXiv:2206.00769  [pdf, other]

    cs.LG cs.CR stat.ML

    Defense Against Gradient Leakage Attacks via Learning to Obscure Data

    Authors: Yuxuan Wan, Han Xu, Xiaorui Liu, Jie Ren, Wenqi Fan, Jiliang Tang

    Abstract: Federated learning is considered as an effective privacy-preserving learning mechanism that separates the client's data and model training process. However, federated learning is still under the risk of privacy leakage because of the existence of attackers who deliberately conduct gradient leakage attacks to reconstruct the client data. Recently, popular strategies such as gradient perturbation me…

    Submitted 1 June, 2022; originally announced June 2022.

    Comments: 13 pages, 2 figures

  33. arXiv:2205.15059  [pdf, other]

    cs.LG stat.ML

    Hilbert Curve Projection Distance for Distribution Comparison

    Authors: Tao Li, Cheng Meng, Hongteng Xu, Jun Yu

    Abstract: Distribution comparison plays a central role in many machine learning tasks like data classification and generative modeling. In this study, we propose a novel metric, called Hilbert curve projection (HCP) distance, to measure the distance between two probability distributions with low complexity. In particular, we first project two high-dimensional probability distributions using Hilbert curve to…

    Submitted 6 February, 2024; v1 submitted 30 May, 2022; originally announced May 2022.

    Comments: 33 pages, 11 figures

  34. arXiv:2205.13573  [pdf, other]

    cs.LG stat.ME stat.ML

    Efficient Approximation of Gromov-Wasserstein Distance Using Importance Sparsification

    Authors: Mengyu Li, Jun Yu, Hongteng Xu, Cheng Meng

    Abstract: As a valid metric of metric-measure spaces, Gromov-Wasserstein (GW) distance has shown the potential for matching problems of structured data like point clouds and graphs. However, its application in practice is limited due to the high computational complexity. To overcome this challenge, we propose a novel importance sparsification method, called Spar-GW, to approximate GW distance effic…

    Submitted 9 January, 2023; v1 submitted 26 May, 2022; originally announced May 2022.

  35. arXiv:2205.01849  [pdf, other]

    stat.ME

    Estimation of prediction error with known covariate shift

    Authors: Hui Xu, Robert Tibshirani

    Abstract: In supervised learning, the estimation of prediction error on unlabeled test data is an important task. Existing methods are usually built on the assumption that the training and test data are sampled from the same distribution, which is often violated in practice. As a result, traditional estimators like cross-validation (CV) will be biased and this may result in poor model selection. In this pap…

    Submitted 28 September, 2022; v1 submitted 3 May, 2022; originally announced May 2022.

  36. arXiv:2204.03588  [pdf, other]

    stat.AP

    Music Influence Modeling Based on Directed Network Model

    Authors: Xuan Zhang, Tingdi Ren, Lihong Wang, Haiyong Xu

    Abstract: Studying the history of music may provide a glimpse into the development of human creativity as we examine the evolutionary and revolutionary trends in music and genres. First, a musical influence metric was created to construct a directed network of musical influence. Second, we examined the revolutions and development of musical genres, modeled the similarity, and explored similarities and influ…

    Submitted 7 April, 2022; originally announced April 2022.

  37. arXiv:2201.13001  [pdf, other]

    cs.LG cs.AI cs.DS q-bio.NC stat.ML

    Deep Discriminative to Kernel Density Graph for In- and Out-of-distribution Calibrated Inference

    Authors: Jayanta Dey, Haoyin Xu, Will LeVine, Ashwin De Silva, Tyler M. Tomita, Ali Geisa, Tiffany Chu, Jacob Desman, Joshua T. Vogelstein

    Abstract: Deep discriminative approaches like random forests and deep neural networks have recently found applications in many important real-world scenarios. However, deploying these learning algorithms in safety-critical applications raises concerns, particularly when it comes to ensuring confidence calibration for both in-distribution and out-of-distribution data points. Many popular methods for in-distr…

    Submitted 7 June, 2024; v1 submitted 31 January, 2022; originally announced January 2022.

  38. arXiv:2110.14628  [pdf, ps, other]

    stat.ML cs.IT cs.LG

    (Almost) Free Incentivized Exploration from Decentralized Learning Agents

    Authors: Chengshuai Shi, Haifeng Xu, Wei Xiong, Cong Shen

    Abstract: Incentivized exploration in multi-armed bandits (MAB) has witnessed increasing interests and many progresses in recent years, where a principal offers bonuses to agents to do explorations on her behalf. However, almost all existing studies are confined to temporary myopic agents. In this work, we break this barrier and study incentivized exploration with multiple and long-term strategic agents, wh…

    Submitted 27 October, 2021; originally announced October 2021.

    Comments: Accepted to NeurIPS 2021, camera-ready version

  39. arXiv:2110.06450  [pdf, ps, other]

    stat.ME

    Online network change point detection with missing values and temporal dependence

    Authors: Haotian Xu, Paromita Dubey, Yi Yu

    Abstract: In this paper we study online change point detection in dynamic networks with time heterogeneous missing pattern within networks and dependence across the time course. The missingness probabilities, the entrywise sparsity of networks, the rank of networks and the jump size in terms of the Frobenius norm, are all allowed to vary as functions of the pre-change sample size. On top of a thorough handl…

    Submitted 26 May, 2023; v1 submitted 12 October, 2021; originally announced October 2021.

  40. arXiv:2109.14501  [pdf, other]

    stat.ML cs.AI cs.LG

    Towards a theory of out-of-distribution learning

    Authors: Jayanta Dey, Ali Geisa, Ronak Mehta, Tyler M. Tomita, Hayden S. Helm, Haoyin Xu, Eric Eaton, Jeffery Dick, Carey E. Priebe, Joshua T. Vogelstein

    Abstract: Learning is a process wherein a learning agent enhances its performance through exposure of experience or data. Throughout this journey, the agent may encounter diverse learning environments. For example, data may be presented to the learner all at once, in multiple batches, or sequentially. Furthermore, the distribution of each data sample could be either identical and independent (iid) or non-iid…

    Submitted 7 June, 2024; v1 submitted 29 September, 2021; originally announced September 2021.

  41. arXiv:2109.10869  [pdf, other]

    stat.OT

    Interactive Probing of Multivariate Time Series Prediction Models: A Case of Freight Rate Analysis

    Authors: Haonan Xu, Haotian Li, Yong Wang

    Abstract: We present an interactive probing tool to create, modify and analyze what-if scenarios for multivariate time series models. The solution is applied to freight trading, where analysts can carry out sensitivity analysis on freight rates by changing demand and supply-related econometric variables and observing their resultant effects on freight indexes. We utilize various visualization techniques to…

    Submitted 31 August, 2021; originally announced September 2021.

  42. arXiv:2108.13637  [pdf, other]

    cs.LG cs.AI q-bio.NC stat.ML

    When are Deep Networks really better than Decision Forests at small sample sizes, and how?

    Authors: Haoyin Xu, Kaleab A. Kinfu, Will LeVine, Sambit Panda, Jayanta Dey, Michael Ainsworth, Yu-Chung Peng, Madi Kusmanov, Florian Engert, Christopher M. White, Joshua T. Vogelstein, Carey E. Priebe

    Abstract: Deep networks and decision forests (such as random forests and gradient boosted trees) are the leading machine learning methods for structured and tabular data, respectively. Many papers have empirically compared large numbers of classifiers on one or two different domains (e.g., on 100 different tabular data settings). However, a careful conceptual and empirical comparison of these two strategies…

    Submitted 2 November, 2021; v1 submitted 31 August, 2021; originally announced August 2021.

  43. arXiv:2106.04800  [pdf, other]

    cs.SI cs.LG stat.ML

    Diffusion Source Identification on Networks with Statistical Confidence

    Authors: Quinlan Dawkins, Tianxi Li, Haifeng Xu

    Abstract: Diffusion source identification on networks is a problem of fundamental importance in a broad class of applications, including rumor controlling and virus identification. Though this problem has received significant recent attention, most studies have focused only on very restrictive settings and lack theoretical guarantees for more realistic networks. We introduce a statistical framework for the…

    Submitted 17 June, 2021; v1 submitted 9 June, 2021; originally announced June 2021.

  44. arXiv:2106.02040  [pdf, other]

    cs.RO cs.AI stat.ML

    Towards Learning to Play Piano with Dexterous Hands and Touch

    Authors: Huazhe Xu, Yuping Luo, Shaoxiong Wang, Trevor Darrell, Roberto Calandra

    Abstract: The virtuoso plays the piano with passion, poetry and extraordinary technical ability. As Liszt said (a virtuoso) must call up scent and blossom, and breathe the breath of life. The strongest robots that can play a piano are based on a combination of specialized robot hands/piano and hardcoded planning algorithms. In contrast to that, in this paper, we demonstrate how an agent can learn directly fr…

    Submitted 5 August, 2022; v1 submitted 3 June, 2021; originally announced June 2021.

  45. arXiv:2105.14647  [pdf, ps, other]

    stat.ME

    Orthogonal Subsampling for Big Data Linear Regression

    Authors: Lin Wang, Jake Elmstedt, Weng Kee Wong, Hongquan Xu

    Abstract: The dramatic growth of big datasets presents a new challenge to data storage and analysis. Data reduction, or subsampling, that extracts useful information from datasets is a crucial step in big data analysis. We propose an orthogonal subsampling (OSS) approach for big data with a focus on linear regression models. The approach is inspired by the fact that an orthogonal array of two levels provide…

    Submitted 30 May, 2021; originally announced May 2021.

  46. arXiv:2105.14244  [pdf, other]

    cs.LG cs.SI stat.ML

    Learning Graphon Autoencoders for Generative Graph Modeling

    Authors: Hongteng Xu, Peilin Zhao, Junzhou Huang, Dixin Luo

    Abstract: Graphon is a nonparametric model that generates graphs with arbitrary sizes and can be induced from graphs easily. Based on this model, we propose a novel algorithmic framework called \textit{graphon autoencoder} to build an interpretable and scalable graph generative model. This framework treats observed graphs as induced graphons in functional space and derives their latent representations by an…

    Submitted 29 May, 2021; originally announced May 2021.

  47. arXiv:2103.00567  [pdf, other]

    stat.ME math.ST

    Randomization Inference for Composite Experiments with Spillovers and Peer Effects

    Authors: Hui Xu, Guillaume Basse

    Abstract: Group-formation experiments, in which experimental units are randomly assigned to groups, are a powerful tool for studying peer effects in the social sciences. Existing design and analysis approaches allow researchers to draw inference from such experiments without relying on parametric assumptions. In practice, however, group-formation experiments are often coupled with a second, external interve…

    Submitted 28 February, 2021; originally announced March 2021.

  48. arXiv:2102.07711  [pdf, ps, other]

    cs.LG cs.AI cs.CR stat.ML

    Saving Stochastic Bandits from Poisoning Attacks via Limited Data Verification

    Authors: Anshuka Rangi, Long Tran-Thanh, Haifeng Xu, Massimo Franceschetti

    Abstract: We study bandit algorithms under data poisoning attacks in a bounded reward setting. We consider a strong attacker model in which the attacker can observe both the selected actions and their corresponding rewards and can contaminate the rewards with additive noise. We show that any bandit algorithm with regret $O(\log T)$ can be forced to suffer a regret $Ω(T)$ with an expected amount of contamina…

    Submitted 3 May, 2022; v1 submitted 15 February, 2021; originally announced February 2021.

    Comments: Accepted to AAAI 2022

  49. arXiv:2102.03794  [pdf, other]

    cs.LG cs.CV stat.ML

    A self-adaptive and robust fission clustering algorithm via heat diffusion and maximal turning angle

    Authors: Yu Han, Shizhan Lu, Haiyan Xu

    Abstract: Cluster analysis, which focuses on the grouping and categorization of similar elements, is widely used in various fields of research. A novel and fast clustering algorithm, the fission clustering algorithm, was proposed in recent years. In this article, we propose a robust fission clustering (RFC) algorithm and a self-adaptive noise identification method. The RFC and the self-adaptive noise identificati…

    Submitted 7 February, 2021; originally announced February 2021.

    Comments: 11 pages, 8 figures

  50. arXiv:2102.02741  [pdf, other]

    cs.LG stat.ML

    Hawkes Processes on Graphons

    Authors: Hongteng Xu, Dixin Luo, Hongyuan Zha

    Abstract: We propose a novel framework for modeling multiple multivariate point processes, each with heterogeneous event types that share an underlying space and obey the same generative mechanism. Focusing on Hawkes processes and their variants that are associated with Granger causality graphs, our model leverages an uncountable event type space and samples the graphs with different sizes from a nonparamet…

    Submitted 4 February, 2021; originally announced February 2021.