
Showing 1–50 of 139 results for author: Hu, Y

Searching in archive stat.
  1. arXiv:2407.01079  [pdf, ps, other]

    stat.ML cs.AI cs.LG

    On Statistical Rates and Provably Efficient Criteria of Latent Diffusion Transformers (DiTs)

    Authors: Jerry Yao-Chieh Hu, Weimin Wu, Zhuoru Li, Zhao Song, Han Liu

    Abstract: We investigate the statistical and computational limits of latent Diffusion Transformers (DiTs) under the low-dimensional linear latent space assumption. Statistically, we study the universal approximation and sample complexity of the DiTs score function, as well as the distribution recovery property of the initial data. Specifically, under mild data assumptions, we deri…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2406.19049  [pdf, other]

    cs.LG cs.AI stat.ML

    Accuracy on the wrong line: On the pitfalls of noisy data for out-of-distribution generalisation

    Authors: Amartya Sanyal, Yaxi Hu, Yaodong Yu, Yian Ma, Yixin Wang, Bernhard Schölkopf

    Abstract: "Accuracy-on-the-line" is a widely observed phenomenon in machine learning, where a model's accuracy on in-distribution (ID) and out-of-distribution (OOD) data is positively correlated across different hyperparameters and data configurations. But when does this useful relationship break down? In this work, we explore its robustness. The key observation is that noisy data and the presence of nuisan…

    Submitted 27 June, 2024; originally announced June 2024.

  3. arXiv:2406.14380  [pdf, other]

    econ.EM cs.LG stat.ME

    Estimating Treatment Effects under Recommender Interference: A Structured Neural Networks Approach

    Authors: Ruohan Zhan, Shichao Han, Yuchen Hu, Zhenling Jiang

    Abstract: Recommender systems are essential for content-sharing platforms by curating personalized content. To evaluate updates to recommender systems targeting content creators, platforms frequently rely on creator-side randomized experiments. The treatment effect measures the change in outcomes when a new algorithm is implemented compared to the status quo. We show that the standard difference-in-means es…

    Submitted 3 July, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

  4. arXiv:2406.03136  [pdf, ps, other]

    cs.LG cs.AI cs.CC stat.ML

    Computational Limits of Low-Rank Adaptation (LoRA) for Transformer-Based Models

    Authors: Jerry Yao-Chieh Hu, Maojiang Su, En-Jui Kuo, Zhao Song, Han Liu

    Abstract: We study the computational limits of Low-Rank Adaptation (LoRA) update for finetuning transformer-based models using fine-grained complexity theory. Our key observation is that the existence of low-rank decompositions within the gradient computation of LoRA adaptation leads to possible algorithmic speedup. This allows us to (i) identify a phase transition behavior and (ii) prove the existence of n…

    Submitted 5 June, 2024; originally announced June 2024.

  5. arXiv:2406.01575  [pdf, other]

    math.OC cs.AI cs.LG stat.ML

    Stochastic Bilevel Optimization with Lower-Level Contextual Markov Decision Processes

    Authors: Vinzenz Thoma, Barna Pasztor, Andreas Krause, Giorgia Ramponi, Yifan Hu

    Abstract: In various applications, the optimal policy in a strategic decision-making problem depends both on the environmental configuration and exogenous events. For these settings, we introduce Bilevel Optimization with Contextual Markov Decision Processes (BO-CMDP), a stochastic bilevel decision-making model, where the lower level consists of solving a contextual Markov Decision Process (CMDP). BO-CMDP c…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 54 pages, 18 Figures

  6. arXiv:2405.19463  [pdf, other]

    stat.ML cs.LG econ.EM math.OC

    Stochastic Optimization Algorithms for Instrumental Variable Regression with Streaming Data

    Authors: Xuxing Chen, Abhishek Roy, Yifan Hu, Krishnakumar Balasubramanian

    Abstract: We develop and analyze algorithms for instrumental variable regression by viewing the problem as a conditional stochastic optimization problem. In the context of least-squares instrumental variable regression, our algorithms neither require matrix inversions nor mini-batches and provide a fully online approach for performing instrumental variable regression with streaming data. When the true mode…

    Submitted 29 May, 2024; originally announced May 2024.

  7. arXiv:2405.16564  [pdf, ps, other]

    stat.ML cs.LG stat.ME

    Contextual Linear Optimization with Bandit Feedback

    Authors: Yichun Hu, Nathan Kallus, Xiaojie Mao, Yanchen Wu

    Abstract: Contextual linear optimization (CLO) uses predictive observations to reduce uncertainty in random cost coefficients and thereby improve average-cost performance. An example is a stochastic shortest path with random edge costs (e.g., traffic) and predictive features (e.g., lagged traffic, weather). Existing work on CLO assumes the data has fully observed cost coefficient vectors, but in many applic…

    Submitted 26 May, 2024; originally announced May 2024.

  8. arXiv:2404.03900  [pdf, other]

    stat.ML cs.AI cs.LG cs.NE

    Nonparametric Modern Hopfield Models

    Authors: Jerry Yao-Chieh Hu, Bo-Yu Chen, Dennis Wu, Feng Ruan, Han Liu

    Abstract: We present a nonparametric construction for deep learning compatible modern Hopfield models and utilize this framework to debut an efficient variant. Our key contribution stems from interpreting the memory storage and retrieval processes in modern Hopfield models as a nonparametric regression problem subject to a set of query-memory pairs. Crucially, our framework not only recovers the known resul…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: 59 pages; Code available at https://github.com/MAGICS-LAB/NonparametricHopfield

  9. arXiv:2404.03830  [pdf, other]

    cs.LG cs.AI stat.ML

    BiSHop: Bi-Directional Cellular Learning for Tabular Data with Generalized Sparse Modern Hopfield Model

    Authors: Chenwei Xu, Yu-Chao Huang, Jerry Yao-Chieh Hu, Weijian Li, Ammar Gilani, Hsi-Sheng Goan, Han Liu

    Abstract: We introduce the Bi-Directional Sparse Hopfield Network (BiSHop), a novel end-to-end framework for deep tabular learning. BiSHop handles the two major challenges of deep tabular learning: non-rotationally invariant data structure and feature sparsity in tabular data. Our key motivation comes from the recent established connection between associative memory and a…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: 40 page; Code available at https://github.com/MAGICS-LAB/BiSHop

  10. arXiv:2404.03828  [pdf, other]

    cs.LG cs.AI stat.ML

    Outlier-Efficient Hopfield Layers for Large Transformer-Based Models

    Authors: Jerry Yao-Chieh Hu, Pei-Hsuan Chang, Robin Luo, Hong-Yu Chen, Weijian Li, Wei-Po Wang, Han Liu

    Abstract: We introduce an Outlier-Efficient Modern Hopfield Model (termed $\mathrm{OutEffHop}$) and use it to address the outlier inefficiency problem of training gigantic transformer-based models. Our main contribution is a novel associative memory model facilitating outlier-efficient associative memory retrievals. Interestingly, this memory model manifests a model-based interpretation of an out…

    Submitted 26 June, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: Accepted at ICML 2024; v2 updated to camera-ready version; Code available at https://github.com/MAGICS-LAB/OutEffHop; Models are on Hugging Face: https://huggingface.co/collections/magicslabnu/outeffhop-6610fcede8d2cda23009a98f

  11. arXiv:2404.03827  [pdf, other]

    cs.LG cs.AI stat.ML

    Uniform Memory Retrieval with Larger Capacity for Modern Hopfield Models

    Authors: Dennis Wu, Jerry Yao-Chieh Hu, Teng-Yun Hsiao, Han Liu

    Abstract: We propose a two-stage memory retrieval dynamics for modern Hopfield models, termed $\mathtt{U\text{-}Hop}$, with enhanced memory capacity. Our key contribution is a learnable feature map $\Phi$ which transforms the Hopfield energy function into kernel space. This transformation ensures convergence between the local minima of energy and the fixed points of retrieval dynamics within the kernel space.…

    Submitted 12 June, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: Accepted at ICML 2024; v2 updated to camera-ready version; Code available at https://github.com/MAGICS-LAB/UHop

  12. arXiv:2403.13041  [pdf, other]

    cs.CR cs.AI cs.LG stat.ML

    Provable Privacy with Non-Private Pre-Processing

    Authors: Yaxi Hu, Amartya Sanyal, Bernhard Schölkopf

    Abstract: When analysing Differentially Private (DP) machine learning pipelines, the potential privacy cost of data-dependent pre-processing is frequently overlooked in privacy accounting. In this work, we propose a general framework to evaluate the additional privacy cost incurred by non-private data-dependent pre-processing algorithms. Our framework establishes upper bounds on the overall privacy guarante…

    Submitted 21 June, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

  13. arXiv:2403.01386  [pdf, other]

    stat.ME econ.EM

    Minimax-Regret Sample Selection in Randomized Experiments

    Authors: Yuchen Hu, Henry Zhu, Emma Brunskill, Stefan Wager

    Abstract: Randomized controlled trials are often run in settings with many subpopulations that may have differential benefits from the treatment being evaluated. We consider the problem of sample selection, i.e., whom to enroll in a randomized trial, such as to optimize welfare in a heterogeneous population. We formalize this problem within the minimax-regret framework, and derive optimal sample-selection s…

    Submitted 25 June, 2024; v1 submitted 2 March, 2024; originally announced March 2024.

  14. arXiv:2402.04520  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis

    Authors: Jerry Yao-Chieh Hu, Thomas Lin, Zhao Song, Han Liu

    Abstract: We investigate the computational limits of the memory retrieval dynamics of modern Hopfield models from the fine-grained complexity analysis. Our key contribution is the characterization of a phase transition behavior in the efficiency of all possible modern Hopfield models based on the norm of patterns. Specifically, we establish an upper bound criterion for the norm of input query patterns and m…

    Submitted 31 May, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: Accepted at ICML 2024; v2 corrected typos; v3 added clarifications and references; v4,5 updated to camera-ready version

  15. arXiv:2401.00521  [pdf, other]

    cs.LG cs.AI stat.AP

    Multi-spatial Multi-temporal Air Quality Forecasting with Integrated Monitoring and Reanalysis Data

    Authors: Yuxiao Hu, Qian Li, Xiaodan Shi, Jinyue Yan, Yuntian Chen

    Abstract: Accurate air quality forecasting is crucial for public health, environmental monitoring and protection, and urban planning. However, existing methods fail to effectively utilize multi-scale information, both spatially and temporally. Spatially, there is a lack of integration between individual monitoring stations and city-wide scales. Temporally, the periodic nature of air quality variations is of…

    Submitted 31 December, 2023; originally announced January 2024.

  16. arXiv:2312.17346  [pdf, other]

    cs.LG cs.AI cs.CV cs.NE stat.ML

    STanHop: Sparse Tandem Hopfield Model for Memory-Enhanced Time Series Prediction

    Authors: Dennis Wu, Jerry Yao-Chieh Hu, Weijian Li, Bo-Yu Chen, Han Liu

    Abstract: We present STanHop-Net (Sparse Tandem Hopfield Network) for multivariate time series prediction with memory-enhanced capabilities. At the heart of our approach is STanHop, a novel Hopfield-based neural network block, which sparsely learns and stores both temporal and cross-series representations in a data-dependent fashion. In essence, STanHop sequentially learns temporal representation and cross-s…

    Submitted 28 December, 2023; originally announced December 2023.

  17. arXiv:2312.13482  [pdf, other]

    stat.ME

    Spatially Adaptive Variable Screening in Presurgical fMRI Data Analysis

    Authors: Yifei Hu, Xinge Jessie Jeng

    Abstract: Accurate delineation of tumor-adjacent functional brain regions is essential for planning function-preserving neurosurgery. Functional magnetic resonance imaging (fMRI) is increasingly used for presurgical counseling and planning. When analyzing presurgical fMRI data, false negatives are more dangerous to the patients than false positives because patients are more likely to experience significant…

    Submitted 20 December, 2023; originally announced December 2023.

  18. arXiv:2312.10796  [pdf, other]

    stat.ME math.ST

    Two sample test for covariance matrices in ultra-high dimension

    Authors: Xiucai Ding, Yichen Hu, Zhenggang Wang

    Abstract: In this paper, we propose a new test for testing the equality of two population covariance matrices in the ultra-high dimensional setting that the dimension is much larger than the sizes of both of the two samples. Our proposed methodology relies on a data splitting procedure and a comparison of a set of well selected eigenvalues of the sample covariance matrices on the split data sets. Compared t…

    Submitted 17 December, 2023; originally announced December 2023.

    Comments: 43 pages, 1 figure

  19. arXiv:2312.07067  [pdf, other]

    cs.LG cs.CR cs.CV stat.AP

    Focus on Hiders: Exploring Hidden Threats for Enhancing Adversarial Training

    Authors: Qian Li, Yuxiao Hu, Yinpeng Dong, Dongxiao Zhang, Yuntian Chen

    Abstract: Adversarial training is often formulated as a min-max problem, however, concentrating only on the worst adversarial examples causes alternating repetitive confusion of the model, i.e., previously defended or correctly classified samples are not defensible or accurately classifiable in subsequent adversarial training. We characterize such non-ignorable samples as "hiders", which reveal the hidden h…

    Submitted 12 December, 2023; originally announced December 2023.

  20. arXiv:2312.05758  [pdf, other]

    cs.LG stat.AP

    CLeaRForecast: Contrastive Learning of High-Purity Representations for Time Series Forecasting

    Authors: Jiaxin Gao, Yuxiao Hu, Qinglong Cao, Siqi Dai, Yuntian Chen

    Abstract: Time series forecasting (TSF) holds significant importance in modern society, spanning numerous domains. Previous representation learning-based TSF algorithms typically embrace a contrastive learning paradigm featuring segregated trend-periodicity representations. Yet, these methodologies disregard the inherent high-impact noise embedded within time series data, resulting in representation inaccur…

    Submitted 9 December, 2023; originally announced December 2023.

  21. arXiv:2309.12673  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    On Sparse Modern Hopfield Model

    Authors: Jerry Yao-Chieh Hu, Donglin Yang, Dennis Wu, Chenwei Xu, Bo-Yu Chen, Han Liu

    Abstract: We introduce the sparse modern Hopfield model as a sparse extension of the modern Hopfield model. Like its dense counterpart, the sparse modern Hopfield model equips a memory-retrieval dynamics whose one-step approximation corresponds to the sparse attention mechanism. Theoretically, our key contribution is a principled derivation of a closed-form sparse Hopfield energy using the convex conjugate…

    Submitted 29 November, 2023; v1 submitted 22 September, 2023; originally announced September 2023.

    Comments: 37 pages, accepted at NeurIPS 2023. [v2] updated to match with camera-ready version. Code is available at https://github.com/MAGICS-LAB/SparseModernHopfield

  22. arXiv:2309.08429  [pdf, other]

    eess.SP stat.ML

    IHT-Inspired Neural Network for Single-Snapshot DOA Estimation with Sparse Linear Arrays

    Authors: Yunqiao Hu, Shunqiao Sun

    Abstract: Single-snapshot direction-of-arrival (DOA) estimation using sparse linear arrays (SLAs) has gained significant attention in the field of automotive MIMO radars. This is due to the dynamic nature of automotive settings, where multiple snapshots aren't accessible, and the importance of minimizing hardware costs. Low-rank Hankel matrix completion has been proposed to interpolate the missing elements…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: 5 pages, 5 figures

  23. arXiv:2309.02236  [pdf, other]

    cs.LG cs.AI stat.ML

    Distributionally Robust Model-based Reinforcement Learning with Large State Spaces

    Authors: Shyam Sundhar Ramesh, Pier Giuseppe Sessa, Yifan Hu, Andreas Krause, Ilija Bogunovic

    Abstract: Three major challenges in reinforcement learning are the complex dynamical systems with large state spaces, the costly data acquisition processes, and the deviation of real-world dynamics from the training environment deployment. To overcome these issues, we study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and…

    Submitted 5 September, 2023; originally announced September 2023.

    Journal ref: AISTATS 2024

  24. arXiv:2306.15616  [pdf, other]

    stat.ME

    Network-Adjusted Covariates for Community Detection

    Authors: Yaofang Hu, Wanjie Wang

    Abstract: Community detection is a crucial task in network analysis that can be significantly improved by incorporating subject-level information, i.e. covariates. However, current methods often struggle with selecting tuning parameters and analyzing low-degree nodes. In this paper, we introduce a novel method that addresses these challenges by constructing network-adjusted covariates, which leverage the ne…

    Submitted 11 February, 2024; v1 submitted 27 June, 2023; originally announced June 2023.

    Comments: 48 pages

    MSC Class: 91D30; 62F12; 91C20

  25. arXiv:2306.06252  [pdf, other]

    cs.LG stat.ML

    Feature Programming for Multivariate Time Series Prediction

    Authors: Alex Reneau, Jerry Yao-Chieh Hu, Chenwei Xu, Weijian Li, Ammar Gilani, Han Liu

    Abstract: We introduce the concept of programmable feature engineering for time series modeling and propose a feature programming framework. This framework generates large amounts of predictive features for noisy multivariate time series while allowing users to incorporate their inductive bias with minimal effort. The key motivation of our framework is to view any multivariate time series as a cumulative su…

    Submitted 9 June, 2023; originally announced June 2023.

    Comments: 21 pages, accepted to ICML2023. Code is available at https://github.com/SirAlex900/FeatureProgramming

  26. arXiv:2306.03962  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    PILLAR: How to make semi-private learning more effective

    Authors: Francesco Pinto, Yaxi Hu, Fanny Yang, Amartya Sanyal

    Abstract: In Semi-Supervised Semi-Private (SP) learning, the learner has access to both public unlabelled and private labelled data. We propose a computationally efficient algorithm that, under mild assumptions on the data, provably achieves significantly lower private labelled sample complexity and can be efficiently run on real-world datasets. For this purpose, we leverage the features extracted by networ…

    Submitted 6 June, 2023; originally announced June 2023.

  27. arXiv:2303.01887  [pdf, other]

    econ.EM stat.AP

    Fast Forecasting of Unstable Data Streams for On-Demand Service Platforms

    Authors: Yu Jeffrey Hu, Jeroen Rombouts, Ines Wilms

    Abstract: On-demand service platforms face a challenging problem of forecasting a large collection of high-frequency regional demand data streams that exhibit instabilities. This paper develops a novel forecast framework that is fast and scalable, and automatically assesses changing environments without human intervention. We empirically test our framework on a large-scale demand data set from a leading on-…

    Submitted 31 May, 2024; v1 submitted 3 March, 2023; originally announced March 2023.

  28. arXiv:2302.05516  [pdf, other]

    stat.ML cs.LG math.OC

    Cyclic and Randomized Stepsizes Invoke Heavier Tails in SGD than Constant Stepsize

    Authors: Mert Gürbüzbalaban, Yuanhan Hu, Umut Şimşekli, Lingjiong Zhu

    Abstract: Cyclic and randomized stepsizes are widely used in the deep learning practice and can often outperform standard stepsize choices such as constant stepsize in SGD. Despite their empirical success, not much is currently known about when and why they can theoretically improve the generalization performance. We consider a general class of Markovian stepsizes for learning, which contain i.i.d. random s…

    Submitted 29 August, 2023; v1 submitted 10 February, 2023; originally announced February 2023.

    Comments: To Appear

    Journal ref: Transactions of Machine Learning Research, 2023

  29. arXiv:2212.13574  [pdf, other]

    stat.ME

    Weak Signal Inclusion Under Dependence and Applications in Genome-wide Association Study

    Authors: X. Jessie Jeng, Yifei Hu, Quan Sun, Yun Li

    Abstract: Motivated by the inquiries of weak signals in underpowered genome-wide association studies (GWASs), we consider the problem of retaining true signals that are not strong enough to be individually separable from a large amount of noise. We address the challenge from the perspective of false negative control and present false negative control (FNC) screening, a data-driven method to efficiently regu…

    Submitted 2 February, 2024; v1 submitted 27 December, 2022; originally announced December 2022.

    Comments: arXiv admin note: text overlap with arXiv:2006.15667

  30. arXiv:2212.09996  [pdf, other]

    stat.ME stat.AP

    A marginalized three-part interrupted time series regression model for proportional data

    Authors: Shangyuan Ye, Maricela Cruz, Yuchen Hu, Yun Yu

    Abstract: Interrupted time series (ITS) is often used to evaluate the effectiveness of a health policy intervention that accounts for the temporal dependence of outcomes. When the outcome of interest is a percentage or percentile, the data can be highly skewed, bounded in $[0, 1]$, and have many zeros or ones. A three-part Beta regression model is commonly used to separate zeros, ones, and positive values e…

    Submitted 19 December, 2022; originally announced December 2022.

  31. arXiv:2212.02585  [pdf, ps, other]

    econ.EM stat.ML

    Identification of Unobservables in Observations

    Authors: Yingyao Hu

    Abstract: In empirical studies, the data usually don't include all the variables of interest in an economic model. This paper shows the identification of unobserved variables in observations at the population level. When the observables are distinct in each observation, there exists a function mapping from the observables to the unobservables. Such a function guarantees the uniqueness of the latent value in…

    Submitted 5 December, 2022; originally announced December 2022.

  32. arXiv:2212.00570  [pdf, other]

    stat.ML cs.LG stat.CO

    Penalized Overdamped and Underdamped Langevin Monte Carlo Algorithms for Constrained Sampling

    Authors: Mert Gürbüzbalaban, Yuanhan Hu, Lingjiong Zhu

    Abstract: We consider the constrained sampling problem where the goal is to sample from a target distribution $\pi(x)\propto e^{-f(x)}$ when $x$ is constrained to lie on a convex body $\mathcal{C}$. Motivated by penalty methods from continuous optimization, we propose penalized Langevin Dynamics (PLD) and penalized underdamped Langevin Monte Carlo (PULMC) methods that convert the constrained sampling problem…

    Submitted 14 April, 2024; v1 submitted 29 November, 2022; originally announced December 2022.

  33. arXiv:2211.15762  [pdf, other]

    cs.LG stat.ML

    Understanding the Impact of Adversarial Robustness on Accuracy Disparity

    Authors: Yuzheng Hu, Fan Wu, Hongyang Zhang, Han Zhao

    Abstract: While it has long been empirically observed that adversarial robustness may be at odds with standard accuracy and may have further disparate impacts on different classes, it remains an open question to what extent such observations hold and how the class imbalance plays a role within. In this paper, we attempt to understand this question of accuracy disparity by taking a closer look at linear clas…

    Submitted 28 May, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

    Comments: Accepted at ICML 2023

  34. arXiv:2210.13965  [pdf]

    cs.LG stat.AP

    Exploring the impact of weather on Metro demand forecasting using machine learning method

    Authors: Yiming Hu, Yangchuan Huang, Shuying Liu, Yuanyang Qi, Danhui Bai

    Abstract: Urban rail transit provides significant comprehensive benefits such as large traffic volume and high speed, serving as one of the most important components of urban traffic construction management and congestion solution. Using real passenger flow data of an Asian subway system from April to June of 2018, this work analyzes the space-time distribution of the passenger flow using short-term traffic…

    Submitted 4 May, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

    Comments: 16 pages, 4 figures

  35. arXiv:2210.01300  [pdf, other]

    stat.ML cs.LG econ.EM

    Revealing Unobservables by Deep Learning: Generative Element Extraction Networks (GEEN)

    Authors: Yingyao Hu, Yang Liu, Jiaxiong Yao

    Abstract: Latent variable models are crucial in scientific research, where a key variable, such as effort, ability, and belief, is unobserved in the sample but needs to be identified. This paper proposes a novel method for estimating realizations of a latent variable $X^*$ in a random sample that contains its multiple measurements. With the key assumption that the measurements are independent conditional on…

    Submitted 3 October, 2022; originally announced October 2022.

    Comments: 19 pages, 6 figures

  36. arXiv:2209.00197  [pdf, other]

    stat.ME econ.EM

    Switchback Experiments under Geometric Mixing

    Authors: Yuchen Hu, Stefan Wager

    Abstract: The switchback is an experimental design that measures treatment effects by repeatedly turning an intervention on and off for a whole system. Switchback experiments are a robust way to overcome cross-unit spillover effects; however, they are vulnerable to bias from temporal carryovers. In this paper, we consider properties of switchback experiments in Markovian systems that mix at a geometric rate…

    Submitted 2 April, 2024; v1 submitted 31 August, 2022; originally announced September 2022.

  37. arXiv:2208.00257   

    stat.ME stat.AP

    Covariate-Assisted Community Detection on Sparse Networks

    Authors: Yaofang Hu, Wanjie Wang

    Abstract: Community detection is an important problem when processing network data. Traditionally, this is done by exploiting the connections between nodes, but connections can be too sparse to detect communities in many real datasets. Node covariates can be used to assist community detection; see Binkiewicz et al. (2017); Weng and Feng (2022); Yan and Sarkar (2021); Yang et al. (2013). However, how to comb…

    Submitted 27 June, 2023; v1 submitted 30 July, 2022; originally announced August 2022.

    Comments: The theory and algorithm are developed very differently, and so a new submission is in 2306.15616

    MSC Class: 91D30; 62F12; 91C20

  38. arXiv:2206.03985  [pdf, other]

    cs.LG cs.CR stat.ML

    How unfair is private learning?

    Authors: Amartya Sanyal, Yaxi Hu, Fanny Yang

    Abstract: As machine learning algorithms are deployed on sensitive data in critical decision making processes, it is becoming increasingly important that they are also private and fair. In this paper, we show that, when the data has a long-tailed structure, it is not possible to build accurate learning algorithms that are both private and results in higher accuracy on minority subpopulations. We further sho…

    Submitted 24 December, 2022; v1 submitted 8 June, 2022; originally announced June 2022.

    Comments: Accepted as an Oral paper in UAI '2022, Major update on 23 Dec, 2022

  39. arXiv:2205.14278  [pdf, ps, other]

    math.OC cs.LG stat.ML

    Generalization Bounds of Nonconvex-(Strongly)-Concave Stochastic Minimax Optimization

    Authors: Siqi Zhang, Yifan Hu, Liang Zhang, Niao He

    Abstract: This paper takes an initial step to systematically investigate the generalization bounds of algorithms for solving nonconvex-(strongly)-concave (NC-SC/NC-C) stochastic minimax optimization measured by the stationarity of primal functions. We first establish algorithm-agnostic generalization bounds via uniform convergence between the empirical minimax problem and the population minimax problem. The…

    Submitted 6 February, 2023; v1 submitted 27 May, 2022; originally announced May 2022.

  40. arXiv:2205.08676  [pdf, ps, other]

    stat.ME

    Testing the parametric form of the conditional variance in regressions based on distance covariance

    Authors: Yue Hu, Haiqi Li, Falong Tan

    Abstract: In this paper, we propose a new test for checking the parametric form of the conditional variance based on distance covariance in nonlinear and nonparametric regression models. Inheriting the nice properties of distance covariance, our test is very easy to implement in practice and less affected by the dimensionality of covariates. The asymptotic properties of the test statistic are investigated…

    Submitted 17 May, 2022; originally announced May 2022.

  41. arXiv:2205.06689  [pdf, other]

    stat.ML cs.LG math.OC

    Heavy-Tail Phenomenon in Decentralized SGD

    Authors: Mert Gurbuzbalaban, Yuanhan Hu, Umut Simsekli, Kun Yuan, Lingjiong Zhu

    Abstract: Recent theoretical studies have shown that heavy-tails can emerge in stochastic optimization due to `multiplicative noise', even under surprisingly simple settings, such as linear regression with Gaussian data. While these studies have uncovered several interesting phenomena, they consider conventional stochastic optimization problems, which exclude decentralized settings that naturally arise in m…

    Submitted 16 May, 2022; v1 submitted 13 May, 2022; originally announced May 2022.

  42. arXiv:2203.15297  [pdf, other]

    cs.CV cs.NE stat.ML

    Kernel Modulation: A Parameter-Efficient Method for Training Convolutional Neural Networks

    Authors: Yuhuang Hu, Shih-Chii Liu

    Abstract: Deep Neural Networks, particularly Convolutional Neural Networks (ConvNets), have achieved incredible success in many vision tasks, but they usually require millions of parameters for good accuracy performance. With increasing applications that use ConvNets, updating hundreds of networks for multiple tasks on an embedded device can be costly in terms of memory, bandwidth, and energy. Approaches to…

    Submitted 29 March, 2022; originally announced March 2022.

    Comments: Accepted at 2022 26th International Conference on Pattern Recognition (ICPR)

  43. arXiv:2112.02093  [pdf, other]

    cs.LG cs.RO stat.ML

    Causal-based Time Series Domain Generalization for Vehicle Intention Prediction

    Authors: Yeping Hu, Xiaogang Jia, Masayoshi Tomizuka, Wei Zhan

    Abstract: Accurately predicting possible behaviors of traffic participants is an essential capability for autonomous vehicles. Since autonomous vehicles need to navigate in dynamically changing environments, they are expected to make accurate predictions regardless of where they are and what driving circumstances they encountered. Therefore, generalization capability to unseen domains is crucial for predict…

    Submitted 3 December, 2021; originally announced December 2021.

    Comments: Accepted by NeurIPS 2021 Workshop on Distribution Shifts

  44. arXiv:2110.12343  [pdf, other]

    cs.LG math.ST stat.ME

    Off-Policy Evaluation in Partially Observed Markov Decision Processes under Sequential Ignorability

    Authors: Yuchen Hu, Stefan Wager

    Abstract: We consider off-policy evaluation of dynamic treatment rules under sequential ignorability, given an assumption that the underlying system can be modeled as a partially observed Markov decision process (POMDP). We propose an estimator, partial history importance weighting, and show that it can consistently estimate the stationary mean rewards of a target policy given long enough draws from the beh…

    Submitted 9 May, 2023; v1 submitted 23 October, 2021; originally announced October 2021.

  45. arXiv:2108.11875  [pdf, other]

    stat.ML cs.LG physics.ao-ph

    A spatio-temporal LSTM model to forecast across multiple temporal and spatial scales

    Authors: Yihao Hu, Fearghal O'Donncha, Paulito Palmes, Meredith Burke, Ramon Filgueira, Jon Grant

    Abstract: This paper presents a novel spatio-temporal LSTM (SPATIAL) architecture for time series forecasting applied to environmental datasets. The framework was evaluated across multiple sensors and for three different oceanic variables: current speed, temperature, and dissolved oxygen. Network implementation proceeded in two directions that are nominally separated but connected as part of a natural envir…

    Submitted 26 August, 2021; originally announced August 2021.

  46. arXiv:2108.04851  [pdf, other]

    stat.ME

    Bayesian Inference using the Proximal Mapping: Uncertainty Quantification under Varying Dimensionality

    Authors: Maoran Xu, Hua Zhou, Yujie Hu, Leo L. Duan

    Abstract: In statistical applications, it is common to encounter parameters supported on a varying or unknown dimensional space. Examples include the fused lasso regression, the matrix recovery under an unknown low rank, etc. Despite the ease of obtaining a point estimate via the optimization, it is much more challenging to quantify their uncertainty -- in the Bayesian framework, a major difficulty is that…

    Submitted 2 October, 2022; v1 submitted 10 August, 2021; originally announced August 2021.

    Comments: 26 pages, 4 figures

  47. arXiv:2108.00968  [pdf, other]

    cs.CV cs.AI stat.ML

    Robust Semantic Segmentation with Superpixel-Mix

    Authors: Gianni Franchi, Nacim Belkhir, Mai Lan Ha, Yufei Hu, Andrei Bursuc, Volker Blanz, Angela Yao

    Abstract: Along with predictive performance and runtime speed, reliability is a key requirement for real-world semantic segmentation. Reliability encompasses robustness, predictive uncertainty and reduced bias. To improve reliability, we introduce Superpixel-mix, a new superpixel-based data augmentation method with teacher-student consistency training. Unlike other mixing-based augmentation techniques, mixi…

    Submitted 21 October, 2021; v1 submitted 2 August, 2021; originally announced August 2021.

    Comments: Accepted to BMVC2021

  48. arXiv:2107.10955  [pdf, other]

    stat.ML cs.LG math.ST stat.ME

    Learning Linear Polytree Structural Equation Models

    Authors: Xingmei Lou, Yu Hu, Xiaodong Li

    Abstract: We are interested in the problem of learning the directed acyclic graph (DAG) when data are generated from a linear structural equation model (SEM) and the causal structure can be characterized by a polytree. Under the Gaussian polytree models, we study sufficient conditions on the sample sizes for the well-known Chow-Liu algorithm to exactly recover both the skeleton and the equivalence class of…

    Submitted 14 May, 2024; v1 submitted 22 July, 2021; originally announced July 2021.

    Comments: 35 pages, 5 figures, 4 tables

  49. Face Identification Proficiency Test Designed Using Item Response Theory

    Authors: Géraldine Jeckeln, Ying Hu, Jacqueline G. Cavazos, Amy N. Yates, Carina A. Hahn, Larry Tang, P. Jonathon Phillips, Alice J. O'Toole

    Abstract: Measures of face-identification proficiency are essential to ensure accurate and consistent performance by professional forensic face examiners and others who perform face-identification tasks in applied scenarios. Current proficiency tests rely on static sets of stimulus items, and so, cannot be administered validly to the same individual multiple times. To create a proficiency test, a large numb…

    Submitted 9 August, 2022; v1 submitted 22 June, 2021; originally announced June 2021.

    Comments: 20 pages (including references), 10 figures

  50. arXiv:2104.03802  [pdf, other]

    stat.ME econ.EM

    Average Direct and Indirect Causal Effects under Interference

    Authors: Yuchen Hu, Shuangning Li, Stefan Wager

    Abstract: We propose a definition for the average indirect effect of a binary treatment in the potential outcomes model for causal inference under cross-unit interference. Our definition is analogous to the standard definition of the average direct effect, and can be expressed without needing to compare outcomes across multiple randomized experiments. We show that the proposed indirect effect satisfies a de…

    Submitted 11 January, 2022; v1 submitted 8 April, 2021; originally announced April 2021.