Showing 1–50 of 62 results for author: Zheng, Z

Searching in archive stat.
  1. arXiv:2405.18795  [pdf, other]

    stat.ML cs.LG

    Federated Q-Learning with Reference-Advantage Decomposition: Almost Optimal Regret and Logarithmic Communication Cost

    Authors: Zhong Zheng, Haochen Zhang, Lingzhou Xue

    Abstract: In this paper, we consider model-free federated reinforcement learning for tabular episodic Markov decision processes. Under the coordination of a central server, multiple agents collaboratively explore the environment and learn an optimal policy without sharing their raw data. Despite recent advances in federated Q-learning algorithms achieving near-linear regret speedup with low communication co…

    Submitted 29 May, 2024; originally announced May 2024.

  2. arXiv:2405.12953  [pdf, other]

    stat.ME

    Quantifying Uncertainty in Classification Performance: ROC Confidence Bands Using Conformal Prediction

    Authors: Zheshi Zheng, Bo Yang, Peter Song

    Abstract: To evaluate a classification algorithm, it is common practice to plot the ROC curve using test data. However, the inherent randomness in the test data can undermine our confidence in the conclusions drawn from the ROC curve, necessitating uncertainty quantification. In this article, we propose an algorithm to construct confidence bands for the ROC curve, quantifying the uncertainty of classificati…

    Submitted 21 May, 2024; originally announced May 2024.

  3. arXiv:2404.08164  [pdf, other]

    stat.ML cs.AI cs.CL cs.LG

    Language Model Prompt Selection via Simulation Optimization

    Authors: Haoting Zhang, Jinghai He, Rhonda Righter, Zeyu Zheng

    Abstract: With the advancement in generative language models, the selection of prompts has gained significant attention in recent years. A prompt is an instruction or description provided by the user, serving as a guide for the generative language model in content generation. Despite existing methods for prompt selection that are based on human labor, we consider facilitating this selection through simulati…

    Submitted 19 May, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

  4. arXiv:2403.08635  [pdf, other]

    cs.LG cs.AI stat.ML

    Human Alignment of Large Language Models through Online Preference Optimisation

    Authors: Daniele Calandriello, Daniel Guo, Remi Munos, Mark Rowland, Yunhao Tang, Bernardo Avila Pires, Pierre Harvey Richemond, Charline Le Lan, Michal Valko, Tianqi Liu, Rishabh Joshi, Zeyu Zheng, Bilal Piot

    Abstract: Ensuring alignment of language models' outputs with human preferences is critical to guarantee a useful, safe, and pleasant user experience. Thus, human alignment has been extensively studied recently and several methods such as Reinforcement Learning from Human Feedback (RLHF), Direct Policy Optimisation (DPO) and Sequence Likelihood Calibration (SLiC) have emerged. In this paper, our contributio…

    Submitted 13 March, 2024; originally announced March 2024.

  5. arXiv:2402.01000  [pdf, other]

    stat.ML cs.LG

    Multivariate Probabilistic Time Series Forecasting with Correlated Errors

    Authors: Vincent Zhihao Zheng, Lijun Sun

    Abstract: Accurately modeling the correlation structure of errors is essential for reliable uncertainty quantification in probabilistic time series forecasting. Recent deep learning models for multivariate time series have developed efficient parameterizations for time-varying contemporaneous covariance, but they often assume temporal independence of errors for simplicity. However, real-world data frequentl…

    Submitted 31 May, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: This paper extends the work presented in arXiv:2305.17028 to a multivariate setting

  6. arXiv:2402.00899  [pdf, other]

    cs.LG cs.AI stat.ML

    Weakly Supervised Learners for Correction of AI Errors with Provable Performance Guarantees

    Authors: Ivan Y. Tyukin, Tatiana Tyukina, Daniel van Helden, Zedong Zheng, Evgeny M. Mirkes, Oliver J. Sutton, Qinghua Zhou, Alexander N. Gorban, Penelope Allison

    Abstract: We present a new methodology for handling AI errors by introducing weakly supervised AI error correctors with a priori performance guarantees. These AI correctors are auxiliary maps whose role is to moderate the decisions of some previously constructed underlying classifier by either approving or rejecting its decisions. The rejection of a decision can be used as a signal to suggest abstaining fro…

    Submitted 13 February, 2024; v1 submitted 31 January, 2024; originally announced February 2024.

    MSC Class: 68T05; 68T37

  7. arXiv:2312.15023  [pdf, other]

    cs.LG stat.ML

    Federated Q-Learning: Linear Regret Speedup with Low Communication Cost

    Authors: Zhong Zheng, Fengyu Gao, Lingzhou Xue, Jing Yang

    Abstract: In this paper, we consider federated reinforcement learning for tabular episodic Markov Decision Processes (MDP) where, under the coordination of a central server, multiple agents collaboratively explore the environment and learn an optimal policy without sharing their raw data. While linear speedup in the number of agents has been achieved for some metrics, such as convergence rate and sample com…

    Submitted 7 May, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

    Comments: 51 pages

  8. arXiv:2312.08583  [pdf, other]

    cs.CL stat.ML

    ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks

    Authors: Xiaoxia Wu, Haojun Xia, Stephen Youn, Zhen Zheng, Shiyang Chen, Arash Bakhtiari, Michael Wyatt, Reza Yazdani Aminabadi, Yuxiong He, Olatunji Ruwase, Leon Song, Zhewei Yao

    Abstract: This study examines 4-bit quantization methods like GPTQ in large language models (LLMs), highlighting GPTQ's overfitting and limited enhancement in zero-shot tasks. While prior works merely focus on zero-shot measurement, we extend the task scope to more generative categories such as code generation and abstractive summarization, in which we found that INT4 quantization can significantly underperf…

    Submitted 18 December, 2023; v1 submitted 13 December, 2023; originally announced December 2023.

  9. arXiv:2311.01709  [pdf, other]

    stat.ME stat.ML

    Causal inference with Machine Learning-Based Covariate Representation

    Authors: Yuhang Wu, Jinghai He, Zeyu Zheng

    Abstract: Utilizing covariate information has been a powerful approach to improve the efficiency and accuracy of causal inference, which supports the massive number of randomized experiments run at data-driven enterprises. However, state-of-the-art approaches can become practically unreliable when the dimension of covariates increases to just 50, whereas experiments on large platforms can observe even higher dimensi…

    Submitted 3 November, 2023; originally announced November 2023.

  10. arXiv:2309.15032  [pdf, other]

    stat.ME math.ST stat.ML

    SOFARI: High-Dimensional Manifold-Based Inference

    Authors: Zemin Zheng, Xin Zhou, Yingying Fan, Jinchi Lv

    Abstract: Multi-task learning is a widely used technique for harnessing information from various tasks. Recently, the sparse orthogonal factor regression (SOFAR) framework, based on the sparse singular value decomposition (SVD) within the coefficient matrix, was introduced for interpretable multi-task learning, enabling the discovery of meaningful latent feature-response association networks across differen…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: 114 pages, 2 figures

  11. arXiv:2305.17028  [pdf, other]

    stat.ML cs.LG

    Better Batch for Deep Probabilistic Time Series Forecasting

    Authors: Vincent Zhihao Zheng, Seongjin Choi, Lijun Sun

    Abstract: Deep probabilistic time series forecasting has gained attention for its ability to provide nonlinear approximation and valuable uncertainty quantification for decision-making. However, existing models often oversimplify the problem by assuming a time-independent error process and overlooking serial correlation. To overcome this limitation, we propose an innovative training method that incorporates…

    Submitted 21 May, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: 11 pages, 3 figures, 3 tables, The 27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024); We corrected some misleading notations in the published version

  12. arXiv:2304.12522  [pdf, other]

    math.OC cs.LG eess.SP stat.CO stat.ML

    A New Inexact Proximal Linear Algorithm with Adaptive Stopping Criteria for Robust Phase Retrieval

    Authors: Zhong Zheng, Shiqian Ma, Lingzhou Xue

    Abstract: This paper considers the robust phase retrieval problem, which can be cast as a nonsmooth and nonconvex optimization problem. We propose a new inexact proximal linear algorithm with the subproblem being solved inexactly. Our contributions are two adaptive stopping criteria for the subproblem. The convergence behavior of the proposed methods is analyzed. Through experiments on both synthetic and re…

    Submitted 8 February, 2024; v1 submitted 24 April, 2023; originally announced April 2023.

    Comments: 23 pages

  13. arXiv:2304.04341  [pdf, ps, other]

    stat.ML cs.LG math.ST stat.ME

    Regret Distribution in Stochastic Bandits: Optimal Trade-off between Expectation and Tail Risk

    Authors: David Simchi-Levi, Zeyu Zheng, Feng Zhu

    Abstract: We study the trade-off between expectation and tail risk for regret distribution in the stochastic multi-armed bandit problem. We fully characterize the interplay among three desired properties for policy design: worst-case optimality, instance-dependent consistency, and light-tailed risk. We show how the order of expected regret exactly affects the decaying rate of the regret tail probability for…

    Submitted 9 April, 2023; originally announced April 2023.

  14. arXiv:2304.04091  [pdf, other]

    cs.LG cs.CY stat.ML

    Best Arm Identification with Fairness Constraints on Subpopulations

    Authors: Yuhang Wu, Zeyu Zheng, Tingyu Zhu

    Abstract: We formulate, analyze and solve the problem of best arm identification with fairness constraints on subpopulations (BAICS). Standard best arm identification problems aim at selecting an arm that has the largest expected reward where the expectation is taken over the entire population. The BAICS problem requires that a selected arm must be fair to all subpopulations (e.g., different ethnic groups,…

    Submitted 8 April, 2023; originally announced April 2023.

  15. arXiv:2301.06650  [pdf, other]

    cs.LG stat.ML

    Enhancing Deep Traffic Forecasting Models with Dynamic Regression

    Authors: Vincent Zhihao Zheng, Seongjin Choi, Lijun Sun

    Abstract: Deep learning models for traffic forecasting often assume the residual is independent and isotropic across time and space. This assumption simplifies loss functions such as mean absolute error, but real-world residual processes often exhibit significant autocorrelation and structured spatiotemporal correlation. This paper introduces a dynamic regression (DR) framework to enhance existing spatiotem…

    Submitted 31 May, 2024; v1 submitted 16 January, 2023; originally announced January 2023.

  16. arXiv:2301.01620  [pdf, ps, other]

    stat.AP

    Anonymous Pattern Molecular Fingerprint and its Applications on Property Identification

    Authors: Xue Liu, Qian Cheng, Dan Sun, Xing Li, Wei Wei, Zhiming Zheng

    Abstract: Molecular fingerprints are significant cheminformatics tools to map molecules into vectorial space according to their characteristics in diverse functional groups, atom sequences, and other topological structures. In this paper, we set out to investigate a novel molecular fingerprint Anonymous-FP that possesses abundant perception about the underlying interactions shaped in small, medium, a…

    Submitted 4 January, 2023; originally announced January 2023.

    Comments: 11 pages

  17. arXiv:2211.14671  [pdf, other]

    stat.ME stat.AP

    Efficient Targeted Learning of Heterogeneous Treatment Effects for Multiple Subgroups

    Authors: Waverly Wei, Maya Petersen, Mark J van der Laan, Zeyu Zheng, Chong Wu, Jingshen Wang

    Abstract: In biomedical science, analyzing treatment effect heterogeneity plays an essential role in assisting personalized medicine. The main goals of analyzing treatment effect heterogeneity include estimating treatment effects in clinically relevant subgroups and predicting whether a patient subpopulation might benefit from a particular treatment. Conventional approaches often evaluate the subgroup treat…

    Submitted 26 November, 2022; originally announced November 2022.

    Comments: Accepted by Biometrics 2022

  18. arXiv:2210.06737  [pdf, other]

    stat.ME

    Adaptive A/B Tests and Simultaneous Treatment Parameter Optimization

    Authors: Yuhang Wu, Zeyu Zheng, Guangyu Zhang, Zuohua Zhang, Chu Wang

    Abstract: Constructing asymptotically valid confidence intervals through a valid central limit theorem is crucial for A/B tests, where a classical goal is to statistically assert whether a treatment plan is significantly better than a control plan. In some emerging applications for online platforms, the treatment plan is not a single plan, but instead encompasses an infinite continuum of plans indexed by a…

    Submitted 6 December, 2022; v1 submitted 13 October, 2022; originally announced October 2022.

  19. arXiv:2209.09168  [pdf]

    cs.LG stat.ML

    Application of Neural Network in the Prediction of NOx Emissions from Degrading Gas Turbine

    Authors: Zhenkun Zheng, Alan Rezazadeh

    Abstract: This paper aims to apply a neural network algorithm to predict the process response (NOx emissions) from degrading natural gas turbines. Nine different process variables, or predictors, are considered in the predictive modelling. It is found that the model trained by the neural network algorithm should use part of the recent data in the training and validation sets, accounting for the impact of…

    Submitted 19 September, 2022; originally announced September 2022.

  20. arXiv:2206.11868  [pdf, other]

    stat.ME

    Inference on the Best Policies with Many Covariates

    Authors: Waverly Wei, Yuqing Zhou, Zeyu Zheng, Jingshen Wang

    Abstract: Understanding the impact of the most effective policies or treatments on a response variable of interest is desirable in many empirical works in economics, statistics and other disciplines. Due to the widespread winner's curse phenomenon, conventional statistical inference assuming that the top policies are chosen independent of the random sample may lead to overly optimistic evaluations of the be…

    Submitted 21 October, 2022; v1 submitted 23 June, 2022; originally announced June 2022.

    Comments: Accepted by The Journal of Econometrics

  21. arXiv:2206.02969  [pdf, other]

    stat.ML cs.LG math.ST

    A Simple and Optimal Policy Design with Safety against Heavy-tailed Risk for Stochastic Bandits

    Authors: David Simchi-Levi, Zeyu Zheng, Feng Zhu

    Abstract: We study the stochastic multi-armed bandit problem and design new policies that enjoy both worst-case optimality for expected regret and light-tailed risk for regret distribution. Starting from the two-armed bandit setting with time horizon $T$, we propose a simple policy and prove that the policy (i) enjoys the worst-case optimality for the expected regret at order $O(\sqrt{T\ln T})$ and (ii) has…

    Submitted 14 November, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: Preliminary version appeared in NeurIPS 2022

  22. arXiv:2201.11949  [pdf, other]

    cs.LG stat.ML

    Higher Order Correlation Analysis for Multi-View Learning

    Authors: Jiawang Nie, Li Wang, Zequn Zheng

    Abstract: Multi-view learning is frequently used in data science. The pairwise correlation maximization is a classical approach for exploring the consensus of multiple views. Since the pairwise correlation is inherent for two views, the extensions to more views can be diversified and the intrinsic interconnections among views are generally lost. To address this issue, we propose to maximize higher order cor…

    Submitted 28 January, 2022; originally announced January 2022.

  23. arXiv:2201.03065  [pdf, ps, other]

    stat.ME math.OC math.ST stat.ML

    Selecting the Best Optimizing System

    Authors: Nian Si, Zeyu Zheng

    Abstract: We formulate selecting the best optimizing system (SBOS) problems and provide solutions for those problems. In an SBOS problem, a finite number of systems are contenders. Inside each system, a continuous decision variable affects the system's expected performance. An SBOS problem compares different systems based on their expected performances under their own optimally chosen decision to select the…

    Submitted 9 January, 2022; originally announced January 2022.

    Comments: Code in https://github.com/nian-si/SelectOptSys

  24. arXiv:2106.14813  [pdf, other]

    stat.ML cs.DM cs.LG math.OC

    Offline Planning and Online Learning under Recovering Rewards

    Authors: David Simchi-Levi, Zeyu Zheng, Feng Zhu

    Abstract: Motivated by emerging applications such as live-streaming e-commerce, promotions and recommendations, we introduce and solve a general class of non-stationary multi-armed bandit problems that have the following two features: (i) the decision maker can pull and collect rewards from up to $K\,(\ge 1)$ out of $N$ different arms in each time period; (ii) the expected reward of an arm immediately drops…

    Submitted 21 December, 2021; v1 submitted 28 June, 2021; originally announced June 2021.

    Comments: v1 accepted by ICML 2021

  25. arXiv:2104.05076  [pdf, other]

    stat.ME stat.AP stat.CO stat.ML

    Parallel integrative learning for large-scale multi-response regression with incomplete outcomes

    Authors: Ruipeng Dong, Daoji Li, Zemin Zheng

    Abstract: Multi-task learning is increasingly used to investigate the association structure between multiple responses and a single set of predictor variables in many applications. In the era of big data, the coexistence of incomplete outcomes, large number of responses, and high dimensionality in predictors poses unprecedented challenges in estimation, prediction, and computation. In this paper, we propose…

    Submitted 11 April, 2021; originally announced April 2021.

    Comments: 32 pages

    Journal ref: Computational Statistics and Data Analysis, 2021

  26. arXiv:2012.14936  [pdf, other]

    stat.ML cs.LG

    Learning Energy-Based Model with Variational Auto-Encoder as Amortized Sampler

    Authors: Jianwen Xie, Zilong Zheng, Ping Li

    Abstract: Due to the intractable partition function, training energy-based models (EBMs) by maximum likelihood requires Markov chain Monte Carlo (MCMC) sampling to approximate the gradient of the Kullback-Leibler divergence between data and model distributions. However, it is non-trivial to sample from an EBM because of the difficulty of mixing between modes. In this paper, we propose to learn a variational…

    Submitted 23 December, 2021; v1 submitted 29 December, 2020; originally announced December 2020.

    Journal ref: Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI), pp. 10441-10451, 2021

  27. arXiv:2012.13940  [pdf, other]

    stat.ML cs.LG

    A Doubly Stochastic Simulator with Applications in Arrivals Modeling and Simulation

    Authors: Yufeng Zheng, Zeyu Zheng, Tingyu Zhu

    Abstract: We propose a framework that integrates classical Monte Carlo simulators and Wasserstein generative adversarial networks to model, estimate, and simulate a broad class of arrival processes with general non-stationary and multi-dimensional random arrival rates. Classical Monte Carlo simulators have advantages at capturing the interpretable "physics" of a stochastic object, whereas neural-network-bas…

    Submitted 9 June, 2023; v1 submitted 27 December, 2020; originally announced December 2020.

    Comments: We greatly appreciate the comments and suggestions from anonymous reviewers and editors. This is an updated version, with the title changed from "Doubly Stochastic Generative Arrivals Modeling" to "A Doubly Stochastic Simulator with Applications in Arrivals Modeling and Simulation"

  28. arXiv:2011.08521  [pdf, other]

    stat.ME

    Sequential scaled sparse factor regression

    Authors: Zemin Zheng, Yang Li, Jie Wu, Yuchen Wang

    Abstract: Large-scale association analysis between multivariate responses and predictors is of great practical importance, as exemplified by modern business applications including social media marketing and crisis management. Despite the rapid methodological advances, how to obtain scalable estimators with free tuning of the regularization parameters remains unclear under general noise covariance structures…

    Submitted 17 November, 2020; originally announced November 2020.

  29. MixTwice: large-scale hypothesis testing for peptide arrays by variance mixing

    Authors: Zihao Zheng, Aisha M. Mergaert, Irene M. Ong, Miriam A. Shelef, Michael A. Newton

    Abstract: Peptide microarrays have emerged as a powerful technology in immunoproteomics as they provide a tool to measure the abundance of different antibodies in patient serum samples. The high dimensionality and small sample size of many experiments challenge conventional statistical approaches, including those aiming to control the false discovery rate (FDR). Motivated by limitations in reproducibility a…

    Submitted 14 November, 2020; originally announced November 2020.

    Journal ref: Bioinformatics 2021

  30. arXiv:2010.05311  [pdf, other]

    econ.EM cs.AI cs.LG econ.GN stat.ML

    Interpretable Neural Networks for Panel Data Analysis in Economics

    Authors: Yucheng Yang, Zhong Zheng, Weinan E

    Abstract: The lack of interpretability and transparency is preventing economists from using advanced tools like neural networks in their empirical research. In this paper, we propose a class of interpretable neural network models that can achieve both high prediction accuracy and interpretability. The model can be written as a simple function of a regularized number of interpretable features, which are out…

    Submitted 29 November, 2020; v1 submitted 11 October, 2020; originally announced October 2020.

  31. arXiv:2009.03488  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    Adversarial Attack on Large Scale Graph

    Authors: Jintang Li, Tao Xie, Liang Chen, Fenfang Xie, Xiangnan He, Zibin Zheng

    Abstract: Recent studies have shown that graph neural networks (GNNs) are vulnerable to perturbations due to lack of robustness and can therefore be easily fooled. Currently, most works on attacking GNNs are mainly using gradient information to guide the attack and achieve outstanding performance. However, the high complexity of time and space makes them unmanageable for large scale graphs and becomes…

    Submitted 6 May, 2021; v1 submitted 7 September, 2020; originally announced September 2020.

    Comments: Accepted by TKDE, the codes are available at https://github.com/EdisonLeeeee/SGAttack

  32. arXiv:2008.08931  [pdf, other]

    cs.SI cs.LG stat.ML

    A Deep Prediction Network for Understanding Advertiser Intent and Satisfaction

    Authors: Liyi Guo, Rui Lu, Haoqi Zhang, Junqi Jin, Zhenzhe Zheng, Fan Wu, Jin Li, Haiyang Xu, Han Li, Wenkai Lu, Jian Xu, Kun Gai

    Abstract: For e-commerce platforms such as Taobao and Amazon, advertisers play an important role in the entire digital ecosystem: their behaviors explicitly influence users' browsing and shopping experience; more importantly, advertisers' expenditure on advertising constitutes a primary source of platform revenue. Therefore, providing better services for advertisers is essential for the long-term prosperity…

    Submitted 20 August, 2020; originally announced August 2020.

    Journal ref: CIKM 2020, Virtual Event, Ireland

  33. arXiv:2007.00240  [pdf, other]

    cs.LG stat.ML

    Temporal Calibrated Regularization for Robust Noisy Label Learning

    Authors: Dongxian Wu, Yisen Wang, Zhuobin Zheng, Shu-tao Xia

    Abstract: Deep neural networks (DNNs) exhibit great success on many tasks with the help of large-scale well annotated datasets. However, labeling large-scale data can be very costly and error-prone so that it is difficult to guarantee the annotation quality (i.e., having noisy labels). Training on these noisy labeled datasets may adversely deteriorate their generalization performance. Existing methods eithe…

    Submitted 1 July, 2020; originally announced July 2020.

    Comments: Published as a conference paper at IJCNN 2020

  34. arXiv:2006.16312  [pdf, other]

    cs.LG cs.DS cs.IR eess.SY stat.ML

    Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising

    Authors: Xiaotian Hao, Zhaoqing Peng, Yi Ma, Guan Wang, Junqi Jin, Jianye Hao, Shan Chen, Rongquan Bai, Mingzhou Xie, Miao Xu, Zhenzhe Zheng, Chuan Yu, Han Li, Jian Xu, Kun Gai

    Abstract: In E-commerce, advertising is essential for merchants to reach their target users. The typical objective is to maximize the advertiser's cumulative revenue over a period of time under a budget constraint. In real applications, an advertisement (ad) usually needs to be exposed to the same user multiple times until the user finally contributes revenue (e.g., places an order). However, existing adver…

    Submitted 29 June, 2020; originally announced June 2020.

    Comments: accepted by ICML 2020

  35. arXiv:2006.12301  [pdf, other]

    math.ST cs.LG stat.ML

    On Projection Robust Optimal Transport: Sample Complexity and Model Misspecification

    Authors: Tianyi Lin, Zeyu Zheng, Elynn Y. Chen, Marco Cuturi, Michael I. Jordan

    Abstract: Optimal transport (OT) distances are increasingly used as loss functions for statistical inference, notably in the learning of generative models or supervised learning. Yet, the behavior of minimum Wasserstein estimators is poorly understood, notably in high-dimensional regimes or under model misspecification. In this work we adopt the viewpoint of projection robust (PR) OT, which seeks to maximiz…

    Submitted 17 July, 2021; v1 submitted 22 June, 2020; originally announced June 2020.

    Comments: Accepted by AISTATS 2021; Fix some inaccuracy in the definition and proof; 49 Pages, 41 figures

  36. arXiv:2005.04914  [pdf, ps, other]

    stat.ME math.ST stat.AP

    Scalable Interpretable Learning for Multi-Response Error-in-Variables Regression

    Authors: J. Wu, Z. Zheng, Y. Li, Y. Zhang

    Abstract: Corrupted data sets containing noisy or missing observations are prevalent in various contemporary applications such as economics, finance and bioinformatics. Despite the recent methodological and algorithmic advances in high-dimensional multi-response regression, how to achieve scalable and interpretable estimation under contaminated covariates is unclear. In this paper, we develop a new methodol…

    Submitted 11 May, 2020; originally announced May 2020.

  37. arXiv:2003.09638  [pdf, other]

    cs.LG cs.SI stat.ML

    An Uncoupled Training Architecture for Large Graph Learning

    Authors: Dalong Yang, Chuan Chen, Youhao Zheng, Zibin Zheng, Shih-wei Liao

    Abstract: Graph Convolutional Network (GCN) has been widely used in graph learning tasks. However, GCN-based models (GCNs) are inherently coupled training frameworks that repetitively conduct complex neighboring aggregation, which limits flexibility in processing large-scale graphs. As the depth of layers increases, the computational and memory cost of GCNs grows explosively due to th…

    Submitted 21 July, 2020; v1 submitted 21 March, 2020; originally announced March 2020.

  38. arXiv:2003.07898  [pdf, ps, other]

    stat.ML cs.LG math.ST stat.CO

    Statistically Guided Divide-and-Conquer for Sparse Factorization of Large Matrix

    Authors: Kun Chen, Ruipeng Dong, Wanwan Xu, Zemin Zheng

    Abstract: The sparse factorization of a large matrix is fundamental in modern statistical learning. In particular, the sparse singular value decomposition and its variants have been utilized in multivariate regression, factor analysis, biclustering, vector time series modeling, among others. The appeal of this factorization is owing to its power in discovering a highly-interpretable latent association netwo…

    Submitted 17 March, 2020; originally announced March 2020.

  39. arXiv:2003.05730  [pdf, other]

    cs.LG cs.AI stat.ML

    A Survey of Adversarial Learning on Graphs

    Authors: Liang Chen, Jintang Li, Jiaying Peng, Tao Xie, Zengxu Cao, Kun Xu, Xiangnan He, Zibin Zheng, Bingzhe Wu

    Abstract: Deep learning models on graphs have achieved remarkable performance in various graph analysis tasks, e.g., node classification, link prediction, and graph clustering. However, they expose uncertainty and unreliability against the well-designed inputs, i.e., adversarial examples. Accordingly, a line of studies has emerged for both attack and defense addressed in different graph analysis tasks, lead…

    Submitted 5 April, 2022; v1 submitted 10 March, 2020; originally announced March 2020.

    Comments: Preprint; 16 pages, 2 figures

  40. arXiv:2002.07454  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    Distributed Optimization over Block-Cyclic Data

    Authors: Yucheng Ding, Chaoyue Niu, Yikai Yan, Zhenzhe Zheng, Fan Wu, Guihai Chen, Shaojie Tang, Rongfei Jia

    Abstract: We consider practical data characteristics underlying federated learning, where unbalanced and non-i.i.d. data from clients have a block-cyclic structure: each cycle contains several blocks, and each client's training data follow block-specific and non-i.i.d. distributions. Such a data structure would introduce client and block biases during the collaborative training: the single global model woul…

    Submitted 18 February, 2020; originally announced February 2020.

  41. arXiv:2002.07399  [pdf, other]

    stat.ML cs.DC cs.LG math.OC

    Distributed Non-Convex Optimization with Sublinear Speedup under Intermittent Client Availability

    Authors: Yikai Yan, Chaoyue Niu, Yucheng Ding, Zhenzhe Zheng, Fan Wu, Guihai Chen, Shaojie Tang, Zhihua Wu

    Abstract: Federated learning is a new distributed machine learning framework, where a bunch of heterogeneous clients collaboratively train a model without sharing training data. In this work, we consider a practical and ubiquitous issue when deploying federated learning in mobile environments: intermittent client availability, where the set of eligible clients may change during the training process. Such in…

    Submitted 21 December, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

  42. arXiv:2002.00577   

    cs.LG cs.DC stat.ML

    Prophet: Proactive Candidate-Selection for Federated Learning by Predicting the Qualities of Training and Reporting Phases

    Authors: Huawei Huang, Kangying Lin, Song Guo, Pan Zhou, Zibin Zheng

    Abstract: Although the challenge of the device connection is much relieved in 5G networks, the training latency is still an obstacle preventing Federated Learning (FL) from being largely adopted. One of the most fundamental problems that lead to large latency is the bad candidate-selection for FL. In the dynamic environment, the mobile devices selected by the existing reactive candidate-selection algorithms…

    Submitted 18 May, 2020; v1 submitted 3 February, 2020; originally announced February 2020.

    Comments: We found significant technique errors in our previous version. The proposed DRL-based algorithm cannot solve the large-scale scheduling for federated learning. For the health of relevant research communities, we decide to withdraw our submission

  43. arXiv:2001.07646  [pdf]

    stat.AP

    How Fast You Can Actually Fly: A Comparative Investigation of Flight Airborne Time in China and the U.S

    Authors: Ke Liu, Zhe Zheng, Bo Zou, Mark Hansen

    Abstract: Actual airborne time (AAT) is the time between wheels-off and wheels-on of a flight. Understanding the behavior of AAT is increasingly important given the ever growing demand for air travel and flight delays becoming more rampant. As no research on AAT exists, this paper performs the first empirical analysis of AAT behavior, comparatively for the U.S. and China. The focus is on how AAT is affected…

    Submitted 21 January, 2020; originally announced January 2020.

    Comments: 44 pages, 11 figures

    MSC Class: 62P30

  44. arXiv:1912.05122  [pdf, other]

    cs.LG stat.ML

    Towards Better Forecasting by Fusing Near and Distant Future Visions

    Authors: Jiezhu Cheng, Kaizhu Huang, Zibin Zheng

    Abstract: Multivariate time series forecasting is an important yet challenging problem in machine learning. Most existing approaches only forecast the series value of one future moment, ignoring the interactions between predictions of future moments with different temporal distance. Such a deficiency probably prevents the model from getting enough information about the future, thus limiting the forecasting…

    Submitted 11 December, 2019; originally announced December 2019.

    Comments: Accepted by AAAI 2020

  45. arXiv:1912.04109  [pdf, other]

    cs.IR cs.CR cs.LG stat.ML

    Data Poisoning Attacks on Neighborhood-based Recommender Systems

    Authors: Liang Chen, Yangjun Xu, Fenfang Xie, Min Huang, Zibin Zheng

    Abstract: Nowadays, collaborative filtering recommender systems have been widely deployed in many commercial companies to make profit. Neighbourhood-based collaborative filtering is common and effective. To date, despite its effectiveness, there has been little effort to explore their robustness and the impact of data poisoning attacks on their performance. Can the neighbourhood-based recommender systems be…

    Submitted 1 December, 2019; originally announced December 2019.

  46. arXiv:1908.06369  [pdf, ps, other]

    cs.LG stat.ML

    Robust DCD-Based Recursive Adaptive Algorithms

    Authors: Y. Yu, L. Lu, Z. Zheng, W. Wang, Y. Zakharov, R. C. de Lamare

    Abstract: The dichotomous coordinate descent (DCD) algorithm has been successfully used for significant reduction in the complexity of recursive least squares (RLS) algorithms. In this work, we generalize the application of the DCD algorithm to RLS adaptive filtering in impulsive noise scenarios and derive a unified update formula. By employing different robust strategies against impulsive noise, we develop…

    Submitted 17 August, 2019; originally announced August 2019.

    Comments: 6 pages, 4 figures

  47. T-EDGE: Temporal WEighted MultiDiGraph Embedding for Ethereum Transaction Network Analysis

    Authors: Jiajing Wu, Dan Lin, Zibin Zheng, Qi Yuan

    Abstract: Recently, graph embedding techniques have been widely used in the analysis of various networks, but most of the existing embedding methods omit the network dynamics and the multiplicity of edges, so it is difficult to accurately describe the detailed characteristics of the transaction networks. Ethereum is a blockchain-based platform supporting smart contracts. The open nature of blockchain makes…

    Submitted 31 July, 2020; v1 submitted 13 May, 2019; originally announced May 2019.

    Comments: 12 pages

    Journal ref: Front. Phys. 8:204 (2020)

  48. arXiv:1902.02812  [pdf, other]

    stat.ML cs.LG

    Cooperative Training of Fast Thinking Initializer and Slow Thinking Solver for Conditional Learning

    Authors: Jianwen Xie, Zilong Zheng, Xiaolin Fang, Song-Chun Zhu, Ying Nian Wu

    Abstract: This paper studies the problem of learning the conditional distribution of a high-dimensional output given an input, where the output and input may belong to two different domains, e.g., the output is a photo image and the input is a sketch image. We solve this problem by cooperative training of a fast thinking initializer and slow thinking solver. The initializer generates the output directly by…

    Submitted 7 April, 2021; v1 submitted 7 February, 2019; originally announced February 2019.

    Comments: 16 pages

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2021

  49. arXiv:1812.10587  [pdf, other]

    stat.ML cs.CV cs.LG

    Learning Dynamic Generator Model by Alternating Back-Propagation Through Time

    Authors: Jianwen Xie, Ruiqi Gao, Zilong Zheng, Song-Chun Zhu, Ying Nian Wu

    Abstract: This paper studies the dynamic generator model for spatial-temporal processes such as dynamic textures and action sequences in video data. In this model, each time frame of the video sequence is generated by a generator model, which is a non-linear transformation of a latent state vector, where the non-linear transformation is parametrized by a top-down neural network. The sequence of latent state…

    Submitted 26 December, 2018; originally announced December 2018.

    Comments: 10 pages

    Journal ref: The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI) 2019

  50. arXiv:1810.06877  [pdf, other]

    cs.LG cs.CV stat.ML

    Collaborative Deep Learning Across Multiple Data Centers

    Authors: Kele Xu, Haibo Mi, Dawei Feng, Huaimin Wang, Chuan Chen, Zibin Zheng, Xu Lan

    Abstract: Valuable training data is often owned by independent organizations and located in multiple data centers. Most deep learning approaches require to centralize the multi-datacenter data for performance purpose. In practice, however, it is often infeasible to transfer all data to a centralized data center due to not only bandwidth limitation but also the constraints of privacy regulations. Model avera…

    Submitted 16 October, 2018; originally announced October 2018.

    Comments: Submitted to AAAI 2019