Showing 1–50 of 89 results for author: Lu, Z

Searching in archive stat.
  1. arXiv:2406.12588  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    UIFV: Data Reconstruction Attack in Vertical Federated Learning

    Authors: Jirui Yang, Peng Chen, Zhihui Lu, Qiang Duan, Yubing Bao

    Abstract: Vertical Federated Learning (VFL) facilitates collaborative machine learning without the need for participants to share raw private data. However, recent studies have revealed privacy risks where adversaries might reconstruct sensitive features through data leakage during the learning process. Although data reconstruction methods based on gradient or model information are somewhat effective, they…

    Submitted 18 June, 2024; originally announced June 2024.

  2. arXiv:2406.01799  [pdf, other]

    cs.LG math.OC stat.ML

    Online Control in Population Dynamics

    Authors: Noah Golowich, Elad Hazan, Zhou Lu, Dhruv Rohatgi, Y. Jennifer Sun

    Abstract: The study of population dynamics originated with early sociological works but has since extended into many fields, including biology, epidemiology, evolutionary game theory, and economics. Most studies on population dynamics focus on the problem of prediction rather than control. Existing mathematical models for control in population dynamics are often restricted to specific, noise-free dynamics,…

    Submitted 6 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  3. arXiv:2405.18577  [pdf, other]

    math.OC cs.LG stat.ML

    Single-loop Stochastic Algorithms for Difference of Max-Structured Weakly Convex Functions

    Authors: Quanqi Hu, Qi Qi, Zhaosong Lu, Tianbao Yang

    Abstract: In this paper, we study a class of non-smooth non-convex problems in the form of $\min_{x}[\max_{y\in Y}φ(x, y) - \max_{z\in Z}ψ(x, z)]$, where both $Φ(x) = \max_{y\in Y}φ(x, y)$ and $Ψ(x)=\max_{z\in Z}ψ(x, z)$ are weakly convex functions, and $φ(x, y), ψ(x, z)$ are strongly concave functions in terms of $y$ and $z$, respectively. It covers two families of problems that have been studied but are m…

    Submitted 29 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

  4. arXiv:2404.13177  [pdf, other]

    stat.ME stat.AP

    A Bayesian Hybrid Design with Borrowing from Historical Study

    Authors: Zhaohua Lu, John Toso, Girma Ayele, Philip He

    Abstract: In early phase drug development of combination therapy, the primary objective is to preliminarily assess whether there is additive activity when a novel agent is combined with an established monotherapy. Due to potential feasibility issues with a large randomized study, uncontrolled single-arm trials have been the mainstream approach in cancer clinical trials. However, such trials often present signi…

    Submitted 29 April, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

  5. arXiv:2402.02701  [pdf, other]

    cs.LG cs.AI stat.ML

    Understanding What Affects Generalization Gap in Visual Reinforcement Learning: Theory and Empirical Evidence

    Authors: Jiafei Lyu, Le Wan, Xiu Li, Zongqing Lu

    Abstract: Recently, there are many efforts attempting to learn useful policies for continuous control in visual reinforcement learning (RL). In this scenario, it is important to learn a generalizable policy, as the testing environment may differ from the training environment, e.g., there exist distractors during deployment. Many practical algorithms are proposed to handle this problem. However, to the best…

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: Part of this work is accepted as AAMAS 2024 extended abstract

  6. arXiv:2401.06904  [pdf]

    stat.ME

    Non-collapsibility and Built-in Selection Bias of Hazard Ratio in Randomized Controlled Trials

    Authors: Helen Bian, Menglan Pang, Guanbo Wang, Zihang Lu

    Abstract: Background: The hazard ratio of the Cox proportional hazards model is widely used in randomized controlled trials to assess treatment effects. However, two properties of the hazard ratio including the non-collapsibility and built-in selection bias need to be further investigated. Methods: We conduct simulations to differentiate the non-collapsibility effect and built-in selection bias from the dif…

    Submitted 12 January, 2024; originally announced January 2024.

    Comments: 17 pages, 2 figures

  7. arXiv:2311.14655  [pdf, other]

    stat.ME stat.AP

    A Sparse Factor Model for Clustering High-Dimensional Longitudinal Data

    Authors: Zihang Lu, Noirrit Kiran Chandra

    Abstract: Recent advances in engineering technologies have enabled the collection of a large number of longitudinal features. This wealth of information presents unique opportunities for researchers to investigate the complex nature of diseases and uncover underlying disease mechanisms. However, analyzing such kind of data can be difficult due to its high dimensionality, heterogeneity and computational chal…

    Submitted 24 November, 2023; originally announced November 2023.

  8. arXiv:2311.06928  [pdf, other]

    cs.LG stat.ME

    Attention for Causal Relationship Discovery from Biological Neural Dynamics

    Authors: Ziyu Lu, Anika Tabassum, Shruti Kulkarni, Lu Mi, J. Nathan Kutz, Eric Shea-Brown, Seung-Hwan Lim

    Abstract: This paper explores the potential of the transformer models for learning Granger causality in networks with complex nonlinear dynamics at every node, as in neurobiological and biophysical networks. Our study primarily focuses on a proof-of-concept investigation based on simulated neural dynamics, for which the ground-truth causality is known through the underlying connectivity matrix. For transfor…

    Submitted 23 November, 2023; v1 submitted 12 November, 2023; originally announced November 2023.

    Comments: Accepted to the NeurIPS 2023 Workshop on Causal Representation Learning

  9. arXiv:2309.16578  [pdf, other]

    stat.ML cs.LG physics.chem-ph

    Overcoming the Barrier of Orbital-Free Density Functional Theory for Molecular Systems Using Deep Learning

    Authors: He Zhang, Siyuan Liu, Jiacheng You, Chang Liu, Shuxin Zheng, Ziheng Lu, Tong Wang, Nanning Zheng, Bin Shao

    Abstract: Orbital-free density functional theory (OFDFT) is a quantum chemistry formulation that has a lower cost scaling than the prevailing Kohn-Sham DFT, which is increasingly desired for contemporary molecular research. However, its accuracy is limited by the kinetic energy density functional, which is notoriously hard to approximate for non-periodic molecular systems. Here we propose M-OFDFT, an OFDFT…

    Submitted 9 March, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

    Comments: Published in Nature Computational Science, March 2024. Full paper with supplementary information

  10. arXiv:2305.10187  [pdf, other]

    stat.ME cs.LG stat.ML

    Evaluating Dynamic Conditional Quantile Treatment Effects with Applications in Ridesharing

    Authors: Ting Li, Chengchun Shi, Zhaohua Lu, Yi Li, Hongtu Zhu

    Abstract: Many modern tech companies, such as Google, Uber, and Didi, utilize online experiments (also known as A/B testing) to evaluate new policies against existing ones. While most studies concentrate on average treatment effects, situations with skewed and heavy-tailed outcome distributions may benefit from alternative criteria, such as quantiles. However, assessing dynamic quantile treatment effects (Q…

    Submitted 17 May, 2023; originally announced May 2023.

  11. arXiv:2302.11032  [pdf, other]

    stat.ML cs.LG

    Boosting Nyström Method

    Authors: Keaton Hamm, Zhaoying Lu, Wenbo Ouyang, Hao Helen Zhang

    Abstract: The Nyström method is an effective tool to generate low-rank approximations of large matrices, and it is particularly useful for kernel-based learning. To improve the standard Nyström approximation, ensemble Nyström algorithms compute a mixture of Nyström approximations which are generated independently based on column resampling. We propose a new family of algorithms, boosting Nyström, which iter…

    Submitted 21 February, 2023; originally announced February 2023.

  12. arXiv:2301.04204  [pdf, ps, other]

    math.OC cs.LG math.NA stat.ML

    A Newton-CG based barrier-augmented Lagrangian method for general nonconvex conic optimization

    Authors: Chuan He, Heng Huang, Zhaosong Lu

    Abstract: In this paper we consider finding an approximate second-order stationary point (SOSP) of general nonconvex conic optimization that minimizes a twice differentiable function subject to nonlinear equality constraints and also a convex conic constraint. In particular, we propose a Newton-conjugate gradient (Newton-CG) based barrier-augmented Lagrangian method for finding an approximate SOSP of this p…

    Submitted 10 January, 2023; originally announced January 2023.

    Comments: 34 pages. arXiv admin note: substantial text overlap with arXiv:2301.03139

    MSC Class: 49M05; 49M15; 68Q25; 90C26; 90C30; 90C60

  13. arXiv:2301.03139  [pdf, ps, other]

    math.OC cs.LG math.NA stat.ML

    A Newton-CG based augmented Lagrangian method for finding a second-order stationary point of nonconvex equality constrained optimization with complexity guarantees

    Authors: Chuan He, Zhaosong Lu, Ting Kei Pong

    Abstract: In this paper we consider finding a second-order stationary point (SOSP) of nonconvex equality constrained optimization when a nearly feasible point is known. In particular, we first propose a new Newton-CG method for finding an approximate SOSP of unconstrained optimization and show that it enjoys a substantially better complexity than the Newton-CG method [56]. We then propose a Newton-CG based…

    Submitted 8 January, 2023; originally announced January 2023.

    Comments: 29 pages, accepted by SIAM Journal on Optimization

    MSC Class: 49M15; 68Q25; 90C06; 90C26; 90C30; 90C60

  14. arXiv:2301.02060  [pdf, ps, other]

    math.OC cs.LG math.NA stat.ML

    A first-order augmented Lagrangian method for constrained minimax optimization

    Authors: Zhaosong Lu, Sanyou Mei

    Abstract: In this paper we study a class of constrained minimax problems. In particular, we propose a first-order augmented Lagrangian method for solving them, whose subproblems turn out to be a much simpler structured minimax problem and are suitably solved by a first-order method recently developed in [26] by the authors. Under some suitable assumptions, an \emph{operation complexity} of…

    Submitted 17 April, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

    Comments: added a new section. arXiv admin note: substantial text overlap with arXiv:2301.01716

    MSC Class: 90C26; 90C30; 90C47; 90C99; 65K05

  15. arXiv:2301.01716  [pdf, ps, other]

    math.OC cs.LG math.NA stat.ML

    First-order penalty methods for bilevel optimization

    Authors: Zhaosong Lu, Sanyou Mei

    Abstract: In this paper we study a class of unconstrained and constrained bilevel optimization problems in which the lower level is a possibly nonsmooth convex optimization problem, while the upper level is a possibly nonconvex optimization problem. We introduce a notion of $\varepsilon$-KKT solution for them and show that an $\varepsilon$-KKT solution leads to an $O(\sqrt{\varepsilon})$- or…

    Submitted 7 March, 2024; v1 submitted 4 January, 2023; originally announced January 2023.

    Comments: Accepted by SIAM Journal on Optimization

    MSC Class: 90C26; 90C30; 90C47; 90C99; 65K05

  16. arXiv:2212.08756  [pdf, other]

    cs.CL stat.AP

    Multi-Scales Data Augmentation Approach In Natural Language Inference For Artifacts Mitigation And Pre-Trained Model Optimization

    Authors: Zhenyuan Lu

    Abstract: Machine learning models can reach high performance on benchmark natural language processing (NLP) datasets but fail in more challenging settings. We study this issue when a pre-trained model learns dataset artifacts in natural language inference (NLI), the topic of studying the logical relationship between a pair of text sequences. We provide a variety of techniques for analyzing and locating data…

    Submitted 16 March, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

  17. arXiv:2211.12638  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Projection-free Adaptive Regret with Membership Oracles

    Authors: Zhou Lu, Nataly Brukhim, Paula Gradu, Elad Hazan

    Abstract: In the framework of online convex optimization, most iterative algorithms require the computation of projections onto convex sets, which can be computationally expensive. To tackle this problem HK12 proposed the study of projection-free methods that replace projections with less expensive computations. The most common approach is based on the Frank-Wolfe method, that uses linear optimization compu…

    Submitted 14 December, 2022; v1 submitted 22 November, 2022; originally announced November 2022.

  18. Review and Analysis of Pain Research Literature through Keyword Co-occurrence Networks

    Authors: Burcu Ozek, Zhenyuan Lu, Fatemeh Pouromran, Sagar Kamarthi

    Abstract: Pain is a significant public health problem as the number of individuals with a history of pain globally keeps growing. In response, many synergistic research areas have been coming together to address pain-related issues. This work conducts a review and analysis of a vast body of pain-related literature using the keyword co-occurrence network (KCN) methodology. In this method, a set of KCNs is co…

    Submitted 8 November, 2022; originally announced November 2022.

  19. arXiv:2210.08385  [pdf, other]

    stat.ME stat.AP

    A Joint Modeling Approach for Clustering Mixed-Type Multivariate Longitudinal Data: Application to the CHILD Cohort Study

    Authors: Zhiwen Tan, Chang Shen, Padmaja Subbarao, Wendy Lou, Zihang Lu

    Abstract: In epidemiological and clinical studies, identifying patients' phenotypes based on longitudinal profiles is critical to understanding the disease's developmental patterns. The current study was motivated by data from a Canadian birth cohort study, the CHILD Cohort Study. Our goal was to use multiple longitudinal respiratory traits to cluster the participants into subgroups with similar longitudina…

    Submitted 21 March, 2023; v1 submitted 15 October, 2022; originally announced October 2022.

    Comments: 21 pages, 4 figures, 2 tables

  20. arXiv:2207.05697  [pdf, ps, other]

    math.OC cs.LG math.NA stat.ML

    A Newton-CG based barrier method for finding a second-order stationary point of nonconvex conic optimization with complexity guarantees

    Authors: Chuan He, Zhaosong Lu

    Abstract: In this paper we consider finding an approximate second-order stationary point (SOSP) of nonconvex conic optimization that minimizes a twice differentiable function over the intersection of an affine subspace and a convex cone. In particular, we propose a Newton-conjugate gradient (Newton-CG) based barrier method for finding an $(ε,\sqrt{ε})$-SOSP of this problem. Our method is not only implementabl…

    Submitted 11 October, 2022; v1 submitted 12 July, 2022; originally announced July 2022.

    Comments: accepted by SIAM Journal on Optimization

    MSC Class: 49M05; 49M15; 65F10; 90C06; 90C60

  21. Optimal Parallel Sequential Change Detection under Generalized Performance Measures

    Authors: Zexian Lu, Yunxiao Chen, Xiaoou Li

    Abstract: This paper considers the detection of change points in parallel data streams, a problem widely encountered when analyzing large-scale real-time streaming data. Each stream may have its own change point, at which its data has a distributional change. With sequentially observed data, a decision maker needs to declare whether changes have already occurred to the streams at each time point. Once a stre…

    Submitted 16 June, 2022; originally announced June 2022.

  22. arXiv:2206.01209  [pdf, ps, other]

    math.OC cs.LG math.NA stat.ML

    Accelerated first-order methods for convex optimization with locally Lipschitz continuous gradient

    Authors: Zhaosong Lu, Sanyou Mei

    Abstract: In this paper we develop accelerated first-order methods for convex optimization with locally Lipschitz continuous gradient (LLCG), which is beyond the well-studied class of convex optimization with Lipschitz continuous gradient. In particular, we first consider unconstrained convex optimization with LLCG and propose accelerated proximal gradient (APG) methods for solving it. The proposed APG meth…

    Submitted 10 April, 2023; v1 submitted 2 June, 2022; originally announced June 2022.

    Comments: Accepted by SIAM Journal on Optimization

    MSC Class: 90C25; 90C30; 90C46; 49M37

  23. arXiv:2206.00973  [pdf, ps, other]

    math.OC cs.LG math.NA stat.ML

    Primal-dual extrapolation methods for monotone inclusions under local Lipschitz continuity with applications to variational inequality, conic constrained saddle point, and convex conic optimization problems

    Authors: Zhaosong Lu, Sanyou Mei

    Abstract: In this paper we consider a class of structured monotone inclusion (MI) problems that consist of finding a zero in the sum of two monotone operators, in which one is maximal monotone while another is locally Lipschitz continuous. In particular, we first propose a primal-dual extrapolation (PDE) method for solving a structured strongly MI problem by modifying the classical forward-backward splittin…

    Submitted 24 June, 2022; v1 submitted 2 June, 2022; originally announced June 2022.

    Comments: corrected some typos

    MSC Class: 47H05; 47J20; 49M29; 65K15; 90C25

  24. arXiv:2202.05683  [pdf]

    stat.CO

    Rare event estimation with sequential directional importance sampling (SDIS)

    Authors: Kai Cheng, Iason Papaioannou, Zhenzhou Lu, Xiaobo Zhang, Yanping Wang

    Abstract: In this paper, we propose a sequential directional importance sampling (SDIS) method for rare event estimation. SDIS expresses a small failure probability in terms of a sequence of auxiliary failure probabilities, defined by magnifying the input variability. The first probability in the sequence is estimated with Monte Carlo simulation in Cartesian coordinates, and all the subsequent ones are comp…

    Submitted 12 January, 2022; originally announced February 2022.

  25. arXiv:2110.13391  [pdf]

    stat.AP

    Analyzing the Data of COVID-19 with Quasi-Distribution Fitting Based on Piecewise B-spline Curves

    Authors: Qingliang Zhao, Zhenhuan Lu, Yiduo Wang

    Abstract: Facing the world wide coronavirus disease 2019 (COVID-19) pandemic, a new fitting method (QDF, quasi-distribution fitting) which could be used to analyze the data of COVID-19 is developed based on piecewise quasi-uniform B-spline curves. For any given country or district, it simulates the distribution histogram data which is made from the daily confirmed cases (or the other data including daily re…

    Submitted 25 October, 2021; originally announced October 2021.

  26. arXiv:2107.06089  [pdf, other]

    econ.EM stat.ME

    MinP Score Tests with an Inequality Constrained Parameter Space

    Authors: Giuseppe Cavaliere, Zeng-Hua Lu, Anders Rahbek, Yuhong Yang

    Abstract: Score tests have the advantage of requiring estimation alone of the model restricted by the null hypothesis, which often is much simpler than models defined under the alternative hypothesis. This is typically so when the alternative hypothesis involves inequality constraints. However, existing score tests address only jointly testing all parameters of interest; a leading example is testing all ARC…

    Submitted 13 July, 2021; originally announced July 2021.

  27. arXiv:2106.14588  [pdf, other]

    math.OC cs.LG stat.ML

    The Convergence Rate of SGD's Final Iterate: Analysis on Dimension Dependence

    Authors: Daogao Liu, Zhou Lu

    Abstract: Stochastic Gradient Descent (SGD) is among the simplest and most popular methods in optimization. The convergence rate for SGD has been extensively studied and tight analyses have been established for the running average scheme, but the sub-optimality of the final iterate is still not well-understood. shamir2013stochastic gave the best known upper bound for the final iterate of SGD minimizing non-…

    Submitted 28 June, 2021; originally announced June 2021.

  28. arXiv:2105.12921  [pdf, other]

    stat.ME

    Score test for missing at random or not

    Authors: Hairu Wang, Zhiping Lu, Yukun Liu

    Abstract: Missing data are frequently encountered in various disciplines and can be divided into three categories: missing completely at random (MCAR), missing at random (MAR) and missing not at random (MNAR). Valid statistical approaches to missing data depend crucially on correct identification of the underlying missingness mechanism. Although the problem of testing whether this mechanism is MCAR or MAR h…

    Submitted 26 May, 2021; originally announced May 2021.

    Comments: 22 pages, 4 tables, 2 figures

  29. arXiv:2103.00719  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    LocalDrop: A Hybrid Regularization for Deep Neural Networks

    Authors: Ziqing Lu, Chang Xu, Bo Du, Takashi Ishida, Lefei Zhang, Masashi Sugiyama

    Abstract: In neural networks, developing regularization algorithms to settle overfitting is one of the major study areas. We propose a new approach for the regularization of neural networks by the local Rademacher complexity called LocalDrop. A new regularization function for both fully-connected networks (FCNs) and convolutional neural networks (CNNs), including drop rates and weight matrices, has been dev…

    Submitted 28 February, 2021; originally announced March 2021.

  30. arXiv:2102.05363  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    Towards Certifying L-infinity Robustness using Neural Networks with L-inf-dist Neurons

    Authors: Bohang Zhang, Tianle Cai, Zhou Lu, Di He, Liwei Wang

    Abstract: It is well-known that standard neural networks, even with a high classification accuracy, are vulnerable to small $\ell_\infty$-norm bounded adversarial perturbations. Although many attempts have been made, most previous works either can only provide empirical verification of the defense to a particular attack method, or can only develop a certified guarantee of the model robustness in limited sce…

    Submitted 14 June, 2021; v1 submitted 10 February, 2021; originally announced February 2021.

    Comments: Appearing at International Conference on Machine Learning (ICML) 2021

  31. arXiv:2101.11286  [pdf, ps, other]

    cs.LG cs.IT stat.ML

    A Note on the Representation Power of GHHs

    Authors: Zhou Lu

    Abstract: In this note we prove a sharp lower bound on the necessary number of nestings of nested absolute-value functions of generalized hinging hyperplanes (GHH) to represent arbitrary CPWL functions. Previous upper bound states that $n+1$ nestings is sufficient for GHH to achieve universal representation power, but the corresponding lower bound was unknown. We prove that $n$ nestings is necessary for uni…

    Submitted 27 January, 2021; originally announced January 2021.

  32. arXiv:2012.13326  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    A Tight Lower Bound for Uniformly Stable Algorithms

    Authors: Qinghua Liu, Zhou Lu

    Abstract: Leveraging algorithmic stability to derive sharp generalization bounds is a classic and powerful approach in learning theory. Since Vapnik and Chervonenkis [1974] first formalized the idea for analyzing SVMs, it has been utilized to study many fundamental learning algorithms (e.g., $k$-nearest neighbors [Rogers and Wagner, 1978], stochastic gradient method [Hardt et al., 2016], linear regression […

    Submitted 24 January, 2021; v1 submitted 24 December, 2020; originally announced December 2020.

  33. arXiv:2011.10020  [pdf]

    stat.AP

    Modelling fertility potential in survivors of childhood cancer: An introduction to modern statistical and computational methods

    Authors: L. Yu, Z. Lu, P. C. Nathan, S. Mostoufi-Moab, Y. Yuan

    Abstract: Statistical and computational methods are widely used in today's scientific studies. Using a female fertility potential in childhood cancer survivors as an example, we illustrate how these methods can be used to extract insight regarding biological processes from noisy observational data in order to inform decision making. We start by contextualizing the computational methods with the working exam…

    Submitted 19 November, 2020; originally announced November 2020.

    Comments: 15 pages, 9 figures and 1 table

  34. arXiv:2010.09822  [pdf, other]

    stat.ME

    Is the new model better? One metric says yes, but the other says no. Which metric do I use?

    Authors: Qian M. Zhou, Zhe Lu, Russell J. Brooke, Melissa M Hudson, Yan Yuan

    Abstract: Incremental value (IncV) evaluates the performance change from an existing risk model to a new model. It is one of the key considerations in deciding whether a new risk model performs better than the existing one. Problems arise when different IncV metrics contradict each other. For example, compared with a prescribed-dose model, an ovarian-dose model for predicting acute ovarian failure has a sli…

    Submitted 15 December, 2020; v1 submitted 19 October, 2020; originally announced October 2020.

    Comments: 25 pages, 6 figures, 1 table. Compared to Version 1, the title and overall structure of the manuscript have been changed significantly

  35. arXiv:2006.07584  [pdf, other]

    cs.LG stat.ML

    Mean-Field Approximation to Gaussian-Softmax Integral with Application to Uncertainty Estimation

    Authors: Zhiyun Lu, Eugene Ie, Fei Sha

    Abstract: Many methods have been proposed to quantify the predictive uncertainty associated with the outputs of deep neural networks. Among them, ensemble methods often lead to state-of-the-art results, though they require modifications to the training procedures and are computationally costly for both training and inference. In this paper, we propose a new single-model based approach. The main idea is insp…

    Submitted 9 May, 2021; v1 submitted 13 June, 2020; originally announced June 2020.

  36. arXiv:2006.06455  [pdf, other]

    cs.LG cs.MA stat.ML

    Learning Individually Inferred Communication for Multi-Agent Cooperation

    Authors: Ziluo Ding, Tiejun Huang, Zongqing Lu

    Abstract: Communication lays the foundation for human cooperation. It is also crucial for multi-agent cooperation. However, existing work focuses on broadcast communication, which is not only impractical but also leads to information redundancy that could even impair the learning process. To tackle these difficulties, we propose Individually Inferred Communication (I2C), a simple yet effective model to enab…

    Submitted 28 April, 2021; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: NeurIPS 2020, oral presentation. The code is available at https://github.com/PKU-AI-Edge/I2C

  37. arXiv:2006.05842  [pdf, other]

    cs.LG cs.AI cs.MA stat.ML

    The Emergence of Individuality

    Authors: Jiechuan Jiang, Zongqing Lu

    Abstract: Individuality is essential in human society, which induces the division of labor and thus improves the efficiency and productivity. Similarly, it should also be the key to multi-agent cooperation. Inspired by that individuality is of being an individual separate from others, we propose a simple yet efficient method for the emergence of individuality (EOI) in multi-agent reinforcement learning (MAR…

    Submitted 18 October, 2021; v1 submitted 10 June, 2020; originally announced June 2020.

    Comments: The extended version of ICML 2021 paper

  38. arXiv:2004.05707  [pdf, other]

    cs.CL cs.LG stat.ML

    VGCN-BERT: Augmenting BERT with Graph Embedding for Text Classification

    Authors: Zhibin Lu, Pan Du, Jian-Yun Nie

    Abstract: Much progress has been made recently on text classification with methods based on neural networks. In particular, models using attention mechanism such as BERT have shown to have the capability of capturing the contextual information within a sentence or document. However, their ability of capturing the global information about the vocabulary of a language is more limited. This latter is the stren…

    Submitted 12 April, 2020; originally announced April 2020.

    Comments: 12 pages, 2 figures

    ACM Class: I.2.4; I.2.7

    Journal ref: in J. M. Jose et al. (Eds.): ECIR 2020, LNCS 12035, pp.369-382, 2020

  39. arXiv:2002.12641  [pdf, other]

    cs.LG stat.ML

    AdarGCN: Adaptive Aggregation GCN for Few-Shot Learning

    Authors: Jianhong Zhang, Manli Zhang, Zhiwu Lu, Tao Xiang, Jirong Wen

    Abstract: Existing few-shot learning (FSL) methods assume that there exist sufficient training samples from source classes for knowledge transfer to target classes with few training samples. However, this assumption is often invalid, especially when it comes to fine-grained recognition. In this work, we define a new FSL setting termed few-shot fewshot learning (FSFSL), under which both the source and target…

    Submitted 9 March, 2020; v1 submitted 28 February, 2020; originally announced February 2020.

    Comments: The code is at github - https://github.com/RiceZJH/AdarGCN

  40. arXiv:2002.06856  [pdf, other]

    cs.LG stat.ML

    Data and Model Dependencies of Membership Inference Attack

    Authors: Shakila Mahjabin Tonni, Dinusha Vatsalan, Farhad Farokhi, Dali Kaafar, Zhigang Lu, Gioacchino Tangari

    Abstract: Machine learning (ML) models have been shown to be vulnerable to Membership Inference Attacks (MIA), which infer the membership of a given data point in the target dataset by observing the prediction output of the ML model. While the key factors for the success of MIA have not yet been fully understood, existing defense mechanisms such as using L2 regularization \cite{10shokri2017membership} and d…

    Submitted 25 July, 2020; v1 submitted 17 February, 2020; originally announced February 2020.

  41. arXiv:2002.04274   

    cs.LG stat.ML

    Meta-Learning across Meta-Tasks for Few-Shot Learning

    Authors: Nanyi Fei, Zhiwu Lu, Yizhao Gao, Jia Tian, Tao Xiang, Ji-Rong Wen

    Abstract: Existing meta-learning based few-shot learning (FSL) methods typically adopt an episodic training strategy whereby each episode contains a meta-task. Across episodes, these tasks are sampled randomly and their relationships are ignored. In this paper, we argue that the inter-meta-task relationships should be exploited and those tasks are sampled strategically to assist in meta-learning. Specifical…

    Submitted 26 September, 2020; v1 submitted 11 February, 2020; originally announced February 2020.

    Comments: There are some mistakes in the experiments. We thus choose to withdraw this paper

  42. arXiv:2002.02050   

    cs.LG stat.ML

    Few-Shot Learning as Domain Adaptation: Algorithm and Analysis

    Authors: Jiechao Guan, Zhiwu Lu, Tao Xiang, Ji-Rong Wen

    Abstract: To recognize the unseen classes with only few samples, few-shot learning (FSL) uses prior knowledge learned from the seen classes. A major challenge for FSL is that the distribution of the unseen classes is different from that of those seen, resulting in poor generalization even when a model is meta-trained on the seen classes. This class-difference-caused distribution shift can be considered as a…

    Submitted 27 July, 2020; v1 submitted 5 February, 2020; originally announced February 2020.

    Comments: There exist some mistakes in the experiments

  43. arXiv:1911.01545  [pdf, other]

    cs.LG cs.NE stat.ML

    Compositional Generalization with Tree Stack Memory Units

    Authors: Forough Arabshahi, Zhichu Lu, Pranay Mundra, Sameer Singh, Animashree Anandkumar

    Abstract: We study compositional generalization, viz., the problem of zero-shot generalization to novel compositions of concepts in a domain. Standard neural networks fail to a large extent on compositional learning. We propose Tree Stack Memory Units (Tree-SMU) to enable strong compositional generalization. Tree-SMU is a recursive neural network with Stack Memory Units (SMUs), a novel memory augmented ne…

    Submitted 15 October, 2020; v1 submitted 4 November, 2019; originally announced November 2019.

  44. arXiv:1910.14472  [pdf, other]

    cs.LG cs.AI cs.MA stat.ML

    Learning Fairness in Multi-Agent Systems

    Authors: Jiechuan Jiang, Zongqing Lu

    Abstract: Fairness is essential for human society, contributing to stability and productivity. Similarly, fairness is also the key for many multi-agent systems. Taking fairness into multi-agent learning could help multi-agent systems become both efficient and stable. However, learning efficiency and fairness simultaneously is a complex, multi-objective, joint-policy optimization. To tackle these difficultie…

    Submitted 31 October, 2019; originally announced October 2019.

    Comments: NeurIPS'19

  45. arXiv:1909.03044  [pdf]

    cs.CL cs.LG stat.ML

    Deep learning with sentence embeddings pre-trained on biomedical corpora improves the performance of finding similar sentences in electronic medical records

    Authors: Qingyu Chen, Jingcheng Du, Sun Kim, W. John Wilbur, Zhiyong Lu

    Abstract: Capturing sentence semantics plays a vital role in a range of text mining applications. Despite continuous efforts on the development of related datasets and models in the general domain, both datasets and models are limited in biomedical and clinical domains. The BioCreative/OHNLP organizers have made the first attempt to annotate 1,068 sentence pairs from clinical notes and have called for a com…

    Submitted 6 September, 2019; originally announced September 2019.

    Comments: 15 pages, 5 figures, 2 tables

  46. arXiv:1908.08401  [pdf, ps, other]

    cs.LG cs.IT stat.ML

    A Deep Actor-Critic Reinforcement Learning Framework for Dynamic Multichannel Access

    Authors: Chen Zhong, Ziyang Lu, M. Cenk Gursoy, Senem Velipasalar

    Abstract: To make efficient use of limited spectral resources, we in this work propose a deep actor-critic reinforcement learning based framework for dynamic multichannel access. We consider both a single-user case and a scenario in which multiple users attempt to access channels simultaneously. We employ the proposed framework as a single agent in the single-user case, and extend it to a decentralized mult…

    Submitted 20 August, 2019; originally announced August 2019.

    Comments: 14 figures. arXiv admin note: text overlap with arXiv:1810.03695

  47. arXiv:1907.13177  [pdf, ps, other]

    cs.LG eess.SP stat.ML

    Towards More Accurate Automatic Sleep Staging via Deep Transfer Learning

    Authors: Huy Phan, Oliver Y. Chén, Philipp Koch, Zongqing Lu, Ian McLoughlin, Alfred Mertins, Maarten De Vos

    Abstract: Background: Despite recent significant progress in the development of automatic sleep staging methods, building a good model still remains a big challenge for sleep studies with a small cohort due to the data-variability and data-inefficiency issues. This work presents a deep transfer learning approach to overcome these issues and enable transferring knowledge from a large dataset to a small cohor…

    Submitted 27 August, 2020; v1 submitted 30 July, 2019; originally announced July 2019.

    Comments: This article has been published in IEEE Transactions on Biomedical Engineering

  48. arXiv:1907.01175  [pdf, other]

    stat.ME stat.AP

    Volatility Analysis with Realized GARCH-Ito Models

    Authors: Xinyu Song, Donggyu Kim, Huiling Yuan, Xiangyu Cui, Zhiping Lu, Yong Zhou, Yazhen Wang

    Abstract: This paper introduces a unified approach for modeling high-frequency financial data that can accommodate both the continuous-time jump-diffusion and discrete-time realized GARCH model by embedding the discrete realized GARCH structure in the continuous instantaneous volatility process. The key feature of the proposed model is that the corresponding conditional daily integrated volatility adopts an…

    Submitted 15 June, 2020; v1 submitted 2 July, 2019; originally announced July 2019.

    Comments: 39 pages, 4 tables, 3 figures

  49. arXiv:1906.08720  [pdf, other]

    cs.LG stat.ML

    Boosting for Control of Dynamical Systems

    Authors: Naman Agarwal, Nataly Brukhim, Elad Hazan, Zhou Lu

    Abstract: We study the question of how to aggregate controllers for dynamical systems in order to improve their performance. To this end, we propose a framework of boosting for online control. Our main result is an efficient boosting algorithm that combines weak controllers into a provably more accurate one. Empirical evaluation on a host of control settings supports our theoretical findings.

    Submitted 23 February, 2020; v1 submitted 20 June, 2019; originally announced June 2019.

  50. arXiv:1905.03041  [pdf, other]

    cs.SI cs.LG physics.soc-ph stat.ML

    Tag2Vec: Learning Tag Representations in Tag Networks

    Authors: Junshan Wang, Zhicong Lu, Guojie Song, Yue Fan, Lun Du, Wei Lin

    Abstract: Network embedding is a method to learn low-dimensional representation vectors for nodes in complex networks. In real networks, nodes may have multiple tags but existing methods ignore the abundant semantic and hierarchical information of tags. This information is useful to many network applications and usually very stable. In this paper, we propose a tag representation learning model, Tag2Vec, whi…

    Submitted 24 September, 2020; v1 submitted 19 April, 2019; originally announced May 2019.

    Comments: 6 pages