Showing 1–50 of 107 results for author: Shen, C

Searching in archive stat.
  1. arXiv:2405.15473  [pdf, other]

    stat.ML cs.LG cs.SI

    Encoder Embedding for General Graph and Node Classification

    Authors: Cencheng Shen

    Abstract: Graph encoder embedding, a recent technique for graph data, offers speed and scalability in producing vertex-level representations from binary graphs. In this paper, we extend the applicability of this method to a general graph model, which includes weighted graphs, distance matrices, and kernel matrices. We prove that the encoder embedding satisfies the law of large numbers and the central limit…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 26 pages

  2. arXiv:2405.12797  [pdf, other]

    cs.SI stat.ML

    Refined Graph Encoder Embedding via Self-Training and Latent Community Recovery

    Authors: Cencheng Shen, Jonathan Larson, Ha Trinh, Carey E. Priebe

    Abstract: This paper introduces a refined graph encoder embedding method, enhancing the original graph encoder embedding using linear transformation, self-training, and hidden community recovery within observed communities. We provide the theoretical rationale for the refinement procedure, demonstrating how and why our proposed method can effectively identify useful hidden communities via stochastic block m…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 12 pages main + 4 pages appendix

  3. arXiv:2402.09723  [pdf, other]

    stat.ML cs.AI cs.CL cs.LG

    Efficient Prompt Optimization Through the Lens of Best Arm Identification

    Authors: Chengshuai Shi, Kun Yang, Zihan Chen, Jundong Li, Jing Yang, Cong Shen

    Abstract: The remarkable instruction-following capability of large language models (LLMs) has sparked a growing interest in automatically finding good prompts, i.e., prompt optimization. Most existing works follow the scheme of selecting from a pre-generated pool of candidate prompts. However, these designs mainly focus on the generation strategy, while limited attention has been paid to the selection metho…

    Submitted 30 May, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

  4. arXiv:2402.08069  [pdf, other]

    stat.ME

    Interrater agreement statistics under the two-rater dichotomous-response case with correlated decisions

    Authors: Zizhong Tian, Vernon M. Chinchilli, Chan Shen, Shouhao Zhou

    Abstract: Measurement of the interrater agreement (IRA) is critical in various disciplines. To correct for potential confounding chance agreement in IRA, Cohen's kappa and many other methods have been proposed. However, owing to the varied strategies and assumptions across these methods, there is a lack of practical guidelines on how these methods should be preferred even for the common two-rater dichotomou…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: about 48 pages in total (including web appendices), 7 figures in the main paper, and 5 figures in the supplementary materials

  5. arXiv:2312.16341  [pdf, other]

    stat.ML cs.IT cs.LG cs.MA

    Harnessing the Power of Federated Learning in Federated Contextual Bandits

    Authors: Chengshuai Shi, Ruida Zhou, Kun Yang, Cong Shen

    Abstract: Federated learning (FL) has demonstrated great potential in revolutionizing distributed machine learning, and tremendous efforts have been made to extend it beyond the original focus on supervised learning. Among many directions, federated contextual bandits (FCB), a pivotal integration of FL and sequential decision-making, has garnered significant attention in recent years. Despite substantial pr…

    Submitted 26 December, 2023; originally announced December 2023.

    Comments: A preliminary version appeared in the Multi-Agent Security Workshop at NeurIPS 2023

  6. arXiv:2312.15148  [pdf, other]

    cs.LG cs.IT eess.SP stat.ML

    Personalized Federated Learning with Attention-based Client Selection

    Authors: Zihan Chen, Jundong Li, Cong Shen

    Abstract: Personalized Federated Learning (PFL) relies on collective data knowledge to build customized models. However, non-IID data between clients poses significant challenges, as collaborating with clients who have diverse data distributions can harm local model performance, especially with limited training data. To address this issue, we propose FedACS, a new PFL algorithm with an Attention-based Clien…

    Submitted 22 December, 2023; originally announced December 2023.

  7. arXiv:2311.00973  [pdf, other]

    cs.LG cs.IT stat.ML

    Federated Linear Bandits with Finite Adversarial Actions

    Authors: Li Fan, Ruida Zhou, Chao Tian, Cong Shen

    Abstract: We study a federated linear bandits model, where $M$ clients communicate with a central server to solve a linear contextual bandits problem with finite adversarial action sets that may be different across clients. To address the unique challenges of adversarial finite action sets, we propose the FedSupLinUCB algorithm, which extends the principles of SupLinUCB and OFUL algorithms in linear context…

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: Accepted to NeurIPS 2023, camera-ready version

  8. arXiv:2311.00944  [pdf, other]

    stat.ML cs.IT cs.LG math.OC

    Stochastic Smoothed Gradient Descent Ascent for Federated Minimax Optimization

    Authors: Wei Shen, Minhui Huang, Jiawei Zhang, Cong Shen

    Abstract: In recent years, federated minimax optimization has attracted growing interest due to its extensive applications in various machine learning tasks. While Smoothed Alternative Gradient Descent Ascent (Smoothed-AGDA) has proved its success in centralized nonconvex minimax optimization, how and whether smoothing technique could be helpful in federated setting remains unexplored. In this paper, we pro…

    Submitted 18 April, 2024; v1 submitted 1 November, 2023; originally announced November 2023.

  9. arXiv:2307.13868  [pdf, other]

    stat.ME cs.LG stat.ML

    Learning sources of variability from high-dimensional observational studies

    Authors: Eric W. Bridgeford, Jaewon Chung, Brian Gilbert, Sambit Panda, Adam Li, Cencheng Shen, Alexandra Badea, Brian Caffo, Joshua T. Vogelstein

    Abstract: Causal inference studies whether the presence of a variable influences an observed outcome. As measured by quantities such as the "average treatment effect," this paradigm is employed across numerous biological fields, from vaccine and drug development to policy interventions. Unfortunately, the majority of these methods are often limited to univariate outcomes. Our work generalizes causal estiman…

    Submitted 28 November, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

  10. arXiv:2307.00260  [pdf, other]

    stat.ME math.ST stat.ML

    Bootstrapping the Cross-Validation Estimate

    Authors: Bryan Cai, Fabio Pellegrini, Menglan Pang, Carl de Moor, Changyu Shen, Vivek Charu, Lu Tian

    Abstract: Cross-validation is a widely used technique for evaluating the performance of prediction models. It helps avoid the optimism bias in error estimates, which can be significant for models built using complex statistical learning algorithms. However, since the cross-validation estimate is a random value dependent on observed data, it is essential to accurately quantify the uncertainty associated with…

    Submitted 1 July, 2023; originally announced July 2023.

  11. arXiv:2306.08364  [pdf, other]

    stat.ML cs.IT cs.LG

    Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources

    Authors: Chengshuai Shi, Wei Xiong, Cong Shen, Jing Yang

    Abstract: Existing theoretical studies on offline reinforcement learning (RL) mostly consider a dataset sampled directly from the target task. In practice, however, data often come from several heterogeneous but related sources. Motivated by this gap, this work aims at rigorously understanding offline RL with multiple datasets that are collected from randomly perturbed versions of the target task instead of…

    Submitted 14 June, 2023; originally announced June 2023.

    Comments: ICML 2023

  12. arXiv:2306.08280  [pdf, other]

    cs.IT cs.CR cs.LG eess.SP stat.ML

    Differentially Private Wireless Federated Learning Using Orthogonal Sequences

    Authors: Xizixiang Wei, Tianhao Wang, Ruiquan Huang, Cong Shen, Jing Yang, H. Vincent Poor

    Abstract: We propose a privacy-preserving uplink over-the-air computation (AirComp) method, termed FLORAS, for single-input single-output (SISO) wireless federated learning (FL) systems. From the perspective of communication designs, FLORAS eliminates the requirement of channel state information at the transmitters (CSIT) by leveraging the properties of orthogonal sequences. From the privacy perspective, we…

    Submitted 21 November, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: 33 pages, 5 figures

  13. arXiv:2306.06265  [pdf, other]

    cs.LG cs.IT stat.ML

    Near-optimal Conservative Exploration in Reinforcement Learning under Episode-wise Constraints

    Authors: Donghao Li, Ruiquan Huang, Cong Shen, Jing Yang

    Abstract: This paper investigates conservative exploration in reinforcement learning where the performance of the learning agent is guaranteed to be above a certain threshold throughout the learning process. It focuses on the tabular episodic Markov Decision Process (MDP) setting that has finite states and actions. With the knowledge of an existing safe baseline policy, an algorithm termed as StepMix is pro…

    Submitted 9 June, 2023; originally announced June 2023.

    Comments: Accepted by ICML2023

  14. arXiv:2305.19947  [pdf, other]

    cs.CV cs.LG stat.ML

    A Geometric Perspective on Diffusion Models

    Authors: Defang Chen, Zhenyu Zhou, Jian-Ping Mei, Chunhua Shen, Chun Chen, Can Wang

    Abstract: Recent years have witnessed significant progress in developing effective training and fast sampling techniques for diffusion models. A remarkable advancement is the use of stochastic differential equations (SDEs) and their marginal-preserving ordinary differential equations (ODEs) to describe data perturbation and generative modeling in a unified framework. In this paper, we carefully inspect the…

    Submitted 30 September, 2023; v1 submitted 31 May, 2023; originally announced May 2023.

    Comments: 38 pages

  15. arXiv:2305.03884  [pdf, other]

    stat.ML cs.IT cs.LG eess.SP

    On High-dimensional and Low-rank Tensor Bandits

    Authors: Chengshuai Shi, Cong Shen, Nicholas D. Sidiropoulos

    Abstract: Most existing studies on linear bandits focus on the one-dimensional characterization of the overall system. While being representative, this formulation may fail to model applications with high-dimensional but favorable structures, such as the low-rank tensor representation for recommender systems. To address this limitation, this work studies a general tensor bandits model, where actions and sys…

    Submitted 5 May, 2023; originally announced May 2023.

    Comments: Accepted to the 2023 IEEE International Symposium on Information Theory (ISIT 2023)

  16. arXiv:2305.02441  [pdf, other]

    stat.ML cs.IT cs.LG eess.SP

    Reward Teaching for Federated Multi-armed Bandits

    Authors: Chengshuai Shi, Wei Xiong, Cong Shen, Jing Yang

    Abstract: Most of the existing federated multi-armed bandits (FMAB) designs are based on the presumption that clients will implement the specified design to collaborate with the server. In reality, however, it may not be possible to modify the clients' existing protocols. To address this challenge, this work focuses on clients who always maximize their individual cumulative rewards, and introduces a novel i…

    Submitted 20 November, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

    Comments: Accepted to IEEE Transactions on Signal Processing

  17. Discovering Communication Pattern Shifts in Large-Scale Labeled Networks using Encoder Embedding and Vertex Dynamics

    Authors: Cencheng Shen, Jonathan Larson, Ha Trinh, Xihan Qin, Youngser Park, Carey E. Priebe

    Abstract: Analyzing large-scale time-series network data, such as social media and email communications, poses a significant challenge in understanding social dynamics, detecting anomalies, and predicting trends. In particular, the scalability of graph analysis is a critical hurdle impeding progress in large-scale downstream inference. To address this challenge, we introduce a temporal encoder embedding met…

    Submitted 29 November, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

    Comments: 10 pages + 2 pages appendix, 8 figures

    Journal ref: IEEE Transactions on Network Science and Engineering 11(2), 2100-2109, 2024

  18. Synergistic Graph Fusion via Encoder Embedding

    Authors: Cencheng Shen, Carey E. Priebe, Jonathan Larson, Ha Trinh

    Abstract: In this paper, we introduce a method called graph fusion embedding, designed for multi-graph embedding with shared vertex sets. Under the framework of supervised learning, our method exhibits a remarkable and highly desirable synergistic effect: for sufficiently large vertex size, the accuracy of vertex classification consistently benefits from the incorporation of additional graphs. We establish…

    Submitted 5 June, 2024; v1 submitted 31 March, 2023; originally announced March 2023.

    Comments: 19 pages main + 11 pages appendix

    Journal ref: Information Sciences 678, 120912, 2024

  19. arXiv:2302.05599  [pdf, ps, other]

    cs.IT cs.LG eess.SP stat.ML

    Communication and Storage Efficient Federated Split Learning

    Authors: Yujia Mu, Cong Shen

    Abstract: Federated learning (FL) is a popular distributed machine learning (ML) paradigm, but is often limited by significant communication costs and edge device computation capabilities. Federated Split Learning (FSL) preserves the parallel model training principle of FL, with a reduced device computation requirement thanks to splitting the ML model between the server and clients. However, FSL still incur…

    Submitted 10 February, 2023; originally announced February 2023.

  20. arXiv:2301.11290  [pdf, other]

    cs.SI cs.LG stat.ML

    Graph Encoder Ensemble for Simultaneous Vertex Embedding and Community Detection

    Authors: Cencheng Shen, Youngser Park, Carey E. Priebe

    Abstract: In this paper, we introduce a novel and computationally efficient method for vertex embedding, community detection, and community size determination. Our approach leverages a normalized one-hot graph encoder and a rank-based cluster size measure. Through extensive simulations, we demonstrate the excellent numerical performance of our proposed graph encoder ensemble algorithm.

    Submitted 18 November, 2023; v1 submitted 18 January, 2023; originally announced January 2023.

    Comments: 8 pages

    Journal ref: in Proceedings of 2023 2nd International Conference on Algorithms, Data Mining, and Information Technology, 13-18, ACM, 2023

  21. arXiv:2211.00463  [pdf, other]

    cs.CR stat.ML

    Amplifying Membership Exposure via Data Poisoning

    Authors: Yufei Chen, Chao Shen, Yun Shen, Cong Wang, Yang Zhang

    Abstract: As in-the-wild data are increasingly involved in the training stage, machine learning applications become more susceptible to data poisoning attacks. Such attacks typically lead to test-time accuracy degradation or controlled misprediction. In this paper, we investigate the third type of exploitation of data poisoning - increasing the risks of privacy leakage of benign training samples. To this en…

    Submitted 1 November, 2022; originally announced November 2022.

    Comments: To Appear in the 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

  22. arXiv:2210.09881  [pdf, other]

    cs.IT cs.LG eess.SP stat.ML

    Random Orthogonalization for Federated Learning in Massive MIMO Systems

    Authors: Xizixiang Wei, Cong Shen, Jing Yang, H. Vincent Poor

    Abstract: We propose a novel communication design, termed random orthogonalization, for federated learning (FL) in a massive multiple-input and multiple-output (MIMO) wireless system. The key novelty of random orthogonalization comes from the tight coupling of FL and two unique characteristics of massive MIMO -- channel hardening and favorable propagation. As a result, random orthogonalization can achieve n…

    Submitted 18 October, 2022; originally announced October 2022.

    Comments: 31 pages, 7 figures, submitted for possible journal publication

  23. arXiv:2210.08385  [pdf, other]

    stat.ME stat.AP

    A Joint Modeling Approach for Clustering Mixed-Type Multivariate Longitudinal Data: Application to the CHILD Cohort Study

    Authors: Zhiwen Tan, Chang Shen, Padmaja Subbarao, Wendy Lou, Zihang Lu

    Abstract: In epidemiological and clinical studies, identifying patients' phenotypes based on longitudinal profiles is critical to understanding the disease's developmental patterns. The current study was motivated by data from a Canadian birth cohort study, the CHILD Cohort Study. Our goal was to use multiple longitudinal respiratory traits to cluster the participants into subgroups with similar longitudina…

    Submitted 21 March, 2023; v1 submitted 15 October, 2022; originally announced October 2022.

    Comments: 21 pages, 4 figures, 2 tables

  24. arXiv:2209.04968  [pdf, other]

    stat.CO

    Population-Based Hierarchical Non-negative Matrix Factorization for Survey Data

    Authors: Xiaofu Ding, Xinyu Dong, Olivia McGough, Chenxin Shen, Annie Ulichney, Ruiyao Xu, William Swartworth, Jocelyn T. Chi, Deanna Needell

    Abstract: Motivated by the problem of identifying potential hierarchical population structure on modern survey data containing a wide range of complex data types, we introduce population-based hierarchical non-negative matrix factorization (PHNMF). PHNMF is a variant of hierarchical non-negative matrix factorization based on feature similarity. As such, it enables an automatic and interpretable approach for…

    Submitted 11 September, 2022; originally announced September 2022.

  25. arXiv:2205.15512  [pdf, ps, other]

    cs.LG cs.GT stat.ML

    Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game

    Authors: Wei Xiong, Han Zhong, Chengshuai Shi, Cong Shen, Liwei Wang, Tong Zhang

    Abstract: Offline reinforcement learning (RL) aims at learning an optimal strategy using a pre-collected dataset without further interactions with the environment. While various algorithms have been proposed for offline RL in the previous literature, the minimax optimality has only been (nearly) established for tabular Markov decision processes (MDPs). In this paper, we focus on offline RL with linear funct…

    Submitted 1 March, 2023; v1 submitted 30 May, 2022; originally announced May 2022.

  26. arXiv:2111.07608  [pdf, other]

    cs.CR cs.AI cs.LG stat.ML

    Property Inference Attacks Against GANs

    Authors: Junhao Zhou, Yufei Chen, Chao Shen, Yang Zhang

    Abstract: While machine learning (ML) has made tremendous progress during the past decade, recent research has shown that ML models are vulnerable to various security and privacy attacks. So far, most of the attacks in this field focus on discriminative models, represented by classifiers. Meanwhile, little attention has been paid to the security and privacy risks of generative models, such as generative adv…

    Submitted 15 November, 2021; originally announced November 2021.

    Comments: To Appear in NDSS 2022

  27. arXiv:2110.14628  [pdf, ps, other]

    stat.ML cs.IT cs.LG

    (Almost) Free Incentivized Exploration from Decentralized Learning Agents

    Authors: Chengshuai Shi, Haifeng Xu, Wei Xiong, Cong Shen

    Abstract: Incentivized exploration in multi-armed bandits (MAB) has witnessed increasing interests and many progresses in recent years, where a principal offers bonuses to agents to do explorations on her behalf. However, almost all existing studies are confined to temporary myopic agents. In this work, we break this barrier and study incentivized exploration with multiple and long-term strategic agents, wh…

    Submitted 27 October, 2021; originally announced October 2021.

    Comments: Accepted to NeurIPS 2021, camera-ready version

  28. arXiv:2110.14622  [pdf, ps, other]

    stat.ML cs.IT cs.LG

    Heterogeneous Multi-player Multi-armed Bandits: Closing the Gap and Generalization

    Authors: Chengshuai Shi, Wei Xiong, Cong Shen, Jing Yang

    Abstract: Despite the significant interests and many progresses in decentralized multi-player multi-armed bandits (MP-MAB) problems in recent years, the regret gap to the natural centralized lower bound in the heterogeneous MP-MAB setting remains open. In this paper, we propose BEACON -- Batched Exploration with Adaptive COmmunicatioN -- that closes this gap. BEACON accomplishes this goal with novel contrib…

    Submitted 29 October, 2021; v1 submitted 27 October, 2021; originally announced October 2021.

    Comments: Accepted to NeurIPS 2021, camera-ready version

  29. arXiv:2110.14177  [pdf, other]

    stat.ML cs.IT cs.LG

    Federated Linear Contextual Bandits

    Authors: Ruiquan Huang, Weiqiang Wu, Jing Yang, Cong Shen

    Abstract: This paper presents a novel federated linear contextual bandits model, where individual clients face different $K$-armed stochastic bandits coupled through common global parameters. By leveraging the geometric structure of the linear rewards, a collaborative algorithm called Fed-PE is proposed to cope with the heterogeneity across clients without exchanging local feature vectors or raw data. Fed-P…

    Submitted 27 October, 2021; originally announced October 2021.

  30. One-Hot Graph Encoder Embedding

    Authors: Cencheng Shen, Qizhe Wang, Carey E. Priebe

    Abstract: In this paper we propose a lightning fast graph embedding method called one-hot graph encoder embedding. It has a linear computational complexity and the capacity to process billions of edges within minutes on standard PC -- making it an ideal candidate for huge graph processing. It is applicable to either adjacency matrix or graph Laplacian, and can be viewed as a transformation of the spectral e…

    Submitted 1 December, 2022; v1 submitted 27 September, 2021; originally announced September 2021.

    Comments: 7 pages main + 7 pages appendix

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence 45(6), 7933 - 7938, 2023

  31. arXiv:2106.13669  [pdf, other]

    cs.IT cs.LG eess.SP stat.ML

    Multi-player Multi-armed Bandits with Collision-Dependent Reward Distributions

    Authors: Chengshuai Shi, Cong Shen

    Abstract: We study a new stochastic multi-player multi-armed bandits (MP-MAB) problem, where the reward distribution changes if a collision occurs on the arm. Existing literature always assumes a zero reward for involved players if collision happens, but for applications such as cognitive radio, the more realistic scenario is that collision reduces the mean reward but not necessarily to zero. We focus on th…

    Submitted 25 June, 2021; originally announced June 2021.

    Comments: 17 pages, 14 figures. Accepted to IEEE Transactions on Signal Processing

  32. arXiv:2106.06918  [pdf, ps, other]

    stat.ME q-bio.PE

    A Phylogenetic Trees Analysis of SARS-CoV-2

    Authors: Chen Shen, Vic Patrangenaru, Roland Moore

    Abstract: One regards spaces of trees as stratified spaces, to study distributions of phylogenetic trees. Stratified spaces may have cycles; however, spaces of trees with a fixed number of leafs are contractible. Spaces of trees with three leafs, in particular, are spiders with three legs. One gives an elementary proof of the stickiness of intrinsic sample means on spiders. One also represents four leaf…

    Submitted 14 June, 2021; v1 submitted 13 June, 2021; originally announced June 2021.

    Comments: 22 pages, 16 figures

    MSC Class: 62R30

  33. arXiv:2104.02320  [pdf, other]

    physics.soc-ph stat.ME

    Inferring Network Structures via Signal Lasso

    Authors: Lei Shi, Chen Shen, Libin Jin, Qi Shi, Zhen Wang, Marko Jusup, Stefano Boccaletti

    Abstract: Inferring the connectivity structure of networked systems from data is an extremely important task in many areas of science. Most of real-world networks exhibit sparsely connected topologies, with links between nodes that in some cases may be even associated to a binary state (0 or 1, denoting respectively the absence or the existence of a connection). Such un-weighted topologies are elusive to cl…

    Submitted 1 June, 2022; v1 submitted 6 April, 2021; originally announced April 2021.

    Comments: 11 pages, 6 figures, 3 tables

    MSC Class: 62Jxx; 91-XX; 68T09

    Journal ref: Phys. Rev. Research 3, 043210(2021)

  34. arXiv:2102.13101  [pdf, other]

    cs.LG cs.IT stat.ML

    Federated Multi-armed Bandits with Personalization

    Authors: Chengshuai Shi, Cong Shen, Jing Yang

    Abstract: A general framework of personalized federated multi-armed bandits (PF-MAB) is proposed, which is a new bandit paradigm analogous to the federated learning (FL) framework in supervised learning and enjoys the features of FL with personalization. Under the PF-MAB framework, a mixed bandit learning problem that flexibly balances generalization and personalization is studied. A lower bound analysis fo…

    Submitted 25 February, 2021; originally announced February 2021.

    Comments: Accepted to AISTATS 2021, oral presentation

  35. arXiv:2101.12204  [pdf, other]

    cs.LG cs.IT cs.MA stat.ML

    Federated Multi-Armed Bandits

    Authors: Chengshuai Shi, Cong Shen

    Abstract: Federated multi-armed bandits (FMAB) is a new bandit paradigm that parallels the federated learning (FL) framework in supervised learning. It is inspired by practical applications in cognitive radio and recommender systems, and enjoys features that are analogous to FL. This paper proposes a general framework of FMAB and then studies two specific federated bandit models. We first study the approxim…

    Submitted 3 March, 2021; v1 submitted 28 January, 2021; originally announced January 2021.

    Comments: AAAI 2021, Camera Ready. Code is available at: https://github.com/ShenGroup/FMAB

  36. arXiv:2101.10998  [pdf, ps, other]

    cs.LG stat.AP stat.ML

    SDF-Bayes: Cautious Optimism in Safe Dose-Finding Clinical Trials with Drug Combinations and Heterogeneous Patient Groups

    Authors: Hyun-Suk Lee, Cong Shen, William Zame, Jang-Won Lee, Mihaela van der Schaar

    Abstract: Phase I clinical trials are designed to test the safety (non-toxicity) of drugs and find the maximum tolerated dose (MTD). This task becomes significantly more challenging when multiple-drug dose-combinations (DC) are involved, due to the inherent conflict between the exponentially increasing DC candidates and the limited patient budget. This paper proposes a novel Bayesian design, SDF-Bayes, for…

    Submitted 26 January, 2021; originally announced January 2021.

    Comments: Accepted to AISTATS 2021

  37. arXiv:2101.07934  [pdf, other]

    stat.ME stat.AP

    Meta-analysis of Censored Adverse Events

    Authors: Xinyue Qi, Shouhao Zhou, Christine B. Peterson, Yucai Wang, Xinying Fang, Michael L. Wang, Chan Shen

    Abstract: Meta-analysis is a powerful tool for assessing drug safety by combining treatment-related toxicological findings across multiple studies, as clinical trials are typically underpowered for detecting adverse drug effects. However, incomplete reporting of adverse events (AEs) in published clinical studies is a frequent issue, especially if the observed number of AEs is below a pre-specified study-dep…

    Submitted 8 February, 2024; v1 submitted 19 January, 2021; originally announced January 2021.

  38. arXiv:2101.07771  [pdf, other]

    stat.AP

    Critical Risk Indicators (CRIs) for the electric power grid: A survey and discussion of interconnected effects

    Authors: Judy P. Che-Castaldo, Rémi Cousin, Stefani Daryanto, Grace Deng, Mei-Ling E. Feng, Rajesh K. Gupta, Dezhi Hong, Ryan M. McGranaghan, Olukunle O. Owolabi, Tianyi Qu, Wei Ren, Toryn L. J. Schafer, Ashutosh Sharma, Chaopeng Shen, Mila Getmansky Sherman, Deborah A. Sunter, Lan Wang, David S. Matteson

    Abstract: The electric power grid is a critical societal resource connecting multiple infrastructural domains such as agriculture, transportation, and manufacturing. The electrical grid as an infrastructure is shaped by human activity and public policy in terms of demand and supply requirements. Further, the grid is subject to changes and stresses due to solar weather, climate, hydrology, and ecology. The e…

    Submitted 9 June, 2021; v1 submitted 19 January, 2021; originally announced January 2021.

  39. The data synergy effects of time-series deep learning models in hydrology

    Authors: Kuai Fang, Daniel Kifer, Kathryn Lawson, Dapeng Feng, Chaopeng Shen

    Abstract: When fitting statistical models to variables in geoscientific disciplines such as hydrology, it is a customary practice to regionalize - to divide a large spatial domain into multiple regions and study each region separately - instead of fitting a single model on the entire data (also known as unification). Traditional wisdom in these fields suggests that models built for each region separately wi…

    Submitted 6 January, 2021; originally announced January 2021.

    Journal ref: Water Resources Research, 2022

  40. arXiv:2012.15005  [pdf, other]

    cs.LG stat.ML

    Infer-AVAE: An Attribute Inference Model Based on Adversarial Variational Autoencoder

    Authors: Yadong Zhou, Zhihao Ding, Xiaoming Liu, Chao Shen, Lingling Tong, Xiaohong Guan

    Abstract: User attributes, such as gender and education, face severe incompleteness in social networks. In order to make this kind of valuable data usable for downstream tasks like user profiling and personalized recommendation, attribute inference aims to infer users' missing attribute labels based on observed data. Recently, variational autoencoder (VAE), an end-to-end deep generative model, has shown pro…

    Submitted 29 May, 2021; v1 submitted 29 December, 2020; originally announced December 2020.

  41. arXiv:2011.11124  [pdf, other]

    cs.LG stat.ML

    Uncorrelated Semi-paired Subspace Learning

    Authors: Li Wang, Lei-Hong Zhang, Chungen Shen, Ren-Cang Li

    Abstract: Multi-view datasets are increasingly collected in many real-world applications, and we have seen better learning performance by existing multi-view learning methods than by conventional single-view learning methods applied to each view individually. But, most of these multi-view learning methods are built on the assumption that at each instance no view is missing and all data points from all views…

    Submitted 22 November, 2020; originally announced November 2020.

  42. arXiv:2011.01090  [pdf, ps, other]

    cs.IT cs.LG stat.ML

    On No-Sensing Adversarial Multi-player Multi-armed Bandits with Collision Communications

    Authors: Chengshuai Shi, Cong Shen

    Abstract: We study the notoriously difficult no-sensing adversarial multi-player multi-armed bandits (MP-MAB) problem from a new perspective. Instead of focusing on the hardness of multiple players, we introduce a new dimension of hardness, called attackability. All adversaries can be categorized based on the attackability and we introduce Adversary-Adaptive Collision-Communication (A2C2), a family of algor…

    Submitted 24 April, 2021; v1 submitted 2 November, 2020; originally announced November 2020.

    Comments: 19 pages, 8 figures. Accepted to IEEE Journal on Selected Areas in Information Theory, Special Issue on Sequential, Active, and Reinforcement Learning

  43. A Simple Spectral Failure Mode for Graph Convolutional Networks

    Authors: Carey E. Priebe, Cencheng Shen, Ningyuan Huang, Tianyi Chen

    Abstract: Neural networks have achieved remarkable successes in machine learning tasks. This has recently been extended to graph learning using neural networks. However, there is limited theoretical work in understanding how and when they perform well, especially relative to established statistical learning techniques such as spectral embedding. In this short paper, we present a simple generative model wher…

    Submitted 11 August, 2021; v1 submitted 25 October, 2020; originally announced October 2020.

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence 44(11), 8689-8693, 2022

  44. arXiv:2010.01632  [pdf, other]

    cs.LG math.NA stat.ML

    Orthogonal Multi-view Analysis by Successive Approximations via Eigenvectors

    Authors: Li Wang, Leihong Zhang, Chungen Shen, Ren-cang Li

    Abstract: We propose a unified framework for multi-view subspace learning to learn individual orthogonal projections for all views. The framework integrates the correlations within multiple views, supervised discriminant capacity, and distance preservation in a concise and compact way. It not only includes several existing models as special cases, but also inspires new novel models. To demonstrate its versa…

    Submitted 4 October, 2020; originally announced October 2020.

  45. arXiv:2009.06847  [pdf, other]

    cs.LG cs.AI cs.CR cs.CV stat.ML

    Toward Deep Supervised Anomaly Detection: Reinforcement Learning from Partially Labeled Anomaly Data

    Authors: Guansong Pang, Anton van den Hengel, Chunhua Shen, Longbing Cao

    Abstract: We consider the problem of anomaly detection with a small set of partially labeled anomaly examples and a large-scale unlabeled dataset. This is a common scenario in many important applications. Existing related methods either exclusively fit the limited anomaly examples that typically do not span the entire set of anomalies, or proceed with unsupervised learning from the unlabeled data. We propos…

    Submitted 10 June, 2021; v1 submitted 14 September, 2020; originally announced September 2020.

    Comments: Accepted to KDD 2021

  46. arXiv:2008.00942  [pdf, other]

    cs.LG cs.CV eess.IV stat.ML

    Improving Generative Adversarial Networks with Local Coordinate Coding

    Authors: Jiezhang Cao, Yong Guo, Qingyao Wu, Chunhua Shen, Junzhou Huang, Mingkui Tan

    Abstract: Generative adversarial networks (GANs) have shown remarkable success in generating realistic data from some predefined prior distribution (e.g., Gaussian noises). However, such prior distribution is often independent of real data and thus may lose semantic information (e.g., geometric structure or content in images) of data. In practice, the semantic information might be represented by some latent…

    Submitted 28 July, 2020; originally announced August 2020.

    Comments: 20 pages, 5 figures

  47. arXiv:2007.02500  [pdf, other]

    cs.LG cs.CV stat.ML

    Deep Learning for Anomaly Detection: A Review

    Authors: Guansong Pang, Chunhua Shen, Longbing Cao, Anton van den Hengel

    Abstract: Anomaly detection, a.k.a. outlier detection or novelty detection, has been a lasting yet active research area in various research communities for several decades. There are still some unique problem complexities and challenges that require advanced approaches. In recent years, deep learning enabled anomaly detection, i.e., deep anomaly detection, has emerged as a critical direction. This paper sur…

    Submitted 4 December, 2020; v1 submitted 5 July, 2020; originally announced July 2020.

    Comments: Survey paper, 36 pages, 180 references, 2 figures, 4 tables

    Journal ref: ACM Computing Surveys, 2020

  48. arXiv:2006.14871  [pdf, other]

    cs.LG stat.ML

    Can We Mitigate Backdoor Attack Using Adversarial Detection Methods?

    Authors: Kaidi Jin, Tianwei Zhang, Chao Shen, Yufei Chen, Ming Fan, Chenhao Lin, Ting Liu

    Abstract: Deep Neural Networks are well known to be vulnerable to adversarial attacks and backdoor attacks, where minor modifications on the input are able to mislead the models to give wrong results. Although defenses against adversarial attacks have been widely studied, investigation on mitigating backdoor attacks is still at an early stage. It is unknown whether there are any connections and common chara…

    Submitted 28 July, 2022; v1 submitted 26 June, 2020; originally announced June 2020.

    Comments: Accepted by IEEE TDSC

  49. arXiv:2006.10932  [pdf, other]

    cs.IR cs.LG stat.ML

    Convolutional Gaussian Embeddings for Personalized Recommendation with Uncertainty

    Authors: Junyang Jiang, Deqing Yang, Yanghua Xiao, Chenlu Shen

    Abstract: Most of existing embedding based recommendation models use embeddings (vectors) corresponding to a single fixed point in low-dimensional space, to represent users and items. Such embeddings fail to precisely represent the users/items with uncertainty often observed in recommender systems. Addressing this problem, we propose a unified deep recommendation framework employing Gaussian embeddings, whi…

    Submitted 18 June, 2020; originally announced June 2020.

    Journal ref: IJCAI 2019

  50. arXiv:2006.07917  [pdf, ps, other]

    stat.ML cs.LG

    Robust Recursive Partitioning for Heterogeneous Treatment Effects with Uncertainty Quantification

    Authors: Hyun-Suk Lee, Yao Zhang, William Zame, Cong Shen, Jang-Won Lee, Mihaela van der Schaar

    Abstract: Subgroup analysis of treatment effects plays an important role in applications from medicine to public policy to recommender systems. It allows physicians (for example) to identify groups of patients for whom a given drug or treatment is likely to be effective and groups of patients for which it is not. Most of the current methods of subgroup analysis begin with a particular algorithm for estimati…

    Submitted 17 October, 2020; v1 submitted 14 June, 2020; originally announced June 2020.

    Comments: 19 pages, 7 figures, NeurIPS 2020