Showing 1–50 of 54 results for author: Zheng, C

Searching in archive stat.
  1. arXiv:2406.11490  [pdf, other]

    cs.LG stat.ME

    Interventional Imbalanced Multi-Modal Representation Learning via $β$-Generalization Front-Door Criterion

    Authors: Yi Li, Jiangmeng Li, Fei Song, Qingmeng Zhu, Changwen Zheng, Wenwen Qiang

    Abstract: Multi-modal methods establish comprehensive superiority over uni-modal methods. However, the imbalanced contributions of different modalities to task-dependent predictions constantly degrade the discriminative performance of canonical multi-modal methods. Based on the contribution to task-dependent predictions, modalities can be identified as predominant and auxiliary modalities. Benchmark methods…

    Submitted 17 June, 2024; originally announced June 2024.

  2. arXiv:2406.08709  [pdf, other]

    cs.LG stat.ME

    Introducing Diminutive Causal Structure into Graph Representation Learning

    Authors: Hang Gao, Peng Qiao, Yifan Jin, Fengge Wu, Jiangmeng Li, Changwen Zheng

    Abstract: When engaging in end-to-end graph representation learning with Graph Neural Networks (GNNs), the intricate causal relationships and rules inherent in graph data pose a formidable challenge for the model in accurately capturing authentic data relationships. A proposed mitigating strategy involves the direct integration of rules or relationships corresponding to the graph data into the model. Howeve…

    Submitted 12 June, 2024; originally announced June 2024.

  3. arXiv:2405.16845  [pdf, other]

    cs.LG cs.CL stat.ML

    On Mesa-Optimization in Autoregressively Trained Transformers: Emergence and Capability

    Authors: Chenyu Zheng, Wei Huang, Rongzhen Wang, Guoqiang Wu, Jun Zhu, Chongxuan Li

    Abstract: Autoregressively trained transformers have brought a profound revolution to the world, especially with their in-context learning (ICL) ability to address downstream tasks. Recently, several studies suggest that transformers learn a mesa-optimizer during autoregressive (AR) pretraining to implement ICL. Namely, the forward pass of the trained transformer is equivalent to optimizing an inner objecti…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 37 pages

  4. arXiv:2404.19620  [pdf, other]

    cs.LG cs.IR stat.ML

    Be Aware of the Neighborhood Effect: Modeling Selection Bias under Interference

    Authors: Haoxuan Li, Chunyuan Zheng, Sihao Ding, Peng Wu, Zhi Geng, Fuli Feng, Xiangnan He

    Abstract: Selection bias in recommender system arises from the recommendation process of system filtering and the interactive process of user selection. Many previous studies have focused on addressing selection bias to achieve unbiased learning of the prediction model, but ignore the fact that potential outcomes for a given user-item pair may vary with the treatments assigned to other user-item pairs, name…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: ICLR 24

  5. arXiv:2312.09613  [pdf, other]

    cs.LG cs.AI stat.ML

    Rethinking Causal Relationships Learning in Graph Neural Networks

    Authors: Hang Gao, Chengyu Yao, Jiangmeng Li, Lingyu Si, Yifan Jin, Fengge Wu, Changwen Zheng, Huaping Liu

    Abstract: Graph Neural Networks (GNNs) demonstrate their significance by effectively modeling complex interrelationships within graph-structured data. To enhance the credibility and robustness of GNNs, it becomes exceptionally crucial to bolster their ability to capture causal relationships. However, despite recent advancements that have indeed strengthened GNNs from a causal learning perspective, conductin…

    Submitted 15 December, 2023; originally announced December 2023.

  6. arXiv:2312.05771  [pdf, other]

    cs.LG stat.ML

    Hacking Task Confounder in Meta-Learning

    Authors: Jingyao Wang, Yi Ren, Zeen Song, Jianqi Zhang, Changwen Zheng, Wenwen Qiang

    Abstract: Meta-learning enables rapid generalization to new tasks by learning knowledge from various tasks. It is intuitively assumed that as the training progresses, a model will acquire richer knowledge, leading to better generalization performance. However, our experiments reveal an unexpected result: there is negative knowledge transfer between tasks, affecting generalization performance. To explain thi…

    Submitted 29 May, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

    Comments: Accepted by IJCAI 2024, 9 pages, 5 figures, 4 tables

  7. arXiv:2310.06696  [pdf, ps, other]

    stat.ME

    Variable selection with FDR control for noisy data -- an application to screening metabolites that are associated with breast and colorectal cancer

    Authors: Runqiu Wang, Ran Dai, Ying Huang, Marian L. Neuhouser, Johanna W. Lampe, Daniel Raftery, Fred K. Tabung, Cheng Zheng

    Abstract: The rapidly expanding field of metabolomics presents an invaluable resource for understanding the associations between metabolites and various diseases. However, the high dimensionality, presence of missing values, and measurement errors associated with metabolomics data can present challenges in developing reliable and reproducible methodologies for disease association studies. Therefore, there i…

    Submitted 10 October, 2023; originally announced October 2023.

  8. arXiv:2305.17476  [pdf, other]

    cs.LG stat.ML

    Toward Understanding Generative Data Augmentation

    Authors: Chenyu Zheng, Guoqiang Wu, Chongxuan Li

    Abstract: Generative data augmentation, which scales datasets by obtaining fake labeled examples from a trained conditional generative model, boosts classification performance in various learning tasks including (semi-)supervised learning, few-shot learning, and adversarially robust learning. However, little work has theoretically investigated the effect of generative data augmentation. To fill this gap, we…

    Submitted 27 May, 2023; originally announced May 2023.

    Comments: 39 pages

  9. arXiv:2303.02566  [pdf, other]

    stat.ML cs.LG stat.CO

    MFAI: A Scalable Bayesian Matrix Factorization Approach to Leveraging Auxiliary Information

    Authors: Zhiwei Wang, Fa Zhang, Cong Zheng, Xianghong Hu, Mingxuan Cai, Can Yang

    Abstract: In various practical situations, matrix factorization methods suffer from poor data quality, such as high data sparsity and low signal-to-noise ratio (SNR). Here, we consider a matrix factorization problem by utilizing auxiliary information, which is massively available in real-world applications, to overcome the challenges caused by poor data quality. Unlike existing methods that mainly rely on s…

    Submitted 12 February, 2024; v1 submitted 4 March, 2023; originally announced March 2023.

  10. arXiv:2303.01599  [pdf, other]

    stat.ME stat.AP

    Controlling FDR in selecting group-level simultaneous signals from multiple data sources with application to the National Covid Collaborative Cohort data

    Authors: Runqiu Wang, Ran Dai, Cheng Zheng

    Abstract: One challenge in exploratory association studies using observational data is that the signals are potentially weak and the features have complex correlation structures. False discovery rate (FDR) controlling procedures can provide important statistical guarantees for replicability in risk factor identification in exploratory research. In the recently established National COVID Collaborative Cohort…

    Submitted 2 March, 2023; originally announced March 2023.

  11. arXiv:2302.02334  [pdf, other]

    cs.LG cs.AI stat.ML

    Revisiting Discriminative vs. Generative Classifiers: Theory and Implications

    Authors: Chenyu Zheng, Guoqiang Wu, Fan Bao, Yue Cao, Chongxuan Li, Jun Zhu

    Abstract: A large-scale deep model pre-trained on massive labeled or unlabeled data transfers well to downstream tasks. Linear evaluation freezes parameters in the pre-trained model and trains a linear classifier separately, which is efficient and attractive for transfer. However, little work has investigated the classifier in linear evaluation except for the default logistic regression. Inspired by the sta…

    Submitted 29 May, 2023; v1 submitted 5 February, 2023; originally announced February 2023.

    Comments: Accepted by ICML 2023, 58 pages

  12. arXiv:2211.09295  [pdf, other]

    stat.ML cs.LG

    Testing for context-dependent changes in neural encoding in naturalistic experiments

    Authors: Yenho Chen, Carl W. Harris, Xiaoyu Ma, Zheng Li, Francisco Pereira, Charles Y. Zheng

    Abstract: We propose a decoding-based approach to detect context effects on neural codes in longitudinal neural recording data. The approach is agnostic to how information is encoded in neural activity, and can control for a variety of possible confounding factors present in the data. We demonstrate our approach by determining whether it is possible to decode location encoding from prefrontal cortex in the…

    Submitted 16 November, 2022; originally announced November 2022.

    Comments: 39 pages, 13 figures

  13. arXiv:2208.08584  [pdf, other]

    cs.LG stat.ME

    Robust Causal Graph Representation Learning against Confounding Effects

    Authors: Hang Gao, Jiangmeng Li, Wenwen Qiang, Lingyu Si, Bing Xu, Changwen Zheng, Fuchun Sun

    Abstract: The prevailing graph neural network models have achieved significant progress in graph representation learning. However, in this paper, we uncover an ever-overlooked phenomenon: the pre-trained graph representation learning model tested with full graphs underperforms the model tested with well-pruned graphs. This observation reveals that there exist confounders in graphs, which may interfere with…

    Submitted 10 February, 2023; v1 submitted 17 August, 2022; originally announced August 2022.

    Comments: Accepted by AAAI 2023 as Oral Presentation

  14. arXiv:2206.05216  [pdf, other]

    stat.ME stat.AP

    Quantification of follow-up time in oncology clinical trials with a time-to-event endpoint: Asking the right questions

    Authors: Kaspar Rufibach, Lynda Grinsted, Jiang Li, Hans-Jochen Weber, Cheng Zheng, Jiangxiu Zhou

    Abstract: For the analysis of a time-to-event endpoint in a single-arm or randomized clinical trial it is generally perceived that interpretation of a given estimate of the survival function, or the comparison between two groups, hinges on some quantification of the amount of follow-up. Typically, a median of some loosely defined quantity is reported. However, whatever median is reported, is typically not a…

    Submitted 13 March, 2023; v1 submitted 10 June, 2022; originally announced June 2022.

    Comments: 31 pages

    MSC Class: 62N02

  15. arXiv:2205.04701  [pdf, other]

    cs.LG stat.ML

    StableDR: Stabilized Doubly Robust Learning for Recommendation on Data Missing Not at Random

    Authors: Haoxuan Li, Chunyuan Zheng, Peng Wu

    Abstract: In recommender systems, users always choose the favorite items to rate, which leads to data missing not at random and poses a great challenge for unbiased evaluation and learning of prediction models. Currently, the doubly robust (DR) methods have been widely studied and demonstrate superior performance. However, in this paper, we show that DR methods are unstable and have unbounded bias, variance…

    Submitted 23 August, 2023; v1 submitted 10 May, 2022; originally announced May 2022.

    Comments: ICLR 23

  16. arXiv:2205.00756  [pdf, other]

    cs.LG stat.AP stat.ML

    VICE: Variational Interpretable Concept Embeddings

    Authors: Lukas Muttenthaler, Charles Y. Zheng, Patrick McClure, Robert A. Vandermeulen, Martin N. Hebart, Francisco Pereira

    Abstract: A central goal in the cognitive sciences is the development of numerical models for mental representations of object concepts. This paper introduces Variational Interpretable Concept Embeddings (VICE), an approximate Bayesian method for embedding object concepts in a vector space using data collected from humans in a triplet odd-one-out task. VICE uses variational inference to obtain sparse, non-n…

    Submitted 6 October, 2022; v1 submitted 2 May, 2022; originally announced May 2022.

    Comments: Accepted at NeurIPS 2022

  17. arXiv:2203.10975  [pdf, other]

    stat.ML cs.LG stat.AP stat.ME

    GCF: Generalized Causal Forest for Heterogeneous Treatment Effect Estimation in Online Marketplace

    Authors: Shu Wan, Chen Zheng, Zhonggen Sun, Mengfan Xu, Xiaoqing Yang, Hongtu Zhu, Jiecheng Guo

    Abstract: Uplift modeling is a rapidly growing approach that utilizes causal inference and machine learning methods to directly estimate the heterogeneous treatment effects, which has been widely applied to various online marketplaces to assist large-scale decision-making in recent years. The existing popular models, like causal forest (CF), are limited to either discrete treatments or posing parametric ass…

    Submitted 23 September, 2022; v1 submitted 21 March, 2022; originally announced March 2022.

  18. arXiv:2203.10258  [pdf, other]

    cs.IR cs.LG stat.ML

    TDR-CL: Targeted Doubly Robust Collaborative Learning for Debiased Recommendations

    Authors: Haoxuan Li, Yan Lyu, Chunyuan Zheng, Peng Wu

    Abstract: Bias is a common problem inherent in recommender systems, which is entangled with users' preferences and poses a great challenge to unbiased learning. For debiasing tasks, the doubly robust (DR) method and its variants show superior performance due to the double robustness property, that is, DR is unbiased when either imputed errors or learned propensities are accurate. However, our theoretical an…

    Submitted 2 March, 2023; v1 submitted 19 March, 2022; originally announced March 2022.

  19. arXiv:2106.12719  [pdf, other]

    stat.ME

    FDR Controlled Multiple Testing for Union Null Hypotheses: A Knockoff-based Approach

    Authors: Ran Dai, Cheng Zheng

    Abstract: False discovery rate (FDR) controlling procedures provide important statistical guarantees for the replicability in signal identification based on multiple hypotheses testing. In many fields of study, FDR controlling procedures are used in high-dimensional (HD) analyses to discover features that are truly associated with the outcome. In some recent applications, data on the same set of candidate f…

    Submitted 3 October, 2022; v1 submitted 23 June, 2021; originally announced June 2021.

  20. arXiv:2010.09797  [pdf, other]

    cs.IR cs.LG stat.AP

    Surprise: Result List Truncation via Extreme Value Theory

    Authors: Dara Bahri, Che Zheng, Yi Tay, Donald Metzler, Andrew Tomkins

    Abstract: Work in information retrieval has largely been centered around ranking and relevance: given a query, return some number of results ordered by relevance to the user. The problem of result list truncation, or where to truncate the ranked list of results, however, has received less attention despite being crucial in a variety of applications. Such truncation is a balancing act between the overall rel…

    Submitted 19 October, 2020; originally announced October 2020.

  21. arXiv:2008.13533  [pdf, other]

    cs.CL cs.LG stat.ML

    Generative Models are Unsupervised Predictors of Page Quality: A Colossal-Scale Study

    Authors: Dara Bahri, Yi Tay, Che Zheng, Donald Metzler, Cliff Brunk, Andrew Tomkins

    Abstract: Large generative language models such as GPT-2 are well-known for their ability to generate text as well as their utility in supervised downstream tasks via fine-tuning. Our work is twofold: firstly we demonstrate via human evaluation that classifiers trained to discriminate between human and machine-generated text emerge as unsupervised predictors of "page quality", able to detect low quality con…

    Submitted 17 August, 2020; originally announced August 2020.

  22. arXiv:2005.12830  [pdf]

    cs.SI cs.CL cs.CY stat.ML

    Twitter discussions and emotions about COVID-19 pandemic: a machine learning approach

    Authors: Jia Xue, Junxiang Chen, Ran Hu, Chen Chen, ChengDa Zheng, Xiaoqian Liu, Tingshao Zhu

    Abstract: The objective of the study is to examine coronavirus disease (COVID-19) related discussions, concerns, and sentiments that emerged from tweets posted by Twitter users. We analyze 4 million Twitter messages related to the COVID-19 pandemic using a list of 25 hashtags such as "coronavirus," "COVID-19," "quarantine" from March 1 to April 21 in 2020. We use a machine learning approach, Latent Dirichle…

    Submitted 18 June, 2020; v1 submitted 26 May, 2020; originally announced May 2020.

  23. arXiv:2004.13012  [pdf, other]

    cs.IR cs.CL cs.LG stat.ML

    Choppy: Cut Transformer For Ranked List Truncation

    Authors: Dara Bahri, Yi Tay, Che Zheng, Donald Metzler, Andrew Tomkins

    Abstract: Work in information retrieval has traditionally focused on ranking and relevance: given a query, return some number of results ordered by relevance to the user. However, the problem of determining how many results to return, i.e. how to optimally truncate the ranked result list, has received less attention despite being of critical importance in a range of applications. Such truncation is a balanc…

    Submitted 25 April, 2020; originally announced April 2020.

    Comments: SIGIR 2020

  24. arXiv:2003.12009  [pdf, other]

    eess.SP cs.IT cs.LG stat.ML

    Multi-Lead ECG Classification via an Information-Based Attention Convolutional Neural Network

    Authors: Hao Tung, Chao Zheng, Xinsheng Mao, Dahong Qian

    Abstract: Objective: A novel structure based on channel-wise attention mechanism is presented in this paper. Embedding with the proposed structure, an efficient classification model that accepts multi-lead electrocardiogram (ECG) as input is constructed. Methods: One-dimensional convolutional neural networks (CNN) have proven to be effective in pervasive classification tasks, enabling the automatic extract…

    Submitted 24 March, 2020; originally announced March 2020.

  25. arXiv:1911.11219  [pdf, other]

    cs.LG cs.CV eess.IV stat.ML

    One Man's Trash is Another Man's Treasure: Resisting Adversarial Examples by Adversarial Examples

    Authors: Chang Xiao, Changxi Zheng

    Abstract: Modern image classification systems are often built on deep neural networks, which suffer from adversarial examples--images with deliberately crafted, imperceptible noise to mislead the network's classification. To defend against adversarial examples, a plausible idea is to obfuscate the network's gradient with respect to the input image. This general idea has inspired a long line of defense metho…

    Submitted 27 November, 2019; v1 submitted 25 November, 2019; originally announced November 2019.

  26. arXiv:1911.06380  [pdf, other]

    stat.AP

    On Data Enriched Logistic Regression

    Authors: Cheng Zheng, Sayan Dasgupta, Yuxiang Xie, Asad Haris, Ying Qing Chen

    Abstract: Biomedical researchers usually study the effects of certain exposures on disease risks among a well-defined population. To achieve this goal, the gold standard is to design a trial with an appropriate sample from that population. Due to the high cost of such trials, usually the sample size collected is limited and is not enough to accurately estimate some exposures' effect. In this paper, we discu…

    Submitted 14 November, 2019; originally announced November 2019.

    Comments: 23 pages, 4 Figures, 1 Table

    MSC Class: 62J12

  27. arXiv:1911.05684  [pdf, other]

    stat.ME

    A Simulation-free Group Sequential Design with Max-combo Tests in the Presence of Non-proportional Hazards

    Authors: Lili Wang, Xiaodong Luo, Cheng Zheng

    Abstract: Non-proportional hazards (NPH) have been observed recently in many immuno-oncology clinical trials. Weighted log-rank tests (WLRT) with suitably chosen weights can be used to improve the power of detecting the difference of the two survival curves in the presence of NPH. However, it is not easy to choose a proper WLRT in practice when both robustness and efficiency are considered. A versatile maxc…

    Submitted 16 January, 2023; v1 submitted 13 November, 2019; originally announced November 2019.

  28. arXiv:1911.01716  [pdf, other]

    math.ST stat.ME

    Consistency of a range of penalised cost approaches for detecting multiple changepoints

    Authors: Chao Zheng, Idris A. Eckley, Paul Fearnhead

    Abstract: A common approach to detect multiple changepoints is to minimise a measure of data fit plus a penalty that is linear in the number of changepoints. This paper shows that the general finite sample behaviour of such a method can be related to its behaviour when analysing data with either none or one changepoint. This results in simpler conditions for verifying whether the method will consistently es…

    Submitted 12 August, 2022; v1 submitted 5 November, 2019; originally announced November 2019.

  29. arXiv:1910.05366  [pdf, other]

    cs.LG stat.ML

    Learning Nearly Decomposable Value Functions Via Communication Minimization

    Authors: Tonghan Wang, Jianhao Wang, Chongyi Zheng, Chongjie Zhang

    Abstract: Reinforcement learning encounters major challenges in multi-agent settings, such as scalability and non-stationarity. Recently, value function factorization learning emerges as a promising way to address these challenges in collaborative multi-agent systems. However, existing methods have been focusing on learning fully decentralized value functions, which are not efficient for tasks requiring com…

    Submitted 18 July, 2020; v1 submitted 11 October, 2019; originally announced October 2019.

    Comments: 8th International Conference on Learning Representations (ICLR 2020)

  30. arXiv:1905.11837  [pdf]

    cs.LG stat.ML

    Supervised Discriminative Sparse PCA for Com-Characteristic Gene Selection and Tumor Classification on Multiview Biological Data

    Authors: Chun-Mei Feng, Yong Xu, Jin-Xing Liu, Ying-Lian Gao, Chun-Hou Zheng

    Abstract: Principal Component Analysis (PCA) has been used to study the pathogenesis of diseases. To enhance the interpretability of classical PCA, various improved PCA methods have been proposed to date. Among these, a typical method is the so-called sparse PCA, which focuses on seeking sparse loadings. However, the performance of these methods is still far from satisfactory due to their limitation of usin…

    Submitted 28 May, 2019; originally announced May 2019.

    Comments: This paper has been published on TNNLS

  31. arXiv:1905.10510  [pdf, other]

    cs.LG cs.AI cs.CR cs.DS stat.ML

    Enhancing Adversarial Defense by k-Winners-Take-All

    Authors: Chang Xiao, Peilin Zhong, Changxi Zheng

    Abstract: We propose a simple change to existing neural network structures for better defending against gradient-based adversarial attacks. Instead of using popular activation functions (such as ReLU), we advocate the use of k-Winners-Take-All (k-WTA) activation, a C0 discontinuous function that purposely invalidates the neural network model's gradient at densely distributed input data points. The proposed…

    Submitted 28 October, 2019; v1 submitted 24 May, 2019; originally announced May 2019.

  32. arXiv:1903.12019  [pdf, ps, other]

    cs.LG stat.ML

    Multimodal Deep Network Embedding with Integrated Structure and Attribute Information

    Authors: Conghui Zheng, Li Pan, Peng Wu

    Abstract: Network embedding is the process of learning low-dimensional representations for nodes in a network, while preserving node features. Existing studies only leverage network structure information and focus on preserving structural features. However, nodes in real-world networks often have a rich set of attributes providing extra semantic information. It has been demonstrated that both structural and…

    Submitted 28 March, 2019; originally announced March 2019.

    Comments: 15 pages, 10 figures

  33. arXiv:1902.04697  [pdf, other]

    stat.ML cs.LG

    Rethinking Generative Mode Coverage: A Pointwise Guaranteed Approach

    Authors: Peilin Zhong, Yuchen Mo, Chang Xiao, Pengyu Chen, Changxi Zheng

    Abstract: Many generative models have to combat $\textit{missing modes}$. The conventional wisdom to this end is by reducing through training a statistical distance (such as $f$-divergence) between the generated distribution and provided data distribution. But this is more of a heuristic than a guarantee. The statistical distance measures a $\textit{global}$, but not $\textit{local}$, similarity between two…

    Submitted 24 October, 2019; v1 submitted 12 February, 2019; originally announced February 2019.

  34. arXiv:1901.02915  [pdf, other]

    stat.ML cs.CV cs.LG q-bio.NC

    Revealing interpretable object representations from human behavior

    Authors: Charles Y. Zheng, Francisco Pereira, Chris I. Baker, Martin N. Hebart

    Abstract: To study how mental object representations are related to behavior, we estimated sparse, non-negative representations of objects using human behavioral judgments on images representative of 1,854 object categories. These representations predicted a latent similarity structure between objects, which captured most of the explainable variance in human behavioral judgments. Individual dimensions in th…

    Submitted 9 January, 2019; originally announced January 2019.

    Comments: Accepted in ICLR 2019

  35. arXiv:1812.02130  [pdf, other]

    stat.ME

    On High Dimensional Covariate Adjustment for Estimating Causal Effects in Randomized Trials with Survival Outcomes

    Authors: Ran Dai, Cheng Zheng, Mei-Jie Zhang

    Abstract: The purpose of this work is to improve the efficiency in estimating the average causal effect (ACE) on the survival scale where right-censoring exists and high-dimensional covariate information is available. We propose new estimators using regularized survival regression and survival random forests (SRF) to make the adjustment for the high dimensional covariates to improve efficiency. We study the…

    Submitted 25 June, 2021; v1 submitted 5 December, 2018; originally announced December 2018.

  36. arXiv:1812.01719  [pdf, other]

    cs.CV cs.LG stat.ML

    Knowing what you know in brain segmentation using Bayesian deep neural networks

    Authors: Patrick McClure, Nao Rho, John A. Lee, Jakub R. Kaczmarzyk, Charles Zheng, Satrajit S. Ghosh, Dylan Nielson, Adam G. Thomas, Peter Bandettini, Francisco Pereira

    Abstract: In this paper, we describe a Bayesian deep neural network (DNN) for predicting FreeSurfer segmentations of structural MRI volumes, in minutes rather than hours. The network was trained and evaluated on a large dataset (n = 11,480), obtained by combining data from more than a hundred different sites, and also evaluated on another completely held-out dataset (n = 418). The network was trained using…

    Submitted 18 September, 2019; v1 submitted 3 December, 2018; originally announced December 2018.

    Comments: Submitted to Frontiers in Neuroinformatics

  37. arXiv:1809.10765  [pdf, other]

    stat.ME

    Auto-Encoding Knockoff Generator for FDR Controlled Variable Selection

    Authors: Ying Liu, Cheng Zheng

    Abstract: A new statistical procedure (Model-X \cite{candes2018}) has provided a way to identify important factors using any supervised learning method controlling for FDR. This line of research has shown great potential to expand the horizon of machine learning methods beyond the task of prediction, to serve the broader needs in scientific researches for interpretable findings. However, the lack of a pract…

    Submitted 27 September, 2018; originally announced September 2018.

    Comments: In submission

  38. arXiv:1805.10863  [pdf, other]

    cs.LG cs.CV stat.ML

    Distributed Weight Consolidation: A Brain Segmentation Case Study

    Authors: Patrick McClure, Charles Y. Zheng, Jakub R. Kaczmarzyk, John A. Lee, Satrajit S. Ghosh, Dylan Nielson, Peter Bandettini, Francisco Pereira

    Abstract: Collecting the large datasets needed to train deep neural networks can be very difficult, particularly for the many applications for which sharing and pooling data is complicated by practical, ethical, or legal concerns. However, it may be the case that derivative datasets or predictive models developed within individual sites can be shared and combined with fewer restrictions. Training on distrib…

    Submitted 16 January, 2019; v1 submitted 28 May, 2018; originally announced May 2018.

    Comments: Published in NeurIPS 2018

  39. arXiv:1805.07674  [pdf, other]

    stat.ML cs.LG

    BourGAN: Generative Networks with Metric Embeddings

    Authors: Chang Xiao, Peilin Zhong, Changxi Zheng

    Abstract: This paper addresses the mode collapse for generative adversarial networks (GANs). We view modes as a geometric structure of data distribution in a metric space. Under this geometric lens, we embed subsamples of the dataset from an arbitrary metric space into the l2 space, while preserving their pairwise distance distribution. Not only does this metric embedding determine the dimensionality of the…

    Submitted 2 December, 2018; v1 submitted 19 May, 2018; originally announced May 2018.

    Comments: Neural Information Processing Systems, 2018

    Journal ref: Advances in Neural Information Processing Systems 31, 2018

  40. arXiv:1712.09713  [pdf, other]

    stat.ML cs.CV cs.LG

    Extrapolating Expected Accuracies for Large Multi-Class Problems

    Authors: Charles Zheng, Rakesh Achanta, Yuval Benjamini

    Abstract: The difficulty of multi-class classification generally increases with the number of classes. Using data from a subset of the classes, can we predict how well a classifier will scale with an increased number of classes? Under the assumptions that the classes are sampled identically and independently from a population, and that the classifier is based on independently learned scoring functions, we s…

    Submitted 27 December, 2017; originally announced December 2017.

    Comments: Submitted to JMLR

  41. arXiv:1709.04342  [pdf, other]

    stat.ME math.ST

    Model Selection Confidence Sets by Likelihood Ratio Testing

    Authors: Chao Zheng, Davide Ferrari, Yuhong Yang

    Abstract: The traditional activity of model selection aims at discovering a single model superior to other candidate models. In the presence of pronounced noise, however, multiple models are often found to explain the same data equally well. To resolve this model selection ambiguity, we introduce the general approach of model selection confidence sets (MSCSs) based on likelihood ratio testing. A MSCS is def…

    Submitted 13 September, 2017; originally announced September 2017.

    Comments: 36 pages, 3 figures

  42. arXiv:1606.05229  [pdf, other]

    stat.ML cs.IT

    Estimating mutual information in high dimensions via classification error

    Authors: Charles Y. Zheng, Yuval Benjamini

    Abstract: Multivariate pattern analyses approaches in neuroimaging are fundamentally concerned with investigating the quantity and type of information processed by various regions of the human brain; typically, estimates of classification accuracy are used to quantify information. While a extensive and powerful library of methods can be applied to train and assess classifiers, it is not always clear how to…

    Submitted 10 October, 2016; v1 submitted 16 June, 2016; originally announced June 2016.

  43. arXiv:1606.05228  [pdf, other]

    stat.ML cs.CV cs.IT cs.LG

    How many faces can be recognized? Performance extrapolation for multi-class classification

    Authors: Charles Y. Zheng, Rakesh Achanta, Yuval Benjamini

    Abstract: The difficulty of multi-class classification generally increases with the number of classes. Using data from a subset of the classes, can we predict how well a classifier will scale with an increased number of classes? Under the assumption that the classes are sampled exchangeably, and under the assumption that the classifier is generative (e.g. QDA or Naive Bayes), we show that the expected accur…

    Submitted 16 June, 2016; originally announced June 2016.

    Comments: Submitted to NIPS 2016

  44. arXiv:1603.06988  [pdf, ps, other]

    stat.ME

    On a Shape-Invariant Hazard Regression Model

    Authors: Cheng Zheng, Ying Qing Chen

    Abstract: In survival analysis, the Cox model is widely used for most clinical trial data. Alternatives include the additive hazard model, the accelerated failure time (AFT) model and a more general transformation model. All these models assume that the effects for all covariates are on the same scale. However, it is possible that for different covariates, the effects are on different scales. In this paper, we…

    Submitted 22 March, 2016; originally announced March 2016.

  45. arXiv:1603.01874  [pdf, other]

    stat.ME

    Instrumental Variable with Competing Risk Model

    Authors: Cheng Zheng, Ran Dai, Parameswaran Hari, Mei-Jie Zhang

    Abstract: In this paper, we discuss causal inference on the efficacy of a treatment or medication on a time-to-event outcome with competing risks. Although the treatment group can be randomized, there can be confounding between the compliance and the outcome. Unmeasured confounding may exist even after adjustment for measured covariates. Instrumental variable (IV) methods are commonly used to yield cons…

    Submitted 4 December, 2016; v1 submitted 6 March, 2016; originally announced March 2016.

  46. arXiv:1601.06743  [pdf, ps, other]

    stat.ME

    On estimating causal controlled direct and mediator effects for count outcomes without assuming sequential ignorability

    Authors: Cheng Zheng, David C. Atkins, Melissa A. Lewis, Xiao-Hua Zhou

    Abstract: Causal mediation analysis is an important statistical method in social and medical studies, as it can provide insights about why an intervention works and inform the development of future interventions. Currently, most causal mediation methods focus on mediation effects defined on a mean scale. However, in health-risk studies, such as alcohol or risky sex, outcomes are typically count data and hea…

    Submitted 25 January, 2016; originally announced January 2016.

    MSC Class: 62J12; 62P15

  47. arXiv:1512.05419  [pdf, other]

    stat.AP

    Ranking genetic factors related to age-related macular degeneration by variable selection confidence sets

    Authors: Chao Zheng, Davide Ferrari, Michael Zhang, Paul Baird

    Abstract: The widespread use of generalized linear models in case-control genetic studies has helped identify many disease-associated risk factors typically defined as DNA variants, or single nucleotide polymorphisms (SNPs). Up to now, most literature has focused on selecting a unique best subset of SNPs based on some statistical perspectives. In the presence of pronounced noise, however, multiple biologica…

    Submitted 4 March, 2019; v1 submitted 16 December, 2015; originally announced December 2015.

    Comments: 23 pages, 4 figures

  48. arXiv:1510.02175  [pdf, other]

    stat.ME stat.CO stat.ML

    Learning Summary Statistic for Approximate Bayesian Computation via Deep Neural Network

    Authors: Bai Jiang, Tung-yu Wu, Charles Zheng, Wing H. Wong

    Abstract: Approximate Bayesian Computation (ABC) methods are used to approximate posterior distributions in models with unknown or computationally intractable likelihoods. Both the accuracy and computational efficiency of ABC depend on the choice of summary statistic, but outside of special cases where the optimal summary statistics are known, it is unclear which guiding principles can be used to construct…

    Submitted 16 March, 2017; v1 submitted 7 October, 2015; originally announced October 2015.

    Comments: 27 pages, 10 figures

  49. arXiv:1509.03459  [pdf, other]

    math.ST stat.ME

    Two-Sample Smooth Tests for the Equality of Distributions

    Authors: Wen-Xin Zhou, Chao Zheng, Zhen Zhang

    Abstract: This paper considers the problem of testing the equality of two unspecified distributions. The classical omnibus tests such as the Kolmogorov-Smirnov and Cramér-von Mises are known to suffer from low power against essentially all but location-scale alternatives. We propose a new two-sample test that modifies Neyman's smooth test and extends it to the multivariate case based on the idea of proje…

    Submitted 14 September, 2015; v1 submitted 11 September, 2015; originally announced September 2015.

    Comments: 40 pages, 3 figures

  50. arXiv:1502.04765  [pdf, ps, other]

    stat.ME

    Reliable inference for complex models by discriminative composite likelihood estimation

    Authors: Davide Ferrari, Chao Zheng

    Abstract: Composite likelihood estimation has an important role in the analysis of multivariate data for which the full likelihood function is intractable. An important issue in composite likelihood inference is the choice of the weights associated with lower-dimensional data sub-sets, since the presence of incompatible sub-models can deteriorate the accuracy of the resulting estimator. In this paper, we in…

    Submitted 14 December, 2015; v1 submitted 16 February, 2015; originally announced February 2015.

    Comments: 29 pages, 4 figures