Showing 1–23 of 23 results for author: Gan, Z

  1. arXiv:2312.15611  [pdf, other]

    stat.ME stat.ML

    Inference of Dependency Knowledge Graph for Electronic Health Records

    Authors: Zhiwei Xu, Ziming Gan, Doudou Zhou, Shuting Shen, Junwei Lu, Tianxi Cai

    Abstract: The effective analysis of high-dimensional Electronic Health Record (EHR) data, with substantial potential for healthcare research, presents notable methodological challenges. Employing predictive modeling guided by a knowledge graph (KG), which enables efficient feature selection, can enhance both statistical efficiency and interpretability. While various methods have emerged for constructing KGs…

    Submitted 24 December, 2023; originally announced December 2023.

  2. arXiv:2107.01152  [pdf, other]

    stat.ML cs.AI cs.CV cs.IT cs.LG

    Simpler, Faster, Stronger: Breaking The log-K Curse On Contrastive Learners With FlatNCE

    Authors: Junya Chen, Zhe Gan, Xuan Li, Qing Guo, Liqun Chen, Shuyang Gao, Tagyoung Chung, Yi Xu, Belinda Zeng, Wenlian Lu, Fan Li, Lawrence Carin, Chenyang Tao

    Abstract: InfoNCE-based contrastive representation learners, such as SimCLR, have been tremendously successful in recent years. However, these contrastive schemes are notoriously resource demanding, as their effectiveness breaks down with small-batch training (i.e., the log-K curse, whereas K is the batch-size). In this work, we reveal mathematically why contrastive learners fail in the small-batch-size reg…

    Submitted 2 July, 2021; originally announced July 2021.

  3. arXiv:2103.16547  [pdf, other]

    cs.CV cs.AI cs.LG stat.ML

    The Elastic Lottery Ticket Hypothesis

    Authors: Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, Jingjing Liu, Zhangyang Wang

    Abstract: Lottery Ticket Hypothesis (LTH) raises keen attention to identifying sparse trainable subnetworks, or winning tickets, which can be trained in isolation to achieve similar or even better performance compared to the full models. Despite many efforts being made, the most effective method to identify such winning tickets is still Iterative Magnitude-based Pruning (IMP), which is computationally expen…

    Submitted 27 October, 2021; v1 submitted 30 March, 2021; originally announced March 2021.

    Comments: Accepted at NeurIPS 2021

  4. arXiv:2010.01278  [pdf, other]

    cs.LG cs.AI stat.ML

    Efficient Robust Training via Backward Smoothing

    Authors: Jinghui Chen, Yu Cheng, Zhe Gan, Quanquan Gu, Jingjing Liu

    Abstract: Adversarial training is so far the most effective strategy in defending against adversarial examples. However, it suffers from high computational costs due to the iterative adversarial attacks in each training step. Recent studies show that it is possible to achieve fast Adversarial Training by performing a single-step attack with random initialization. However, such an approach still lags behind…

    Submitted 30 December, 2021; v1 submitted 3 October, 2020; originally announced October 2020.

    Comments: 12 pages, 15 tables, 6 figures. In AAAI 2022

  5. arXiv:2006.12013  [pdf, other]

    cs.LG stat.ML

    CLUB: A Contrastive Log-ratio Upper Bound of Mutual Information

    Authors: Pengyu Cheng, Weituo Hao, Shuyang Dai, Jiachang Liu, Zhe Gan, Lawrence Carin

    Abstract: Mutual information (MI) minimization has gained considerable interests in various machine learning tasks. However, estimating and minimizing MI in high-dimensional spaces remains a challenging problem, especially when only samples, rather than distribution forms, are accessible. Previous works mainly focus on MI lower bound approximation, which is not applicable to MI minimization problems. In thi…

    Submitted 23 July, 2020; v1 submitted 22 June, 2020; originally announced June 2020.

    Comments: Accepted by the 37th International Conference on Machine Learning (ICML 2020)

  6. arXiv:2006.11918  [pdf, ps, other]

    cs.LG stat.ML

    MaxVA: Fast Adaptation of Step Sizes by Maximizing Observed Variance of Gradients

    Authors: Chen Zhu, Yu Cheng, Zhe Gan, Furong Huang, Jingjing Liu, Tom Goldstein

    Abstract: Adaptive gradient methods such as RMSProp and Adam use exponential moving estimate of the squared gradient to compute adaptive step sizes, achieving better convergence than SGD in face of noisy objectives. However, Adam can have undesirable convergence behaviors due to unstable or extreme adaptive learning rates. Methods such as AMSGrad and AdaBound have been proposed to stabilize the adaptive lea…

    Submitted 4 July, 2021; v1 submitted 21 June, 2020; originally announced June 2020.

    Comments: ECML PKDD 2021

  7. arXiv:2005.00054  [pdf, other]

    cs.LG stat.ML

    APo-VAE: Text Generation in Hyperbolic Space

    Authors: Shuyang Dai, Zhe Gan, Yu Cheng, Chenyang Tao, Lawrence Carin, Jingjing Liu

    Abstract: Natural language often exhibits inherent hierarchical structure ingrained with complex syntax and semantics. However, most state-of-the-art deep generative models learn embeddings only in Euclidean vector space, without accounting for this structural property of language. In this paper, we investigate text generation in a hyperbolic latent space to learn continuous hierarchical representations. An…

    Submitted 14 July, 2021; v1 submitted 30 April, 2020; originally announced May 2020.

  8. arXiv:1911.08709  [pdf, other]

    cs.LG stat.ML

    Graph-Driven Generative Models for Heterogeneous Multi-Task Learning

    Authors: Wenlin Wang, Hongteng Xu, Zhe Gan, Bai Li, Guoyin Wang, Liqun Chen, Qian Yang, Wenqi Wang, Lawrence Carin

    Abstract: We propose a novel graph-driven generative model, that unifies multiple heterogeneous learning tasks into the same framework. The proposed model is based on the fact that heterogeneous learning tasks, which correspond to different generative processes, often rely on data with a shared graph structure. Accordingly, our model combines a graph convolutional network (GCN) with multiple variational aut…

    Submitted 20 November, 2019; originally announced November 2019.

    Comments: Accepted by AAAI-2020

  9. arXiv:1909.13456  [pdf, other]

    cs.LG cs.CL stat.ML

    Improving Textual Network Learning with Variational Homophilic Embeddings

    Authors: Wenlin Wang, Chenyang Tao, Zhe Gan, Guoyin Wang, Liqun Chen, Xinyuan Zhang, Ruiyi Zhang, Qian Yang, Ricardo Henao, Lawrence Carin

    Abstract: The performance of many network learning applications crucially hinges on the success of network embedding algorithms, which aim to encode rich network information into low-dimensional vertex-based vector representations. This paper considers a novel variational formulation of network embeddings, with special focus on textual networks. Different from most existing methods that optimize a discrimin…

    Submitted 30 September, 2019; originally announced September 2019.

    Comments: Accepted to NeurIPS 2019

  10. arXiv:1909.05288  [pdf, other]

    cs.LG stat.ML

    Contrastively Smoothed Class Alignment for Unsupervised Domain Adaptation

    Authors: Shuyang Dai, Yu Cheng, Yizhe Zhang, Zhe Gan, Jingjing Liu, Lawrence Carin

    Abstract: Recent unsupervised approaches to domain adaptation primarily focus on minimizing the gap between the source and the target domains through refining the feature generator, in order to learn a better alignment between the two domains. This minimization can be achieved via a domain classifier to detect target-domain features that are divergent from source-domain features. However, by optimizing via…

    Submitted 6 October, 2020; v1 submitted 11 September, 2019; originally announced September 2019.

  11. arXiv:1812.08352  [pdf, other]

    cs.CV cs.AI stat.ML

    Sequential Attention GAN for Interactive Image Editing

    Authors: Yu Cheng, Zhe Gan, Yitong Li, Jingjing Liu, Jianfeng Gao

    Abstract: Most existing text-to-image synthesis tasks are static single-turn generation, based on pre-defined textual descriptions of images. To explore more practical and interactive real-life applications, we introduce a new task - Interactive Image Editing, where users can guide an agent to edit images via multi-turn textual commands on-the-fly. In each session, the agent takes a natural language descrip…

    Submitted 5 August, 2020; v1 submitted 19 December, 2018; originally announced December 2018.

    Comments: ACM MM 2020

  12. arXiv:1808.03109  [pdf, other]

    econ.EM stat.AP

    Change Point Estimation in Panel Data with Time-Varying Individual Effects

    Authors: Otilia Boldea, Bettina Drepper, Zhuojiong Gan

    Abstract: This paper proposes a method for estimating multiple change points in panel data models with unobserved individual effects via ordinary least-squares (OLS). Typically, in this setting, the OLS slope estimators are inconsistent due to the unobserved individual effects bias. As a consequence, existing methods remove the individual effects before change point estimation through data transformations s…

    Submitted 9 August, 2018; originally announced August 2018.

    Comments: 26 pages

    MSC Class: 62-07; 62P20; 91B76

  13. arXiv:1806.02978  [pdf, other]

    cs.LG stat.ML

    JointGAN: Multi-Domain Joint Distribution Learning with Generative Adversarial Nets

    Authors: Yunchen Pu, Shuyang Dai, Zhe Gan, Weiyao Wang, Guoyin Wang, Yizhe Zhang, Ricardo Henao, Lawrence Carin

    Abstract: A new generative adversarial network is developed for joint distribution matching. Distinct from most existing approaches, that only learn conditional distributions, the proposed model aims to learn a joint distribution of multiple random variables (domains). This is achieved by learning to sample from conditional distributions between the domains, while simultaneously learning to sample from the…

    Submitted 8 June, 2018; originally announced June 2018.

    Comments: Accepted by ICML 2018

  14. arXiv:1801.05062  [pdf, other]

    stat.ML cs.LG stat.AP

    Multi-Label Learning from Medical Plain Text with Convolutional Residual Models

    Authors: Xinyuan Zhang, Ricardo Henao, Zhe Gan, Yitong Li, Lawrence Carin

    Abstract: Predicting diagnoses from Electronic Health Records (EHRs) is an important medical application of multi-label learning. We propose a convolutional residual model for multi-label classification from doctor notes in EHR data. A given patient may have multiple diagnoses, and therefore multi-label learning is required. We employ a Convolutional Neural Network (CNN) to encode plain text into a fixed-le…

    Submitted 8 August, 2018; v1 submitted 15 January, 2018; originally announced January 2018.

    Comments: Machine Learning for Healthcare 2018 spotlight paper

  15. arXiv:1709.06548  [pdf, other]

    cs.LG stat.ML

    Triangle Generative Adversarial Networks

    Authors: Zhe Gan, Liqun Chen, Weiyao Wang, Yunchen Pu, Yizhe Zhang, Hao Liu, Chunyuan Li, Lawrence Carin

    Abstract: A Triangle Generative Adversarial Network ($Δ$-GAN) is developed for semi-supervised cross-domain joint distribution matching, where the training data consists of samples from each domain, and supervision of domain correspondence is provided by only a few paired samples. $Δ$-GAN consists of four neural networks, two generators and two discriminators. The generators are designed to learn the two-wa…

    Submitted 18 November, 2017; v1 submitted 19 September, 2017; originally announced September 2017.

    Comments: To appear in NIPS 2017

  16. arXiv:1708.04729  [pdf, other]

    cs.CL cs.LG stat.ML

    Deconvolutional Paragraph Representation Learning

    Authors: Yizhe Zhang, Dinghan Shen, Guoyin Wang, Zhe Gan, Ricardo Henao, Lawrence Carin

    Abstract: Learning latent representations from long text sequences is an important first step in many natural language processing applications. Recurrent Neural Networks (RNNs) have become a cornerstone for this challenging task. However, the quality of sentences during RNN-based decoding (reconstruction) decreases with the length of the text. We propose a sequence-to-sequence, purely convolutional and deco…

    Submitted 22 September, 2017; v1 submitted 15 August, 2017; originally announced August 2017.

    Comments: Accepted by NIPS 2017

  17. arXiv:1706.03850  [pdf, other]

    stat.ML cs.CL cs.LG

    Adversarial Feature Matching for Text Generation

    Authors: Yizhe Zhang, Zhe Gan, Kai Fan, Zhi Chen, Ricardo Henao, Dinghan Shen, Lawrence Carin

    Abstract: The Generative Adversarial Network (GAN) has achieved great success in generating realistic (real-valued) synthetic data. However, convergence issues and difficulties dealing with discrete data hinder the applicability of GAN to text. We propose a framework for generating realistic text via adversarial training. We employ a long short-term memory network as generator, and a convolutional network a…

    Submitted 18 November, 2017; v1 submitted 12 June, 2017; originally announced June 2017.

    Comments: Accepted by ICML 2017

  18. arXiv:1706.01498  [pdf, other]

    stat.ML cs.LG stat.AP

    Stochastic Gradient Monomial Gamma Sampler

    Authors: Yizhe Zhang, Changyou Chen, Zhe Gan, Ricardo Henao, Lawrence Carin

    Abstract: Recent advances in stochastic gradient techniques have made it possible to estimate posterior distributions from large datasets via Markov Chain Monte Carlo (MCMC). However, when the target posterior is multimodal, mixing performance is often poor. This results in inadequate exploration of the posterior distribution. A framework is proposed to improve the sampling efficiency of stochastic gradient…

    Submitted 10 January, 2018; v1 submitted 5 June, 2017; originally announced June 2017.

    Comments: Published at ICML 2017

    Journal ref: Proceedings of the 34th International Conference on Machine Learning, PMLR 70:3996-4005, 2017

  19. arXiv:1611.04920  [pdf, other]

    stat.ML cs.LG

    Unsupervised Learning with Truncated Gaussian Graphical Models

    Authors: Qinliang Su, Xuejun Liao, Chunyuan Li, Zhe Gan, Lawrence Carin

    Abstract: Gaussian graphical models (GGMs) are widely used for statistical modeling, because of ease of inference and the ubiquitous use of the normal distribution in practical approximations. However, they are also known for their limited modeling abilities, due to the Gaussian assumption. In this paper, we introduce a novel variant of GGMs, which relaxes the Gaussian restriction and yet admits efficient i…

    Submitted 20 November, 2016; v1 submitted 15 November, 2016; originally announced November 2016.

    Comments: To appear in AAAI 2017

  20. arXiv:1609.08976  [pdf, other]

    stat.ML cs.LG

    Variational Autoencoder for Deep Learning of Images, Labels and Captions

    Authors: Yunchen Pu, Zhe Gan, Ricardo Henao, Xin Yuan, Chunyuan Li, Andrew Stevens, Lawrence Carin

    Abstract: A novel variational autoencoder is developed to model images, as well as associated labels or captions. The Deep Generative Deconvolutional Network (DGDN) is used as a decoder of the latent image features, and a deep Convolutional Neural Network (CNN) is used as an image encoder; the CNN is used to approximate a distribution for the latent DGDN features/code. The latent code is also linked to gene…

    Submitted 28 September, 2016; originally announced September 2016.

    Comments: NIPS 2016 (To appear)

  21. arXiv:1605.06715  [pdf, other]

    stat.ML cs.LG

    Factored Temporal Sigmoid Belief Networks for Sequence Learning

    Authors: Jiaming Song, Zhe Gan, Lawrence Carin

    Abstract: Deep conditional generative models are developed to simultaneously learn the temporal dependencies of multiple sequences. The model is designed by introducing a three-way weight tensor to capture the multiplicative interactions between side information and sequences. The proposed model builds on the Temporal Sigmoid Belief Network (TSBN), a sequential stack of Sigmoid Belief Networks (SBNs). The t…

    Submitted 21 May, 2016; originally announced May 2016.

    Comments: to appear in ICML 2016

  22. arXiv:1512.07962  [pdf, other]

    stat.ML cs.LG

    Bridging the Gap between Stochastic Gradient MCMC and Stochastic Optimization

    Authors: Changyou Chen, David Carlson, Zhe Gan, Chunyuan Li, Lawrence Carin

    Abstract: Stochastic gradient Markov chain Monte Carlo (SG-MCMC) methods are Bayesian analogs to popular stochastic optimization methods; however, this connection is not well studied. We explore this relationship by applying simulated annealing to an SGMCMC algorithm. Furthermore, we extend recent SG-MCMC methods with two key components: i) adaptive preconditioners (as in ADAgrad or RMSprop), and ii) adapti…

    Submitted 5 August, 2016; v1 submitted 25 December, 2015; originally announced December 2015.

    Comments: Merry Christmas from the Santa (algorithm). AISTATS 2016

  23. arXiv:1509.07087  [pdf, other]

    stat.ML cs.LG

    Deep Temporal Sigmoid Belief Networks for Sequence Modeling

    Authors: Zhe Gan, Chunyuan Li, Ricardo Henao, David Carlson, Lawrence Carin

    Abstract: Deep dynamic generative models are developed to learn sequential dependencies in time-series data. The multi-layered model is designed by constructing a hierarchy of temporal sigmoid belief networks (TSBNs), defined as a sequential stack of sigmoid belief networks (SBNs). Each SBN has a contextual hidden state, inherited from the previous SBNs in the sequence, and is used to regulate its hidden bi…

    Submitted 23 September, 2015; originally announced September 2015.

    Comments: to appear in NIPS 2015