Showing 1–50 of 65 results for author: Wu, N

Searching in archive stat.
  1. arXiv:2406.18456  [pdf, other]

    stat.ML math.DG

    Boundary Detection Algorithm Inspired by Locally Linear Embedding

    Authors: Pei-Cheng Kuo, Nan Wu

    Abstract: In the study of high-dimensional data, it is often assumed that the data set possesses an underlying lower-dimensional structure. A practical model for this structure is an embedded compact manifold with boundary. Since the underlying manifold structure is typically unknown, identifying boundary points from the data distributed on the manifold is crucial for various applications. In this work, we…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 31 Pages, 5 figures

    MSC Class: 53-08; 53Z50

  2. arXiv:2405.16865  [pdf, other]

    q-bio.NC cs.LG stat.ML

    An Investigation of Conformal Isometry Hypothesis for Grid Cells

    Authors: Dehong Xu, Ruiqi Gao, Wen-Hao Zhang, Xue-Xin Wei, Ying Nian Wu

    Abstract: This paper investigates the conformal isometry hypothesis as a potential explanation for the emergence of hexagonal periodic patterns in the response maps of grid cells. The hypothesis posits that the activities of the population of grid cells form a high-dimensional vector in the neural space, representing the agent's self-position in 2D physical space. As the agent moves in the 2D physical space…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2310.19192

  3. arXiv:2405.16852  [pdf, other]

    cs.LG cs.AI stat.ML

    EM Distillation for One-step Diffusion Models

    Authors: Sirui Xie, Zhisheng Xiao, Diederik P Kingma, Tingbo Hou, Ying Nian Wu, Kevin Patrick Murphy, Tim Salimans, Ben Poole, Ruiqi Gao

    Abstract: While diffusion models can learn complex distributions, sampling requires a computationally expensive iterative process. Existing distillation methods enable efficient sampling, but have notable limitations, such as performance degradation with very few sampling steps, reliance on training data access, or mode-seeking optimization that may fail to capture the full distribution. We propose EM Disti…

    Submitted 27 May, 2024; originally announced May 2024.

  4. arXiv:2405.16730  [pdf, other]

    cs.LG cs.AI stat.AP

    Latent Energy-Based Odyssey: Black-Box Optimization via Expanded Exploration in the Energy-Based Latent Space

    Authors: Peiyu Yu, Dinghuai Zhang, Hengzhi He, Xiaojian Ma, Ruiyao Miao, Yifan Lu, Yasi Zhang, Deqian Kong, Ruiqi Gao, Jianwen Xie, Guang Cheng, Ying Nian Wu

    Abstract: Offline Black-Box Optimization (BBO) aims at optimizing a black-box function using the knowledge from a pre-collected offline dataset of function values and corresponding input designs. However, the high-dimensional and highly-multimodal input design space of black-box function pose inherent challenges for most existing methods that model and operate directly upon input designs. These issues inclu…

    Submitted 26 May, 2024; originally announced May 2024.

  5. arXiv:2405.14018  [pdf, other]

    cs.CR cs.LG stat.AP

    Watermarking Generative Tabular Data

    Authors: Hengzhi He, Peiyu Yu, Junpeng Ren, Ying Nian Wu, Guang Cheng

    Abstract: In this paper, we introduce a simple yet effective tabular data watermarking mechanism with statistical guarantees. We show theoretically that the proposed watermark can be effectively detected, while faithfully preserving the data fidelity, and also demonstrates appealing robustness against additive noise attack. The general idea is to achieve the watermarking through a strategic embedding based…

    Submitted 22 May, 2024; originally announced May 2024.

  6. arXiv:2311.06212  [pdf, other]

    stat.ML cs.LG stat.AP

    Differentiable VQ-VAE's for Robust White Matter Streamline Encodings

    Authors: Andrew Lizarraga, Brandon Taraku, Edouardo Honig, Ying Nian Wu, Shantanu H. Joshi

    Abstract: Given the complex geometry of white matter streamlines, Autoencoders have been proposed as a dimension-reduction tool to simplify the analysis of streamlines in a low-dimensional latent space. However, despite these recent successes, the majority of encoder architectures only perform dimension reduction on single streamlines as opposed to a full bundle of streamlines. This is a severe limitation of…

    Submitted 18 November, 2023; v1 submitted 10 November, 2023; originally announced November 2023.

    Comments: 5 pages, 4 figures, 1 table

  7. arXiv:2310.19192  [pdf, other]

    q-bio.NC cs.LG stat.ML

    Emergence of Grid-like Representations by Training Recurrent Networks with Conformal Normalization

    Authors: Dehong Xu, Ruiqi Gao, Wen-Hao Zhang, Xue-Xin Wei, Ying Nian Wu

    Abstract: Grid cells in the entorhinal cortex of mammalian brains exhibit striking hexagon grid firing patterns in their response maps as the animal (e.g., a rat) navigates in a 2D open environment. In this paper, we study the emergence of the hexagon grid patterns of grid cells based on a general recurrent neural network (RNN) model that captures the navigation process. The responses of grid cells collecti…

    Submitted 19 February, 2024; v1 submitted 29 October, 2023; originally announced October 2023.

  8. arXiv:2310.03253  [pdf, other]

    cs.LG q-bio.BM stat.ML

    Molecule Design by Latent Prompt Transformer

    Authors: Deqian Kong, Yuhao Huang, Jianwen Xie, Ying Nian Wu

    Abstract: This paper proposes a latent prompt Transformer model for solving challenging optimization problems such as molecule design, where the goal is to find molecules with optimal values of a target chemical or biological property that can be computed by an existing software. Our proposed model consists of three components. (1) A latent vector whose prior distribution is modeled by a Unet transformation…

    Submitted 5 February, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

  9. arXiv:2310.03218  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning Energy-Based Prior Model with Diffusion-Amortized MCMC

    Authors: Peiyu Yu, Yaxuan Zhu, Sirui Xie, Xiaojian Ma, Ruiqi Gao, Song-Chun Zhu, Ying Nian Wu

    Abstract: Latent space Energy-Based Models (EBMs), also known as energy-based priors, have drawn growing interests in the field of generative modeling due to its flexibility in the formulation and strong modeling power of the latent space. However, the common practice of learning latent space EBMs with non-convergent short-run MCMC for prior and posterior sampling is hindering the model from further progres…

    Submitted 4 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023

  10. arXiv:2306.14902  [pdf, other]

    q-bio.BM cs.LG stat.ML

    Molecule Design by Latent Space Energy-Based Modeling and Gradual Distribution Shifting

    Authors: Deqian Kong, Bo Pang, Tian Han, Ying Nian Wu

    Abstract: Generation of molecules with desired chemical and biological properties such as high drug-likeness, high binding affinity to target proteins, is critical for drug discovery. In this paper, we propose a probabilistic generative model to capture the joint distribution of molecules and their properties. Our model assumes an energy-based model (EBM) in the latent space. Conditional on the latent vecto…

    Submitted 8 June, 2023; originally announced June 2023.

    Journal ref: 39th Conference on Uncertainty in Artificial Intelligence 2023

  11. arXiv:2303.11536  [pdf, other]

    cs.LG cs.AI cs.CV math.ST stat.ML

    Indeterminate Probability Neural Network

    Authors: Tao Yang, Chuang Liu, Xiaofeng Ma, Weijia Lu, Ning Wu, Bingyang Li, Zhifei Yang, Peng Liu, Lin Sun, Xiaodong Zhang, Can Zhang

    Abstract: We propose a new general model called IPNN - Indeterminate Probability Neural Network, which combines neural network and probability theory together. In the classical probability theory, the calculation of probability is based on the occurrence of events, which is hardly used in current neural networks. In this paper, we propose a new general probability theory, which is an extension of classical…

    Submitted 20 March, 2023; originally announced March 2023.

    Comments: 13 pages

  12. arXiv:2301.00729  [pdf, other]

    stat.ME cs.IT math.ST

    A Closed-Form EVSI Expression for a Multinomial Data-Generating Process

    Authors: Adam Fleischhacker, Pak-Wing Fok, Mokshay Madiman, Nan Wu

    Abstract: This paper derives analytic expressions for the expected value of sample information (EVSI), the expected value of distribution information (EVDI), and the optimal sample size when data consists of independent draws from a bounded sequence of integers. Due to challenges of creating tractable EVSI expressions, most existing work valuing data does so in one of three ways: 1) analytically through clo…

    Submitted 1 December, 2022; originally announced January 2023.

    Journal ref: Decision Analysis, Vol. 20, No. 1, pp. 73--84, March 2023

  13. arXiv:2210.02684  [pdf, other]

    q-bio.NC cs.LG stat.ML

    Conformal Isometry of Lie Group Representation in Recurrent Network of Grid Cells

    Authors: Dehong Xu, Ruiqi Gao, Wen-Hao Zhang, Xue-Xin Wei, Ying Nian Wu

    Abstract: The activity of the grid cell population in the medial entorhinal cortex (MEC) of the mammalian brain forms a vector representation of the self-position of the animal. Recurrent neural networks have been proposed to explain the properties of the grid cells by updating the neural activity vector based on the velocity input of the animal. In doing so, the grid cell system effectively performs path i…

    Submitted 7 November, 2022; v1 submitted 6 October, 2022; originally announced October 2022.

  14. arXiv:2110.07478  [pdf, other]

    stat.ML cs.LG

    Inferring Manifolds From Noisy Data Using Gaussian Processes

    Authors: David B Dunson, Nan Wu

    Abstract: In analyzing complex datasets, it is often of interest to infer lower dimensional structure underlying the higher dimensional observations. As a flexible class of nonlinear structures, it is common to focus on Riemannian manifolds. Most existing manifold learning algorithms replace the original data with lower dimensional coordinates without providing an estimate of the manifold in the observation…

    Submitted 24 May, 2024; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: 51 pages, 20 figures

  15. arXiv:2101.09875  [pdf, other]

    math.ST cs.LG stat.ML

    Eigen-convergence of Gaussian kernelized graph Laplacian by manifold heat interpolation

    Authors: Xiuyuan Cheng, Nan Wu

    Abstract: This work studies the spectral convergence of graph Laplacian to the Laplace-Beltrami operator when the graph affinity matrix is constructed from $N$ random samples on a $d$-dimensional manifold embedded in a possibly high dimensional space. By analyzing Dirichlet form convergence and constructing candidate approximate eigenfunctions via convolution with manifold heat kernel, we prove that, with G…

    Submitted 25 June, 2022; v1 submitted 24 January, 2021; originally announced January 2021.

  16. arXiv:2012.08125  [pdf, other]

    cs.LG stat.ML

    Learning Energy-Based Models by Diffusion Recovery Likelihood

    Authors: Ruiqi Gao, Yang Song, Ben Poole, Ying Nian Wu, Diederik P. Kingma

    Abstract: While energy-based models (EBMs) exhibit a number of desirable properties, training and sampling on high-dimensional datasets remains challenging. Inspired by recent progress on diffusion probabilistic models, we present a diffusion recovery likelihood method to tractably learn and sample from a sequence of EBMs trained on increasingly noisy versions of a dataset. Each EBM is trained with recovery…

    Submitted 27 March, 2021; v1 submitted 15 December, 2020; originally announced December 2020.

  17. arXiv:2010.07242  [pdf, other]

    stat.ME math.ST stat.ML

    Graph Based Gaussian Processes on Restricted Domains

    Authors: David B Dunson, Hau-Tieng Wu, Nan Wu

    Abstract: In nonparametric regression, it is common for the inputs to fall in a restricted subset of Euclidean space. Typical kernel-based methods that do not take into account the intrinsic geometry of the domain across which observations are collected may produce sub-optimal results. In this article, we focus on solving this problem in the context of Gaussian process (GP) models, proposing a new class of…

    Submitted 2 November, 2021; v1 submitted 14 October, 2020; originally announced October 2020.

    Comments: 38 pages, 15 figures

  18. arXiv:2007.06408  [pdf, ps, other]

    math.ST math.PR stat.ML

    Strong Uniform Consistency with Rates for Kernel Density Estimators with General Kernels on Manifolds

    Authors: Hau-Tieng Wu, Nan Wu

    Abstract: When analyzing modern machine learning algorithms, we may need to handle kernel density estimation (KDE) with intricate kernels that are not designed by the user and might even be irregular and asymmetric. To handle this emerging challenge, we provide a strong uniform consistency result with the $L^\infty$ convergence rate for KDE on Riemannian manifolds with Riemann integrable kernels (in the amb…

    Submitted 8 June, 2021; v1 submitted 13 July, 2020; originally announced July 2020.

    Comments: 50 pages

    MSC Class: 60F15; 62G07

  19. arXiv:2006.10259  [pdf, other]

    q-bio.NC cs.LG stat.ML

    On Path Integration of Grid Cells: Group Representation and Isotropic Scaling

    Authors: Ruiqi Gao, Jianwen Xie, Xue-Xin Wei, Song-Chun Zhu, Ying Nian Wu

    Abstract: Understanding how grid cells perform path integration calculations remains a fundamental problem. In this paper, we conduct theoretical analysis of a general representation model of path integration by grid cells, where the 2D self-position is encoded as a higher dimensional vector, and the 2D self-motion is represented by a general transformation of the vector. We identify two conditions on the t…

    Submitted 3 November, 2021; v1 submitted 17 June, 2020; originally announced June 2020.

  20. arXiv:2006.08205  [pdf, other]

    stat.ML cs.LG

    Learning Latent Space Energy-Based Prior Model

    Authors: Bo Pang, Tian Han, Erik Nijkamp, Song-Chun Zhu, Ying Nian Wu

    Abstract: We propose to learn energy-based model (EBM) in the latent space of a generator model, so that the EBM serves as a prior model that stands on the top-down network of the generator model. Both the latent space EBM and the top-down network can be learned jointly by maximum likelihood, which involves short-run MCMC sampling from both the prior and posterior distributions of the latent vector. Due to…

    Submitted 29 October, 2020; v1 submitted 15 June, 2020; originally announced June 2020.

    Comments: NeurIPS 2020 Camera-Ready

  21. arXiv:2006.06897  [pdf, other]

    stat.ML cs.LG

    MCMC Should Mix: Learning Energy-Based Model with Neural Transport Latent Space MCMC

    Authors: Erik Nijkamp, Ruiqi Gao, Pavel Sountsov, Srinivas Vasudevan, Bo Pang, Song-Chun Zhu, Ying Nian Wu

    Abstract: Learning energy-based model (EBM) requires MCMC sampling of the learned model as an inner loop of the learning algorithm. However, MCMC sampling of EBMs in high-dimensional data space is generally not mixing, because the energy function, which is usually parametrized by a deep network, is highly multi-modal in the data space. This is a serious handicap for both theory and practice of EBMs. In this…

    Submitted 16 March, 2022; v1 submitted 11 June, 2020; originally announced June 2020.

  22. arXiv:2006.06649  [pdf, other]

    stat.ML cs.AI cs.CV cs.LG

    Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning

    Authors: Qing Li, Siyuan Huang, Yining Hong, Yixin Chen, Ying Nian Wu, Song-Chun Zhu

    Abstract: The goal of neural-symbolic computation is to integrate the connectionist and symbolist paradigms. Prior methods learn the neural-symbolic models using reinforcement learning (RL) approaches, which ignore the error propagation in the symbolic reasoning module and thus converge slowly with sparse rewards. In this paper, we address these issues and close the loop of neural-symbolic learning by (1) i…

    Submitted 27 July, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: ICML 2020. Project page: https://liqing-ustc.github.io/NGS

  23. Data-driven Efficient Solvers for Langevin Dynamics on Manifold in High Dimensions

    Authors: Yuan Gao, Jian-Guo Liu, Nan Wu

    Abstract: We study the Langevin dynamics of a physical system with manifold structure $\mathcal{M}\subset\mathbb{R}^p$ based on collected sample points $\{\mathsf{x}_i\}_{i=1}^n \subset \mathcal{M}$ that probe the unknown manifold $\mathcal{M}$. Through the diffusion map, we first learn the reaction coordinates $\{\mathsf{y}_i\}_{i=1}^n\subset \mathcal{N}$ corresponding to $\{\mathsf{x}_i\}_{i=1}^n$, where…

    Submitted 27 September, 2022; v1 submitted 22 May, 2020; originally announced May 2020.

    Comments: 53 pages, 12 figures

    Journal ref: Applied and Computational Harmonic Analysis 62 (2023) 261-309

  24. arXiv:2003.08500  [pdf, ps, other]

    cs.LG cs.CR eess.SP math.OC stat.ML

    The Cost of Privacy in Asynchronous Differentially-Private Machine Learning

    Authors: Farhad Farokhi, Nan Wu, David Smith, Mohamed Ali Kaafar

    Abstract: We consider training machine learning models using training data located on multiple private and geographically-scattered servers with different privacy settings. Due to the distributed nature of the data, communicating with all collaborating private data owners simultaneously may prove challenging or altogether impossible. In this paper, we develop differentially-private asynchronous algorithms f…

    Submitted 29 June, 2020; v1 submitted 18 March, 2020; originally announced March 2020.

  25. arXiv:2002.07613  [pdf, other]

    cs.CV cs.LG eess.IV stat.ML

    An interpretable classifier for high-resolution breast cancer screening images utilizing weakly supervised localization

    Authors: Yiqiu Shen, Nan Wu, Jason Phang, Jungkyu Park, Kangning Liu, Sudarshini Tyagi, Laura Heacock, S. Gene Kim, Linda Moy, Kyunghyun Cho, Krzysztof J. Geras

    Abstract: Medical images differ from natural images in significantly higher resolutions and smaller regions of interest. Because of these differences, neural network architectures that work well for natural images might not be applicable to medical image analysis. In this work, we extend the globally-aware multiple instance classifier, a framework we proposed to address these unique properties of medical im…

    Submitted 13 February, 2020; originally announced February 2020.

  26. arXiv:2001.08317  [pdf, other]

    cs.LG stat.ML

    Deep Transformer Models for Time Series Forecasting: The Influenza Prevalence Case

    Authors: Neo Wu, Bradley Green, Xue Ben, Shawn O'Banion

    Abstract: In this paper, we present a new approach to time series forecasting. Time series data are prevalent in many scientific and engineering disciplines. Time series forecasting is a crucial task in modeling time series data, and is an important area of machine learning. In this work we developed a novel method that employs Transformer-based machine learning models to forecast time series data. This app…

    Submitted 22 January, 2020; originally announced January 2020.

    Comments: 10 pages, 7 figures

  27. arXiv:1912.01909  [pdf, other]

    stat.ML cs.LG

    Learning Multi-layer Latent Variable Model via Variational Optimization of Short Run MCMC for Approximate Inference

    Authors: Erik Nijkamp, Bo Pang, Tian Han, Linqi Zhou, Song-Chun Zhu, Ying Nian Wu

    Abstract: This paper studies the fundamental problem of learning deep generative models that consist of multiple layers of latent variables organized in top-down architectures. Such models have high expressivity and allow for learning hierarchical representations. Learning such a generative model requires inferring the latent variables for each training example based on the posterior distribution of these l…

    Submitted 17 July, 2020; v1 submitted 4 December, 2019; originally announced December 2019.

  28. arXiv:1912.00589  [pdf, other]

    stat.ML cs.CV cs.LG

    Flow Contrastive Estimation of Energy-Based Models

    Authors: Ruiqi Gao, Erik Nijkamp, Diederik P. Kingma, Zhen Xu, Andrew M. Dai, Ying Nian Wu

    Abstract: This paper studies a training method to jointly estimate an energy-based model and a flow-based model, in which the two models are iteratively updated based on a shared adversarial value function. This joint training method has the following traits. (1) The update of the energy-based model is based on noise contrastive estimation, with the flow model serving as a strong noise distribution. (2) The…

    Submitted 1 April, 2020; v1 submitted 2 December, 2019; originally announced December 2019.

  29. arXiv:1911.11374  [pdf, other]

    stat.ML cs.LG

    Representation Learning: A Statistical Perspective

    Authors: Jianwen Xie, Ruiqi Gao, Erik Nijkamp, Song-Chun Zhu, Ying Nian Wu

    Abstract: Learning representations of data is an important problem in statistics and machine learning. While the origin of learning representations can be traced back to factor analysis and multidimensional scaling in statistics, it has become a central theme in deep learning with important applications in computer vision and computational neuroscience. In this article, we review recent advances in learning…

    Submitted 26 November, 2019; originally announced November 2019.

    Journal ref: Annual Review of Statistics and Its Application 2020

  30. arXiv:1911.08459  [pdf, other]

    stat.ML cs.LG

    Deep Unsupervised Clustering with Clustered Generator Model

    Authors: Dandan Zhu, Tian Han, Linqi Zhou, Xiaokang Yang, Ying Nian Wu

    Abstract: This paper addresses the problem of unsupervised clustering which remains one of the most fundamental challenges in machine learning and artificial intelligence. We propose the clustered generator model for clustering which contains both continuous and discrete latent variables. Discrete latent variables model the cluster label while the continuous ones model variations within each cluster. The le…

    Submitted 19 November, 2019; originally announced November 2019.

  31. arXiv:1909.04324  [pdf, other]

    cs.CV cs.LG eess.IV stat.ML

    Inducing Hierarchical Compositional Model by Sparsifying Generator Network

    Authors: Xianglei Xing, Tianfu Wu, Song-Chun Zhu, Ying Nian Wu

    Abstract: This paper proposes to learn hierarchical compositional AND-OR model for interpretable image synthesis by sparsifying the generator network. The proposed method adopts the scene-objects-parts-subparts-primitives hierarchy in image representation. A scene has different types (i.e., OR) each of which consists of a number of objects (i.e., AND). This can be recursively formulated across the scene-obj…

    Submitted 20 June, 2020; v1 submitted 10 September, 2019; originally announced September 2019.

    Comments: This is the CVPR version

  32. arXiv:1909.00337  [pdf, other]

    stat.ML cs.LG q-bio.GN

    Neural Architecture Search for Joint Optimization of Predictive Power and Biological Knowledge

    Authors: Zijun Zhang, Linqi Zhou, Liangke Gou, Ying Nian Wu

    Abstract: We report a neural architecture search framework, BioNAS, that is tailored for biomedical researchers to easily build, evaluate, and uncover novel knowledge from interpretable deep learning models. The introduction of knowledge dissimilarity functions in BioNAS enables the joint optimization of predictive power and biological knowledge through searching architectures in a model space. By optimizin…

    Submitted 1 September, 2019; originally announced September 2019.

    Comments: 13 pages, 4 figures

  33. arXiv:1908.00615  [pdf, other]

    eess.IV cs.CV stat.ML

    Improving localization-based approaches for breast cancer screening exam classification

    Authors: Thibault Févry, Jason Phang, Nan Wu, S. Gene Kim, Linda Moy, Kyunghyun Cho, Krzysztof J. Geras

    Abstract: We trained and evaluated a localization-based deep CNN for breast cancer screening exam classification on over 200,000 exams (over 1,000,000 images). Our model achieves an AUC of 0.919 in predicting malignancy in patients undergoing breast cancer screening, reducing the error rate of the baseline (Wu et al., 2019a) by 23%. In addition, the model generates bounding boxes for benign and malignant f…

    Submitted 1 August, 2019; originally announced August 2019.

    Comments: MIDL 2019 [arXiv:1907.08612]

    Report number: MIDL/2019/ExtendedAbstract/HyxoAR_AK4

  34. arXiv:1907.13057  [pdf, other]

    eess.IV cs.CV cs.LG stat.ML

    Screening Mammogram Classification with Prior Exams

    Authors: Jungkyu Park, Jason Phang, Yiqiu Shen, Nan Wu, S. Gene Kim, Linda Moy, Kyunghyun Cho, Krzysztof J. Geras

    Abstract: Radiologists typically compare a patient's most recent breast cancer screening exam to their previous ones in making informed diagnoses. To reflect this practice, we propose new neural network models that compare pairs of screening mammograms from the same patient. We train and evaluate our proposed models on over 665,000 pairs of images (over 166,000 pairs of exams). Our best model achieves an AU…

    Submitted 30 July, 2019; originally announced July 2019.

    Comments: MIDL 2019 [arXiv:1907.08612]

    Report number: MIDL/2019/ExtendedAbstract/HkgCdUaMq4

  35. arXiv:1906.09679  [pdf, ps, other]

    cs.CR cs.LG stat.ML

    The Value of Collaboration in Convex Machine Learning with Differential Privacy

    Authors: Nan Wu, Farhad Farokhi, David Smith, Mohamed Ali Kaafar

    Abstract: In this paper, we apply machine learning to distributed private data owned by multiple data owners, entities with access to non-overlapping training datasets. We use noisy, differentially-private gradients to minimize the fitness cost of the machine learning model using stochastic gradient descent. We quantify the quality of the trained model, using the fitness cost, as a function of privacy budge…

    Submitted 23 June, 2019; originally announced June 2019.

    Comments: Accepted in IEEE S&P 2020

    Journal ref: IEEE Symposium on Security and Privacy 2020 (IEEE SP 2020)

  36. arXiv:1906.02846  [pdf, other]

    cs.LG eess.IV stat.ML

    Globally-Aware Multiple Instance Classifier for Breast Cancer Screening

    Authors: Yiqiu Shen, Nan Wu, Jason Phang, Jungkyu Park, Gene Kim, Linda Moy, Kyunghyun Cho, Krzysztof J. Geras

    Abstract: Deep learning models designed for visual classification tasks on natural images have become prevalent in medical image analysis. However, medical images differ from typical natural images in many ways, such as significantly higher resolutions and smaller regions of interest. Moreover, both the global structure and local details play important roles in medical image analysis tasks. To address these…

    Submitted 19 August, 2019; v1 submitted 6 June, 2019; originally announced June 2019.

    Comments: Accepted to MLMI 2019

  37. arXiv:1904.09770  [pdf, other]

    stat.ML cs.LG

    Learning Non-Convergent Non-Persistent Short-Run MCMC Toward Energy-Based Model

    Authors: Erik Nijkamp, Mitch Hill, Song-Chun Zhu, Ying Nian Wu

    Abstract: This paper studies a curious phenomenon in learning energy-based model (EBM) using MCMC. In each learning iteration, we generate synthesized examples by running a non-convergent, non-mixing, and non-persistent short-run MCMC toward the current model, always starting from the same initial distribution such as uniform noise distribution, and always running a fixed number of MCMC steps. After generat…

    Submitted 25 November, 2019; v1 submitted 22 April, 2019; originally announced April 2019.

  38. arXiv:1904.05453  [pdf, other]

    cs.LG stat.ML

    Energy-Based Continuous Inverse Optimal Control

    Authors: Yifei Xu, Jianwen Xie, Tianyang Zhao, Chris Baker, Yibiao Zhao, Ying Nian Wu

    Abstract: The problem of continuous inverse optimal control (over finite time horizon) is to learn the unknown cost function over the sequence of continuous control variables from expert demonstrations. In this article, we study this fundamental problem in the framework of energy-based model, where the observed expert trajectories are assumed to be random samples from a probability density function defined…

    Submitted 18 April, 2022; v1 submitted 10 April, 2019; originally announced April 2019.

  39. arXiv:1903.12370  [pdf, other]

    stat.ML cs.CV cs.LG

    On the Anatomy of MCMC-Based Maximum Likelihood Learning of Energy-Based Models

    Authors: Erik Nijkamp, Mitch Hill, Tian Han, Song-Chun Zhu, Ying Nian Wu

    Abstract: This study investigates the effects of Markov chain Monte Carlo (MCMC) sampling in unsupervised Maximum Likelihood (ML) learning. Our attention is restricted to the family of unnormalized probability densities for which the negative log density (or energy function) is a ConvNet. We find that many of the techniques used to stabilize training in previous studies are not necessary. ML learning with a…

    Submitted 27 November, 2019; v1 submitted 29 March, 2019; originally announced March 2019.

    Comments: Code available at: https://github.com/point0bar1/ebm-anatomy

    Journal ref: AAAI 2020

  40. arXiv:1903.08297  [pdf, other]

    cs.LG cs.CV stat.ML

    Deep Neural Networks Improve Radiologists' Performance in Breast Cancer Screening

    Authors: Nan Wu, Jason Phang, Jungkyu Park, Yiqiu Shen, Zhe Huang, Masha Zorin, Stanisław Jastrzębski, Thibault Févry, Joe Katsnelson, Eric Kim, Stacey Wolfson, Ujas Parikh, Sushma Gaddam, Leng Leng Young Lin, Kara Ho, Joshua D. Weinstein, Beatriu Reig, Yiming Gao, Hildegard Toth, Kristine Pysarenko, Alana Lewin, Jiyon Lee, Krystal Airola, Eralda Mema, Stephanie Chung, et al. (7 additional authors not shown)

    Abstract: We present a deep convolutional neural network for breast cancer screening exam classification, trained and evaluated on over 200,000 exams (over 1,000,000 images). Our network achieves an AUC of 0.895 in predicting whether there is a cancer in the breast, when tested on the screening population. We attribute the high accuracy of our model to a two-stage training procedure, which allows us to use…

    Submitted 19 March, 2019; originally announced March 2019.

    Comments: MIDL 2019 [arXiv:1907.08612]

    Report number: MIDL/2019/ExtendedAbstract/SkxYez76FE

  41. arXiv:1902.03871  [pdf, other]

    cs.NE cs.CV cs.LG stat.ML

    Learning V1 Simple Cells with Vector Representation of Local Content and Matrix Representation of Local Motion

    Authors: Ruiqi Gao, Jianwen Xie, Siyuan Huang, Yufan Ren, Song-Chun Zhu, Ying Nian Wu

    Abstract: This paper proposes a representational model for image pairs such as consecutive video frames that are related by local pixel displacements, in the hope that the model may shed light on motion perception in primary visual cortex (V1). The model couples the following two components: (1) the vector representations of local contents of images and (2) the matrix representations of local pixel displace…

    Submitted 5 April, 2022; v1 submitted 24 January, 2019; originally announced February 2019.

  42. arXiv:1902.02812  [pdf, other]

    stat.ML cs.LG

    Cooperative Training of Fast Thinking Initializer and Slow Thinking Solver for Conditional Learning

    Authors: Jianwen Xie, Zilong Zheng, Xiaolin Fang, Song-Chun Zhu, Ying Nian Wu

    Abstract: This paper studies the problem of learning the conditional distribution of a high-dimensional output given an input, where the output and input may belong to two different domains, e.g., the output is a photo image and the input is a sketch image. We solve this problem by cooperative training of a fast thinking initializer and slow thinking solver. The initializer generates the output directly by…

    Submitted 7 April, 2021; v1 submitted 7 February, 2019; originally announced February 2019.

    Comments: 16 pages

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2021

  43. arXiv:1901.07538  [pdf, other]

    cs.LG cs.AI stat.ML

    Unsupervised Learning of Neural Networks to Explain Neural Networks (extended abstract)

    Authors: Quanshi Zhang, Yu Yang, Ying Nian Wu

    Abstract: This paper presents an unsupervised method to learn a neural network, namely an explainer, to interpret a pre-trained convolutional neural network (CNN), i.e., the explainer uses interpretable visual concepts to explain features in middle conv-layers of a CNN. Given feature maps of a conv-layer of the CNN, the explainer performs like an auto-encoder, which decomposes the feature maps into object-p…

    Submitted 21 January, 2019; originally announced January 2019.

    Comments: In AAAI-19 Workshop on Network Interpretability for Deep Learning. arXiv admin note: substantial text overlap with arXiv:1805.07468

  44. arXiv:1901.06978  [pdf, other]

    cs.LG stat.ML

    Network Transplanting (extended abstract)

    Authors: Quanshi Zhang, Yu Yang, Qian Yu, Ying Nian Wu

    Abstract: This paper focuses on a new task, i.e., transplanting a category-and-task-specific neural network to a generic, modular network without strong supervision. We design a functionally interpretable structure for the generic network. Like building LEGO blocks, we teach the generic network a new category by directly transplanting the module corresponding to the category from a pre-trained network with…

    Submitted 21 January, 2019; originally announced January 2019.

    Comments: In AAAI-19 Workshop on Network Interpretability for Deep Learning. arXiv admin note: substantial text overlap with arXiv:1804.10272

  45. arXiv:1901.02413  [pdf, other]

    cs.LG cs.CV stat.ML

    Interpretable CNNs for Object Classification

    Authors: Quanshi Zhang, Xin Wang, Ying Nian Wu, Huilin Zhou, Song-Chun Zhu

    Abstract: This paper proposes a generic method to learn interpretable convolutional filters in a deep convolutional neural network (CNN) for object classification, where each interpretable filter encodes features of a specific object part. Our method does not require additional annotations of object parts or textures for supervision. Instead, we use the same training data as traditional CNNs. Our method aut…

    Submitted 12 March, 2020; v1 submitted 8 January, 2019; originally announced January 2019.

  46. arXiv:1812.10907  [pdf, other]

    stat.ML cs.CV cs.LG

    Divergence Triangle for Joint Training of Generator Model, Energy-based Model, and Inference Model

    Authors: Tian Han, Erik Nijkamp, Xiaolin Fang, Mitch Hill, Song-Chun Zhu, Ying Nian Wu

    Abstract: This paper proposes the divergence triangle as a framework for joint training of generator model, energy-based model and inference model. The divergence triangle is a compact and symmetric (anti-symmetric) objective function that seamlessly integrates variational learning, adversarial learning, wake-sleep algorithm, and contrastive divergence in a unified probabilistic formulation. This unificatio…

    Submitted 31 January, 2019; v1 submitted 28 December, 2018; originally announced December 2018.

  47. arXiv:1812.10587  [pdf, other]

    stat.ML cs.CV cs.LG

    Learning Dynamic Generator Model by Alternating Back-Propagation Through Time

    Authors: Jianwen Xie, Ruiqi Gao, Zilong Zheng, Song-Chun Zhu, Ying Nian Wu

    Abstract: This paper studies the dynamic generator model for spatial-temporal processes such as dynamic textures and action sequences in video data. In this model, each time frame of the video sequence is generated by a generator model, which is a non-linear transformation of a latent state vector, where the non-linear transformation is parametrized by a top-down neural network. The sequence of latent state…

    Submitted 26 December, 2018; originally announced December 2018.

    Comments: 10 pages

    Journal ref: The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI) 2019

  48. arXiv:1811.01483  [pdf, other]

    cs.LG cs.AI stat.ML

    Contingency-Aware Exploration in Reinforcement Learning

    Authors: Jongwook Choi, Yijie Guo, Marcin Moczulski, Junhyuk Oh, Neal Wu, Mohammad Norouzi, Honglak Lee

    Abstract: This paper investigates whether learning contingency-awareness and controllable aspects of an environment can lead to better exploration in reinforcement learning. To investigate this question, we consider an instantiation of this hypothesis evaluated on the Arcade Learning Environment (ALE). In this study, we develop an attentive dynamics model (ADM) that discovers controllable elements of the observ…

    Submitted 4 March, 2019; v1 submitted 4 November, 2018; originally announced November 2018.

    Comments: In ICLR 2019

  49. arXiv:1810.05597  [pdf, other]

    stat.ML cs.LG cs.NE

    Learning Grid Cells as Vector Representation of Self-Position Coupled with Matrix Representation of Self-Motion

    Authors: Ruiqi Gao, Jianwen Xie, Song-Chun Zhu, Ying Nian Wu

    Abstract: This paper proposes a representational model for grid cells. In this model, the 2D self-position of the agent is represented by a high-dimensional vector, and the 2D self-motion or displacement of the agent is represented by a matrix that transforms the vector. Each component of the vector is a unit or a cell. The model consists of the following three sub-models. (1) Vector-matrix multiplication.…

    Submitted 24 May, 2019; v1 submitted 12 October, 2018; originally announced October 2018.

  50. arXiv:1810.04261  [pdf, other]

    stat.ML cs.CV cs.LG

    A Tale of Three Probabilistic Families: Discriminative, Descriptive and Generative Models

    Authors: Ying Nian Wu, Ruiqi Gao, Tian Han, Song-Chun Zhu

    Abstract: The pattern theory of Grenander is a mathematical framework where patterns are represented by probability models on random variables of algebraic structures. In this paper, we review three families of probability models, namely, the discriminative models, the descriptive models, and the generative models. A discriminative model is in the form of a classifier. It specifies the conditional probabili…

    Submitted 3 December, 2018; v1 submitted 9 October, 2018; originally announced October 2018.