Showing 1–50 of 94 results for author: Song, J

  1. arXiv:2405.19704  [pdf, other]

    stat.ML cs.LG stat.ME

    Enhancing Sufficient Dimension Reduction via Hellinger Correlation

    Authors: Seungbeom Hong, Ilmun Kim, Jun Song

    Abstract: In this work, we develop a new theory and method for sufficient dimension reduction (SDR) in single-index models, where SDR is a sub-field of supervised dimension reduction based on conditional independence. Our work is primarily motivated by the recent introduction of the Hellinger correlation as a dependency measure. Utilizing this measure, we develop a method capable of effectively detecting th…

    Submitted 30 May, 2024; originally announced May 2024.

  2. arXiv:2404.03272  [pdf, other]

    cs.LG cs.CC cs.CR math.ST stat.ML

    Cryptographic Hardness of Score Estimation

    Authors: Min Jae Song

    Abstract: We show that $L^2$-accurate score estimation, in the absence of strong assumptions on the data distribution, is computationally hard even when sample complexity is polynomial in the relevant problem parameters. Our reduction builds on the result of Chen et al. (ICLR 2023), who showed that the problem of generating samples from an unknown data distribution reduces to $L^2$-accurate score estimation…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: 28 pages

  3. arXiv:2403.13118  [pdf, other]

    stat.ME cs.LG math.DS math.SP stat.ML

    Modal Analysis of Spatiotemporal Data via Multivariate Gaussian Process Regression

    Authors: Jiwoo Song, Daning Huang

    Abstract: Modal analysis has become an essential tool to understand the coherent structure of complex flows. The classical modal analysis methods, such as dynamic mode decomposition (DMD) and spectral proper orthogonal decomposition (SPOD), rely on a sufficient amount of data that is regularly sampled in time. However, often one needs to deal with sparse temporally irregular data, e.g., due to experimental…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: 43 pages, 35 figures

  4. arXiv:2402.15734  [pdf, other]

    cs.LG stat.ML

    Data-Efficient Operator Learning via Unsupervised Pretraining and In-Context Learning

    Authors: Wuyang Chen, Jialin Song, Pu Ren, Shashank Subramanian, Dmitriy Morozov, Michael W. Mahoney

    Abstract: Recent years have witnessed the promise of coupling machine learning methods and physical domain-specific insights for solving scientific problems based on partial differential equations (PDEs). However, being data-intensive, these methods still require a large amount of PDE data. This reintroduces the need for expensive numerical PDE solutions, partially undermining the original goal of avoiding t…

    Submitted 13 June, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

  5. arXiv:2310.13911  [pdf, other]

    stat.ME stat.AP

    Multilevel Matrix Factor Model

    Authors: Yuteng Zhang, Yongchang Hui, Junrong Song, Shurong Zheng

    Abstract: Large-scale matrix data has been widely discovered and continuously studied in various fields recently. Considering the multi-level factor structure and utilizing the matrix structure, we propose a multilevel matrix factor model with both global and local factors. The global factors can affect all matrix time series, whereas the local factors are only allowed to affect within each specific matrix t…

    Submitted 21 October, 2023; originally announced October 2023.

    Comments: 47 pages, 22 figures

  6. arXiv:2310.10232  [pdf, other]

    stat.AP stat.CO

    Efficient seismic reliability and fragility analysis of lifeline networks using subset simulation

    Authors: Dongkyu Lee, Ziqi Wang, Junho Song

    Abstract: Various simulation-based and analytical methods have been developed to evaluate the seismic fragilities of individual structures. However, a community's seismic safety and resilience are substantially affected by network reliability, determined not only by component fragilities but also by network topology and commodity/information flows. However, seismic reliability analyses of networks often enc…

    Submitted 16 October, 2023; originally announced October 2023.

  7. arXiv:2310.04861  [pdf, other]

    cs.LG cs.AI stat.ML

    Uncovering hidden geometry in Transformers via disentangling position and context

    Authors: Jiajun Song, Yiqiao Zhong

    Abstract: Transformers are widely used to extract semantic meanings from input tokens, yet they usually operate as black-box models. In this paper, we present a simple yet informative decomposition of hidden states (or embeddings) of trained transformers into interpretable components. For any layer, embedding vectors of input sequence samples are represented by a tensor…

    Submitted 3 February, 2024; v1 submitted 7 October, 2023; originally announced October 2023.

    Comments: 38 pages, 34 figures

  8. arXiv:2307.13371  [pdf, other]

    cs.LG stat.ML

    Learning Regions of Interest for Bayesian Optimization with Adaptive Level-Set Estimation

    Authors: Fengxue Zhang, Jialin Song, James Bowden, Alexander Ladd, Yisong Yue, Thomas A. Desautels, Yuxin Chen

    Abstract: We study Bayesian optimization (BO) in high-dimensional and non-stationary scenarios. Existing algorithms for such scenarios typically require extensive hyperparameter tuning, which limits their practical effectiveness. We propose a framework, called BALLET, which adaptively filters for a high-confidence region of interest (ROI) as a superlevel-set of a nonparametric probabilistic model such as a…

    Submitted 25 July, 2023; originally announced July 2023.

  9. arXiv:2305.04391  [pdf, other]

    cs.LG cs.CV math.NA stat.ML

    A Variational Perspective on Solving Inverse Problems with Diffusion Models

    Authors: Morteza Mardani, Jiaming Song, Jan Kautz, Arash Vahdat

    Abstract: Diffusion models have emerged as a key pillar of foundation models in visual domains. One of their critical applications is to universally solve different downstream inverse tasks via a single diffusion prior without re-training for each task. Most inverse tasks can be formulated as inferring a posterior distribution over data (e.g., a full image) given a measurement (e.g., a masked image). This i…

    Submitted 29 September, 2023; v1 submitted 7 May, 2023; originally announced May 2023.

  10. arXiv:2304.13836  [pdf, other]

    cs.LG cs.AI cs.CV stat.ME

    On Pitfalls of $\textit{RemOve-And-Retrain}$: Data Processing Inequality Perspective

    Authors: Junhwa Song, Keumgang Cha, Junghoon Seo

    Abstract: Approaches for appraising feature importance approximations, alternatively referred to as attribution methods, have been established across an extensive array of contexts. The development of resilient techniques for performance benchmarking constitutes a critical concern in the sphere of explainable deep learning. This study scrutinizes the dependability of the RemOve-And-Retrain (ROAR) procedure,…

    Submitted 10 May, 2023; v1 submitted 26 April, 2023; originally announced April 2023.

    Comments: Code: https://github.com/SIAnalytics/roar

  11. Adaptive active subspace-based metamodeling for high-dimensional reliability analysis

    Authors: Jungho Kim, Ziqi Wang, Junho Song

    Abstract: To address the challenges of reliability analysis in high-dimensional probability spaces, this paper proposes a new metamodeling method that couples active subspace, heteroscedastic Gaussian process, and active learning. The active subspace is leveraged to identify low-dimensional salient features of a high-dimensional computational model. A surrogate computational model is built in the low-dimens…

    Submitted 13 April, 2023; originally announced April 2023.

  12. arXiv:2302.07400  [pdf, other]

    cs.LG math.FA stat.ML

    Score-based Diffusion Models in Function Space

    Authors: Jae Hyun Lim, Nikola B. Kovachki, Ricardo Baptista, Christopher Beckham, Kamyar Azizzadenesheli, Jean Kossaifi, Vikram Voleti, Jiaming Song, Karsten Kreis, Jan Kautz, Christopher Pal, Arash Vahdat, Anima Anandkumar

    Abstract: Diffusion models have recently emerged as a powerful framework for generative modeling. They consist of a forward process that perturbs input data with Gaussian white noise and a reverse process that learns a score function to generate samples by denoising. Despite their tremendous success, they are mostly formulated on finite-dimensional spaces, e.g. Euclidean, limiting their applications to many…

    Submitted 22 November, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

    Comments: 52 pages

    MSC Class: 46B09 (Primary); 60J22 (Secondary) ACM Class: I.2.6; J.2

  13. arXiv:2210.15651  [pdf, other]

    cs.LG math.OC stat.ML

    Learning Single-Index Models with Shallow Neural Networks

    Authors: Alberto Bietti, Joan Bruna, Clayton Sanford, Min Jae Song

    Abstract: Single-index models are a class of functions given by an unknown univariate ``link'' function applied to an unknown one-dimensional projection of the input. These models are particularly relevant in high dimension, when the data might present low-dimensional structure that learning algorithms should adapt to. While several statistical aspects of this model, such as the sample complexity of recover…

    Submitted 27 October, 2022; originally announced October 2022.

    Comments: 76 pages. To appear at NeurIPS 2022

  14. arXiv:2206.13035  [pdf, other]

    cs.LG cs.AI stat.ML

    A General Recipe for Likelihood-free Bayesian Optimization

    Authors: Jiaming Song, Lantao Yu, Willie Neiswanger, Stefano Ermon

    Abstract: The acquisition function, a critical component in Bayesian optimization (BO), can often be written as the expectation of a utility function under a surrogate model. However, to ensure that acquisition functions are tractable to optimize, restrictions must be placed on the surrogate model and utility function. To extend BO to a broader class of models and utilities, we propose likelihood-free BO (L…

    Submitted 6 October, 2022; v1 submitted 26 June, 2022; originally announced June 2022.

    Comments: ICML 2022. This version fixes a typo in eq 33

  15. arXiv:2206.05253  [pdf, other]

    cs.CV cs.AI cs.LG stat.AP

    Rethinking Spatial Invariance of Convolutional Networks for Object Counting

    Authors: Zhi-Qi Cheng, Qi Dai, Hong Li, JingKuan Song, Xiao Wu, Alexander G. Hauptmann

    Abstract: Previous work generally believes that improving the spatial invariance of convolutional networks is the key to object counting. However, after verifying several mainstream counting networks, we surprisingly found too strict pixel-level spatial invariance would cause overfit noise in the density map generation. In this paper, we try to use locally connected Gaussian kernels to replace the original…

    Submitted 18 August, 2022; v1 submitted 10 June, 2022; originally announced June 2022.

    Comments: Accepted to CVPR 2022, Code: https://github.com/zhiqic/Rethinking-Counting

  16. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  17. arXiv:2112.03898  [pdf, ps, other]

    cs.LG cs.CC cs.DS math.ST stat.ML

    Lattice-Based Methods Surpass Sum-of-Squares in Clustering

    Authors: Ilias Zadik, Min Jae Song, Alexander S. Wein, Joan Bruna

    Abstract: Clustering is a fundamental primitive in unsupervised learning which gives rise to a rich class of computationally-challenging inference tasks. In this work, we focus on the canonical task of clustering d-dimensional Gaussian mixtures with unknown (and possibly degenerate) covariance. Recent works (Ghosh et al. '20; Mao, Wein '21; Davis, Diaz, Wang '21) have established lower bounds against the cl…

    Submitted 7 January, 2022; v1 submitted 7 December, 2021; originally announced December 2021.

    Comments: Added a new tight information-theoretic lower bound for label recovery

  18. arXiv:2107.03502  [pdf, other]

    cs.LG stat.ML

    CSDI: Conditional Score-based Diffusion Models for Probabilistic Time Series Imputation

    Authors: Yusuke Tashiro, Jiaming Song, Yang Song, Stefano Ermon

    Abstract: The imputation of missing values in time series has many applications in healthcare and finance. While autoregressive models are natural candidates for time series imputation, score-based diffusion models have recently outperformed existing counterparts including autoregressive models in many tasks such as image generation and audio synthesis, and would be promising for time series imputation. In…

    Submitted 27 October, 2021; v1 submitted 7 July, 2021; originally announced July 2021.

    Comments: NeurIPS 2021

  19. arXiv:2107.02146  [pdf, other]

    stat.ME math.ST q-bio.NC stat.AP stat.ML

    Multivariate functional group sparse regression: functional predictor selection

    Authors: Ali Mahzarnia, Jun Song

    Abstract: In this paper, we propose methods for functional predictor selection and the estimation of smooth functional coefficients simultaneously in a scalar-on-function regression problem under high-dimensional multivariate functional data setting. In particular, we develop two methods for functional group-sparse regression under a generic Hilbert space of infinite dimension. We show the convergence of al…

    Submitted 8 July, 2021; v1 submitted 5 July, 2021; originally announced July 2021.

    Comments: The R package that is developed for this paper is available at GitHub. See https://github.com/Ali-Mahzarnia/MFSGrp

  20. arXiv:2106.10744  [pdf, other]

    cs.LG cs.CC math.PR math.ST stat.ML

    On the Cryptographic Hardness of Learning Single Periodic Neurons

    Authors: Min Jae Song, Ilias Zadik, Joan Bruna

    Abstract: We show a simple reduction which demonstrates the cryptographic hardness of learning a single periodic neuron over isotropic Gaussian distributions in the presence of noise. More precisely, our reduction shows that any polynomial-time algorithm (not necessarily gradient-based) for learning such functions under small noise implies a polynomial-time quantum algorithm for solving worst-case lattice p…

    Submitted 16 September, 2021; v1 submitted 20 June, 2021; originally announced June 2021.

    Comments: 64 pages. Added more references, and a proof of the sample complexity lower bound

  21. arXiv:2106.09246  [pdf, other]

    cs.CV cs.LG stat.ML

    Federated CycleGAN for Privacy-Preserving Image-to-Image Translation

    Authors: Joonyoung Song, Jong Chul Ye

    Abstract: Unsupervised image-to-image translation methods such as CycleGAN learn to convert images from one domain to another using unpaired training data sets from different domains. Unfortunately, these approaches still require centrally collected unpaired records, potentially violating privacy and security issues. Although the recent federated learning (FL) allows a neural network to be trained without d…

    Submitted 17 June, 2021; originally announced June 2021.

  22. arXiv:2010.12810  [pdf, other]

    cs.LG stat.ML

    Autoregressive Score Matching

    Authors: Chenlin Meng, Lantao Yu, Yang Song, Jiaming Song, Stefano Ermon

    Abstract: Autoregressive models use chain rule to define a joint probability distribution as a product of conditionals. These conditionals need to be normalized, imposing constraints on the functional families that can be used. To increase flexibility, we propose autoregressive conditional score models (AR-CSM) where we parameterize the joint distribution in terms of the derivatives of univariate log-condit…

    Submitted 24 October, 2020; originally announced October 2020.

    Comments: NeurIPS 2020

  23. arXiv:2010.09808  [pdf, other]

    cs.LG cs.AI stat.ML

    Imitation with Neural Density Models

    Authors: Kuno Kim, Akshat Jindal, Yang Song, Jiaming Song, Yanan Sui, Stefano Ermon

    Abstract: We propose a new framework for Imitation Learning (IL) via density estimation of the expert's occupancy measure followed by Maximum Occupancy Entropy Reinforcement Learning (RL) using the density as a reward. Our approach maximizes a non-adversarial model-free RL objective that provably lower bounds reverse Kullback-Leibler divergence between occupancy measures of the expert and imitator. We prese…

    Submitted 19 October, 2020; originally announced October 2020.

  24. arXiv:2009.07368  [pdf, other]

    cs.LG cs.AI stat.ML

    Evaluating representations by the complexity of learning low-loss predictors

    Authors: William F. Whitney, Min Jae Song, David Brandfonbrener, Jaan Altosaar, Kyunghyun Cho

    Abstract: We consider the problem of evaluating representations of data for use in solving a downstream task. We propose to measure the quality of a representation by the complexity of learning a predictor on top of the representation that achieves low loss on a task of interest, and introduce two methods, surplus description length (SDL) and $\varepsilon$ sample complexity ($\varepsilon$SC). In contrast to…

    Submitted 5 February, 2021; v1 submitted 15 September, 2020; originally announced September 2020.

  25. arXiv:2008.09643  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Privacy Preserving Recalibration under Domain Shift

    Authors: Rachel Luo, Shengjia Zhao, Jiaming Song, Jonathan Kuck, Stefano Ermon, Silvio Savarese

    Abstract: Classifiers deployed in high-stakes real-world applications must output calibrated confidence scores, i.e. their predicted probabilities should reflect empirical frequencies. Recalibration algorithms can greatly improve a model's probability estimates; however, existing algorithms are not applicable in real-world situations where the test data follows a different distribution from the training dat…

    Submitted 21 August, 2020; originally announced August 2020.

  26. arXiv:2008.07902  [pdf, other]

    stat.ML cs.LG physics.geo-ph

    Bayesian geoacoustic inversion using mixture density network

    Authors: Guoli Wu, Hefeng Dong, Junqiang Song, Jingya Zhang

    Abstract: Bayesian geoacoustic inversion problems are conventionally solved by Markov chain Monte Carlo methods or their variants, which are computationally expensive. This paper extends the classic Bayesian geoacoustic inversion framework by deriving important geoacoustic statistics of Bayesian geoacoustic inversion from the multidimensional posterior probability density (PPD) using the mixture density netwo…

    Submitted 16 January, 2021; v1 submitted 18 August, 2020; originally announced August 2020.

  27. arXiv:2008.02365  [pdf, other]

    stat.ME

    Sequential change point test in the presence of outliers: the density power divergence based approach

    Authors: Junmo Song

    Abstract: In this study, we consider a problem of monitoring parameter changes particularly in the presence of outliers. To propose a sequential procedure that is robust against outliers, we use the density power divergence to derive a detector and stopping time that make up our procedure. We first investigate the asymptotic properties of our sequential procedure for i.i.d. sequences, and then extend the pr…

    Submitted 27 June, 2021; v1 submitted 5 August, 2020; originally announced August 2020.

  28. arXiv:2007.09852  [pdf, other]

    cs.LG stat.ML

    Multi-label Contrastive Predictive Coding

    Authors: Jiaming Song, Stefano Ermon

    Abstract: Variational mutual information (MI) estimators are widely used in unsupervised representation learning methods such as contrastive predictive coding (CPC). A lower bound on MI can be obtained from a multi-class classification problem, where a critic attempts to distinguish a positive sample drawn from the underlying joint distribution from $(m-1)$ negative samples drawn from a suitable proposal di…

    Submitted 2 December, 2020; v1 submitted 19 July, 2020; originally announced July 2020.

    Comments: Post camera-ready version. Reorganized the theorems in the last version as corollaries of more general theorems

  29. arXiv:2007.00295  [pdf, ps, other]

    cs.LG stat.ML

    Belief Propagation Neural Networks

    Authors: Jonathan Kuck, Shuvam Chakraborty, Hao Tang, Rachel Luo, Jiaming Song, Ashish Sabharwal, Stefano Ermon

    Abstract: Learned neural solvers have successfully been used to solve combinatorial optimization and decision problems. More general counting variants of these problems, however, are still largely solved with hand-crafted solvers. To bridge this gap, we introduce belief propagation neural networks (BPNNs), a class of parameterized operators that operate on factor graphs and generalize Belief Propagation (BP…

    Submitted 1 July, 2020; originally announced July 2020.

  30. arXiv:2006.07815  [pdf, other]

    cs.LG math.OC stat.ML

    Optimistic Distributionally Robust Policy Optimization

    Authors: Jun Song, Chaoyue Zhao

    Abstract: Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO), as the widely employed policy based reinforcement learning (RL) methods, are prone to converge to a sub-optimal solution as they limit the policy representation to a particular parametric distribution class. To address this issue, we develop an innovative Optimistic Distributionally Robust Policy Optimization (ODRPO) a…

    Submitted 14 June, 2020; originally announced June 2020.

  31. arXiv:2006.07363  [pdf, other]

    physics.comp-ph stat.AP

    Analysis, Design, and Generalization of Electrochemical Impedance Spectroscopy (EIS) Inversion Algorithms

    Authors: Surya Effendy, Juhyun Song, Martin Z. Bazant

    Abstract: We introduce a framework for analyzing and designing EIS inversion algorithms. Our framework stems from the observation of four features common to well-defined EIS inversion algorithms, namely (1) the representation of unknown distributions, (2) the minimization of a metric of error to estimate parameters arising from the chosen representation, subject to constraints on (3) the complexity control…

    Submitted 12 June, 2020; originally announced June 2020.

    Comments: 46 pages, to be submitted to the Journal of the Electrochemical Society

  32. arXiv:2005.09595  [pdf, other]

    cs.CC cs.DS cs.LG stat.ML

    Continuous LWE

    Authors: Joan Bruna, Oded Regev, Min Jae Song, Yi Tang

    Abstract: We introduce a continuous analogue of the Learning with Errors (LWE) problem, which we name CLWE. We give a polynomial-time quantum reduction from worst-case lattice problems to CLWE, showing that CLWE enjoys similar hardness guarantees to those of LWE. Alternatively, our result can also be seen as opening new avenues of (quantum) attacks on lattice problems. Our work resolves an open problem rega…

    Submitted 24 October, 2020; v1 submitted 19 May, 2020; originally announced May 2020.

    Comments: 29 pages

  33. arXiv:2004.00422  [pdf, other]

    math.OC cs.LG stat.ML

    A General Large Neighborhood Search Framework for Solving Integer Linear Programs

    Authors: Jialin Song, Ravi Lanka, Yisong Yue, Bistra Dilkina

    Abstract: This paper studies a strategy for data-driven algorithm design for large-scale combinatorial optimization problems that can leverage existing state-of-the-art solvers in general purpose ways. The goal is to arrive at new approaches that can reliably outperform existing solvers in wall-clock time. We focus on solving integer programs, and ground our approach in the large neighborhood search (LNS) p…

    Submitted 22 December, 2020; v1 submitted 29 March, 2020; originally announced April 2020.

    Comments: NeurIPS 2020

  34. arXiv:2003.03463  [pdf, other]

    cs.LG stat.ML

    Training Deep Energy-Based Models with f-Divergence Minimization

    Authors: Lantao Yu, Yang Song, Jiaming Song, Stefano Ermon

    Abstract: Deep energy-based models (EBMs) are very flexible in distribution parametrization but computationally challenging because of the intractable partition function. They are typically trained via maximum likelihood, using contrastive divergence to approximate the gradient of the KL divergence between data and model distribution. While KL divergence has many desirable properties, other f-divergences ha…

    Submitted 20 July, 2020; v1 submitted 6 March, 2020; originally announced March 2020.

    Comments: ICML 2020

  35. arXiv:2003.02205  [pdf, other]

    stat.ML cs.LG stat.AP stat.ME

    Probabilistic Performance-Pattern Decomposition (PPPD): analysis framework and applications to stochastic mechanical systems

    Authors: Ziqi Wang, Marco Broccardo, Junho Song

    Abstract: Since the early 1900s, numerous research efforts have been devoted to developing quantitative solutions to stochastic mechanical systems. In general, the problem is perceived as solved when a complete or partial probabilistic description on the quantity of interest (QoI) is determined. However, in the presence of complex system behavior, there is a critical need to go beyond mere probabilistic des…

    Submitted 4 March, 2020; originally announced March 2020.

    Comments: Autoencoder, clustering, diffusion map, manifold learning, Monte Carlo simulation, pattern recognition, stochastic dynamics, uncertainty quantification. 44 Pages

  36. arXiv:2003.01941  [pdf, other]

    cs.LG cs.CV stat.ML

    Gaussianization Flows

    Authors: Chenlin Meng, Yang Song, Jiaming Song, Stefano Ermon

    Abstract: Iterative Gaussianization is a fixed-point iteration procedure that can transform any continuous random vector into a Gaussian one. Based on iterative Gaussianization, we propose a new type of normalizing flow model that enables both efficient computation of likelihoods and efficient inversion for sample generation. We demonstrate that these models, named Gaussianization flows, are universal appro…

    Submitted 4 March, 2020; originally announced March 2020.

    Comments: AISTATS 2020

  37. arXiv:2003.00638  [pdf, other]

    cs.LG stat.ML

    Permutation Invariant Graph Generation via Score-Based Generative Modeling

    Authors: Chenhao Niu, Yang Song, Jiaming Song, Shengjia Zhao, Aditya Grover, Stefano Ermon

    Abstract: Learning generative models for graph-structured data is challenging because graphs are discrete, combinatorial, and the underlying data distribution is invariant to the ordering of nodes. However, most of the existing generative models for graphs are not invariant to the chosen ordering, which might lead to an undesirable bias in the learned distribution. To address this difficulty, we propose a p…

    Submitted 1 March, 2020; originally announced March 2020.

    Comments: 14 pages, AISTATS 2020

  38. arXiv:2002.10689  [pdf, other]

    cs.LG stat.ML

    A Theory of Usable Information Under Computational Constraints

    Authors: Yilun Xu, Shengjia Zhao, Jiaming Song, Russell Stewart, Stefano Ermon

    Abstract: We propose a new framework for reasoning about information in complex systems. Our foundation is based on a variational extension of Shannon's information theory that takes into account the modeling power and computational constraints of the observer. The resulting \emph{predictive $\mathcal{V}$-information} encompasses mutual information and other notions of informativeness such as the coefficien…

    Submitted 25 February, 2020; originally announced February 2020.

    Comments: ICLR 2020 (Talk)

  39. arXiv:2002.09847  [pdf, other]

    eess.IV cs.CV cs.LG stat.ML

    Unsupervised Denoising for Satellite Imagery using Wavelet Subband CycleGAN

    Authors: Joonyoung Song, Jae-Heon Jeong, Dae-Soon Park, Hyun-Ho Kim, Doo-Chun Seo, Jong Chul Ye

    Abstract: Multi-spectral satellite imaging sensors acquire various spectral band images such as red (R), green (G), blue (B), near-infrared (N), etc. Thanks to the unique spectroscopic property of each spectral band with respect to the objects on the ground, multi-spectral satellite imagery can be used for various geological survey applications. Unfortunately, image artifacts from imaging sensor noises o…

    Submitted 23 February, 2020; originally announced February 2020.

  40. arXiv:2002.04997  [pdf, ps, other]

    cs.LG stat.ML

    PCNN: Pattern-based Fine-Grained Regular Pruning towards Optimizing CNN Accelerators

    Authors: Zhanhong Tan, Jiebo Song, Xiaolong Ma, Sia-Huat Tan, Hongyang Chen, Yuanqing Miao, Yifu Wu, Shaokai Ye, Yanzhi Wang, Dehui Li, Kaisheng Ma

    Abstract: Weight pruning is a powerful technique to realize model compression. We propose PCNN, a fine-grained regular 1D pruning method. A novel index format called Sparsity Pattern Mask (SPM) is presented to encode the sparsity in PCNN. Leveraging SPM with limited pruning patterns and non-zero sequences with equal length, PCNN can be efficiently employed in hardware. Evaluated on VGG-16 and ResNet-18, our…

    Submitted 14 June, 2020; v1 submitted 11 February, 2020; originally announced February 2020.

    Comments: 6 pages, DAC 2020 accepted paper

  41. arXiv:1912.11006  [pdf, other]

    cs.LG cs.CV stat.ML

    Data-Free Adversarial Distillation

    Authors: Gongfan Fang, Jie Song, Chengchao Shen, Xinchao Wang, Da Chen, Mingli Song

    Abstract: Knowledge Distillation (KD) has made remarkable progress in the last few years and become a popular paradigm for model compression and knowledge transfer. However, almost all existing KD algorithms are data-driven, i.e., relying on a large amount of original training data or alternative data, which is usually unavailable in real-world scenarios. In this paper, we devote ourselves to this challengi…

    Submitted 2 March, 2020; v1 submitted 23 December, 2019; originally announced December 2019.

  42. arXiv:1911.09274  [pdf, other]

    stat.AP stat.CO stat.ME

    Computer Model Emulation with High-Dimensional Functional Output in Large-Scale Observing System Uncertainty Experiments

    Authors: Pulong Ma, Anirban Mondal, Bledar Konomi, Jonathan Hobbs, Joon Song, Emily Kang

    Abstract: Observing system uncertainty experiments (OSUEs) have been recently proposed as a cost-effective way to perform probabilistic assessment of retrievals for NASA's Orbiting Carbon Observatory-2 (OCO-2) mission. One important component in the OCO-2 retrieval algorithm is a full-physics forward model that describes the mathematical relationship between atmospheric variables such as carbon dioxide and…

    Submitted 2 November, 2020; v1 submitted 20 November, 2019; originally announced November 2019.

    Comments: 45 pages

  43. arXiv:1910.09779  [pdf, other]

    cs.LG stat.ML

    Bridging the Gap Between $f$-GANs and Wasserstein GANs

    Authors: Jiaming Song, Stefano Ermon

    Abstract: Generative adversarial networks (GANs) have enjoyed much success in learning high-dimensional distributions. Learning objectives approximately minimize an $f$-divergence ($f$-GANs) or an integral probability metric (Wasserstein GANs) between the model and the data distribution using a discriminator. Wasserstein GANs enjoy superior empirical performance, but in $f$-GANs the discriminator can be int…

    Submitted 17 June, 2020; v1 submitted 22 October, 2019; originally announced October 2019.

    Comments: updated for ICML camera ready version

  44. arXiv:1910.09115  [pdf, other]

    cs.LG stat.ML

    Unsupervised Out-of-Distribution Detection with Batch Normalization

    Authors: Jiaming Song, Yang Song, Stefano Ermon

    Abstract: Likelihood from a generative model is a natural statistic for detecting out-of-distribution (OoD) samples. However, generative models have been shown to assign higher likelihood to OoD samples compared to ones from the training distribution, preventing simple threshold-based detection rules. We demonstrate that OoD detection fails even when using more sophisticated statistics based on the likeliho…

    Submitted 20 October, 2019; originally announced October 2019.

  45. arXiv:1910.06222  [pdf, other]

    cs.LG cs.IT stat.ML

    Understanding the Limitations of Variational Mutual Information Estimators

    Authors: Jiaming Song, Stefano Ermon

    Abstract: Variational approaches based on neural networks are showing promise for estimating mutual information (MI) between high dimensional variables. However, they can be difficult to use in practice due to poorly understood bias/variance tradeoffs. We theoretically show that, under some conditions, estimators such as MINE exhibit variance that could grow exponentially with the true amount of underlying…

    Submitted 24 March, 2020; v1 submitted 14 October, 2019; originally announced October 2019.

    Comments: Fixed some typos, credit to Yilun Xu

  46. arXiv:1910.00105  [pdf, other]

    cs.LG cs.AI stat.ML

    Domain Adaptive Imitation Learning

    Authors: Kuno Kim, Yihong Gu, Jiaming Song, Shengjia Zhao, Stefano Ermon

    Abstract: We study the question of how to imitate tasks across domains with discrepancies such as embodiment, viewpoint, and dynamics mismatch. Many prior works require paired, aligned demonstrations and an additional RL step that requires environment interactions. However, paired, aligned demonstrations are seldom obtainable and RL procedures are expensive. We formalize the Domain Adaptive Imitation Learni…

    Submitted 18 July, 2020; v1 submitted 30 September, 2019; originally announced October 2019.

    Comments: ICML 2020

  47. arXiv:1908.11466  [pdf, ps, other]

    stat.ME

    A robust approach for testing parameter change in Poisson autoregressive models

    Authors: Jiwon Kang, Junmo Song

    Abstract: Parameter change test has been an important issue in time series analysis. The problem has also been actively explored in the field of integer-valued time series, but the testing in the presence of outliers has not yet been extensively investigated. This study considers the problem of testing for parameter change in Poisson autoregressive models particularly when observations are contaminated by o…

    Submitted 29 August, 2019; originally announced August 2019.

    Comments: 15 pages

  48. arXiv:1908.07307  [pdf, other]

    cs.LG eess.SP stat.ML

    Investigation of wind pressures on tall building under interference effects using machine learning techniques

    Authors: Gang Hu, Lingbo Liu, Dacheng Tao, Jie Song, K. C. S. Kwok

    Abstract: Interference effects of tall buildings have attracted numerous studies due to the boom of clusters of tall buildings in megacities. To fully understand the interference effects of buildings, it often requires a substantial amount of wind tunnel tests. Limited wind tunnel tests that only cover part of interference scenarios are unable to fully reveal the interference effects. This study used machin…

    Submitted 20 August, 2019; originally announced August 2019.

    Comments: 15 pages, 14 figures

  49. arXiv:1907.13220  [pdf, other]

    cs.LG stat.ML

    Multi-Agent Adversarial Inverse Reinforcement Learning

    Authors: Lantao Yu, Jiaming Song, Stefano Ermon

    Abstract: Reinforcement learning agents are prone to undesired behaviors due to reward mis-specification. Finding a set of reward functions to properly guide agent behaviors is particularly challenging in multi-agent scenarios. Inverse reinforcement learning provides a framework to automatically acquire suitable reward functions from expert demonstrations. Its extension to multi-agent settings, however, is…

    Submitted 30 July, 2019; originally announced July 2019.

    Comments: ICML 2019

  50. arXiv:1907.04484  [pdf, other]

    cs.LG cs.AI stat.ML

    Co-training for Policy Learning

    Authors: Jialin Song, Ravi Lanka, Yisong Yue, Masahiro Ono

    Abstract: We study the problem of learning sequential decision-making policies in settings with multiple state-action representations. Such settings naturally arise in many domains, such as planning (e.g., multiple integer programming formulations) and various combinatorial optimization problems (e.g., those with both integer programming and graph-based formulations). Inspired by the classical co-training f…

    Submitted 2 July, 2019; originally announced July 2019.

    Comments: UAI 2019, oral presentation