Maximum Entropy Inverse Reinforcement Learning of Diffusion Models with Energy-Based Models

Authors: Sangwoong Yoon, Himchan Hwang, Dohyun Kwon, Yung-Kyun Noh, Frank C. Park

Abstract: We present a maximum entropy inverse reinforcement learning (IRL) approach for improving the sample quality of diffusion generative models, especially when the number of generation time steps is small. Similar to how IRL trains a policy based on the reward function learned from expert demonstrations, we train (or fine-tune) a diffusion model using the log probability density estimated from training data. Since we employ an energy-based model (EBM) to represent the log density, our approach boils down to the joint training of a diffusion model and an EBM. Our IRL formulation, named Diffusion by Maximum Entropy IRL (DxMI), is a minimax problem that reaches equilibrium when both models converge to the data distribution. The entropy maximization plays a key role in DxMI, facilitating the exploration of the diffusion model and ensuring the convergence of the EBM. We also propose Diffusion by Dynamic Programming (DxDP), a novel reinforcement learning algorithm for diffusion models, as a subroutine in DxMI. DxDP makes the diffusion model update in DxMI efficient by transforming the original problem into an optimal control formulation where value functions replace back-propagation in time. Our empirical studies show that diffusion models fine-tuned using DxMI can generate high-quality samples in as few as 4 and 10 steps. Additionally, DxMI enables the training of an EBM without MCMC, stabilizing EBM training dynamics and enhancing anomaly detection performance.

Submitted 30 June, 2024; originally announced July 2024.

Comments: Code is released at https://github.com/swyoon/Diffusion-by-MaxEntIRL

arXiv:2312.03397

Generalized Contrastive Divergence: Joint Training of Energy-Based Model and Diffusion Model through Inverse Reinforcement Learning

Authors: Sangwoong Yoon, Dohyun Kwon, Himchan Hwang, Yung-Kyun Noh, Frank C. Park

Abstract: We present Generalized Contrastive Divergence (GCD), a novel objective function for training an energy-based model (EBM) and a sampler simultaneously. GCD generalizes Contrastive Divergence (Hinton, 2002), a celebrated algorithm for training EBM, by replacing Markov Chain Monte Carlo (MCMC) distribution with a trainable sampler, such as a diffusion model. In GCD, the joint training of EBM and a diffusion model is formulated as a minimax problem, which reaches an equilibrium when both models converge to the data distribution. The minimax learning with GCD bears interesting equivalence to inverse reinforcement learning, where the energy corresponds to a negative reward, the diffusion model is a policy, and the real data is expert demonstrations. We present preliminary yet promising results showing that joint training is beneficial for both EBM and a diffusion model. GCD enables EBM training without MCMC while improving the sample quality of a diffusion model.

Submitted 6 December, 2023; originally announced December 2023.

Comments: NeurIPS 2023 Workshop on Diffusion Models

arXiv:2311.03001

Variational Weighting for Kernel Density Ratios

Authors: Sangwoong Yoon, Frank C. Park, Gunsu S Yun, Iljung Kim, Yung-Kyun Noh

Abstract: Kernel density estimation (KDE) is integral to a range of generative and discriminative tasks in machine learning. Drawing upon tools from the multidimensional calculus of variations, we derive an optimal weight function that reduces bias in standard kernel density estimates for density ratios, leading to improved estimates of prediction posteriors and information-theoretic measures. In the process, we shed light on some fundamental aspects of density estimation, particularly from the perspective of algorithms that employ KDEs as their main building blocks.

Submitted 6 November, 2023; originally announced November 2023.

Comments: NeurIPS 2023

arXiv:2310.18677

Energy-Based Models for Anomaly Detection: A Manifold Diffusion Recovery Approach

Authors: Sangwoong Yoon, Young-Uk Jin, Yung-Kyun Noh, Frank C. Park

Abstract: We present a new method of training energy-based models (EBMs) for anomaly detection that leverages low-dimensional structures within data. The proposed algorithm, Manifold Projection-Diffusion Recovery (MPDR), first perturbs a data point along a low-dimensional manifold that approximates the training dataset. Then, EBM is trained to maximize the probability of recovering the original data. The training involves the generation of negative samples via MCMC, as in conventional EBM training, but from a different distribution concentrated near the manifold. The resulting near-manifold negative samples are highly informative, reflecting relevant modes of variation in data. An energy function of MPDR effectively learns accurate boundaries of the training data distribution and excels at detecting out-of-distribution samples. Experimental results show that MPDR exhibits strong performance across various anomaly detection tasks involving diverse data types, such as images, vectors, and acoustic signals.

Submitted 28 October, 2023; originally announced October 2023.

Comments: NeurIPS 2023

arXiv:2309.10237

On Explicit Curvature Regularization in Deep Generative Models

Authors: Yonghyeon Lee, Frank Chongwoo Park

Abstract: We propose a family of curvature-based regularization terms for deep generative model learning. Explicit coordinate-invariant formulas for both intrinsic and extrinsic curvature measures are derived for the case of arbitrary data manifolds embedded in higher-dimensional Euclidean space. Because computing the curvature is a highly computation-intensive process involving the evaluation of second-order derivatives, efficient formulas are derived for approximately evaluating intrinsic and extrinsic curvatures. Comparative studies are conducted that compare the relative efficacy of intrinsic versus extrinsic curvature-based regularization measures, as well as performance comparisons against existing autoencoder training methods. Experiments involving noisy motion capture data confirm that curvature-based methods outperform existing autoencoder regularization methods, with intrinsic curvature measures slightly more effective than extrinsic curvature measures.

Submitted 18 September, 2023; originally announced September 2023.

Comments: 2nd Annual Workshop on Topology, Algebra, and Geometry in Machine Learning (TAG-ML) at the ICML 2023

arXiv:2208.10940

Evaluating Out-of-Distribution Detectors Through Adversarial Generation of Outliers

Authors: Sangwoong Yoon, Jinwon Choi, Yonghyeon Lee, Yung-Kyun Noh, Frank Chongwoo Park

Abstract: A reliable evaluation method is essential for building a robust out-of-distribution (OOD) detector. Current robustness evaluation protocols for OOD detectors rely on injecting perturbations to outlier data. However, the perturbations are unlikely to occur naturally or not relevant to the content of data, providing a limited assessment of robustness. In this paper, we propose Evaluation-via-Generation for OOD detectors (EvG), a new protocol for investigating the robustness of OOD detectors under more realistic modes of variation in outliers. EvG utilizes a generative model to synthesize plausible outliers, and employs MCMC sampling to find outliers misclassified as in-distribution with the highest confidence by a detector. We perform a comprehensive benchmark comparison of the performance of state-of-the-art OOD detectors using EvG, uncovering previously overlooked weaknesses.

Submitted 20 August, 2022; originally announced August 2022.

Comments: Code release can be found at https://github.com/EvG-OOD/evaluation-via-generation

arXiv:2105.05735

Autoencoding Under Normalization Constraints

Authors: Sangwoong Yoon, Yung-Kyun Noh, Frank Chongwoo Park

Abstract: Likelihood is a standard estimate for outlier detection. The specific role of the normalization constraint is to ensure that the out-of-distribution (OOD) regime has a small likelihood when samples are learned using maximum likelihood. Because autoencoders do not possess such a process of normalization, they often fail to recognize outliers even when they are obviously OOD. We propose the Normalized Autoencoder (NAE), a normalized probabilistic model constructed from an autoencoder. The probability density of NAE is defined using the reconstruction error of an autoencoder, which is differently defined in the conventional energy-based model. In our model, normalization is enforced by suppressing the reconstruction of negative samples, significantly improving the outlier detection performance. Our experimental results confirm the efficacy of NAE, both in detecting outliers and in generating in-distribution samples.

Submitted 15 June, 2023; v1 submitted 12 May, 2021; originally announced May 2021.

Comments: Accepted to ICML 2021. The code is released in https://github.com/swyoon/normalized-autoencoders . The interactive web demo on outlier reconstruction phenomenon and normalized autoencoders can be found in https://swyoon.github.io/outlier-reconstruction

arXiv:1911.10135

doi 10.1109/TAC.2021.3087559

On the Existence and Computation of Minimum Attention Optimal Control Laws

Authors: Pilhwa Lee, F. C. Park

Abstract: Brockett's minimum attention functional \cite{Brockett} has been proposed as one means of capturing the cost of control implementation--regarded here as the rate of change of the control with respect to both state and time--for general nonlinear control systems, with applications ranging from human motor control to robotics. The main challenge in forging the minimum attention paradigm into a practical control design methodology is that the existence of solutions is not always assured, and finding numerical solutions is also difficult. In this paper we prove that, under the assumption of a control that is the sum of a time-varying feedforward term and a time-varying feedback term linear in the state, existence of a solution can be guaranteed. Under these assumptions we appeal to the Liouville equation representation of a nonlinear control system and derive the associated first-order optimality conditions. The one-shot method is then used to prove the existence of a solution and also to iteratively compute a solution. Our methodology is illustrated with an example involving a two degree-of-freedom robot arm.

Submitted 31 July, 2021; v1 submitted 22 November, 2019; originally announced November 2019.

Comments: 6 pages, 4 figures

arXiv:1609.02898

A Linear-Time Variational Integrator for Multibody Systems

Authors: Jeongseok Lee, C. Karen Liu, Frank C. Park, Siddhartha S. Srinivasa

Abstract: We present an efficient variational integrator for multibody systems. Variational integrators reformulate the equations of motion for multibody systems as discrete Euler-Lagrange (DEL) equations, transforming forward integration into a root-finding problem for the DEL equations. Variational integrators have been shown to be more robust and accurate in preserving fundamental properties of systems, such as momentum and energy, than many frequently used numerical integrators. However, state-of-the-art algorithms suffer from $O(n^3)$ complexity, which is prohibitive for articulated multibody systems with a large number of degrees of freedom, $n$, in generalized coordinates. Our key contribution is to derive a recursive algorithm that evaluates DEL equations in $O(n)$, which scales up well for complex multibody systems such as humanoid robots. Inspired by recursive Newton-Euler algorithm, our key insight is to formulate DEL equation individually for each body rather than for the entire system. Furthermore, we introduce a new quasi-Newton method that exploits the impulse-based dynamics algorithm, which is also $O(n)$, to avoid the expensive Jacobian inversion in solving DEL equations. We demonstrate scalability and efficiency, as well as extensibility to holonomic constraints through several case studies.

Submitted 5 February, 2018; v1 submitted 9 September, 2016; originally announced September 2016.

Comments: Submitted to the International Workshop on the Algorithmic Foundations of Robotics (2016)

arXiv:1411.2525

Optimising Credit Portfolio Using a Quadratic Nonlinear Projection Method

Authors: Boguk Kim, Chulwoo Han, Frank Chongwoo Park

Abstract: A novel optimisation framework through quadratic nonlinear projection is introduced for credit portfolio when the portfolio risk is measured by Conditional Value-at-Risk (CVaR). The whole optimisation procedure to search toward the optimal portfolio state is conducted by a series of single-step optimisations under the local constraints described in the multi-dimensional constraint parameter space as functions of the total amount of portfolio adjustment. Each single-step optimisation is approximated by the first-order variation of the weight increments with respect to the total amount of portfolio adjustment and is solved in the form of locally exact formula formulated in the general Lagrange multiplier method. Our method can deal with optimisation for general nonlinear objective functions, such as the return-to-risk ratio maximisation or the diversification index, as well as the risk minimisation or the return maximisation.

Submitted 19 July, 2016; v1 submitted 10 November, 2014; originally announced November 2014.

MSC Class: 65K10; 90C55

ACM Class: G.1.6

Showing 1–10 of 10 results for author: Park, F C