-
Maximum Entropy Inverse Reinforcement Learning of Diffusion Models with Energy-Based Models
Authors:
Sangwoong Yoon,
Himchan Hwang,
Dohyun Kwon,
Yung-Kyun Noh,
Frank C. Park
Abstract:
We present a maximum entropy inverse reinforcement learning (IRL) approach for improving the sample quality of diffusion generative models, especially when the number of generation time steps is small. Similar to how IRL trains a policy based on the reward function learned from expert demonstrations, we train (or fine-tune) a diffusion model using the log probability density estimated from training data. Since we employ an energy-based model (EBM) to represent the log density, our approach boils down to the joint training of a diffusion model and an EBM. Our IRL formulation, named Diffusion by Maximum Entropy IRL (DxMI), is a minimax problem that reaches equilibrium when both models converge to the data distribution. The entropy maximization plays a key role in DxMI, facilitating the exploration of the diffusion model and ensuring the convergence of the EBM. We also propose Diffusion by Dynamic Programming (DxDP), a novel reinforcement learning algorithm for diffusion models, as a subroutine in DxMI. DxDP makes the diffusion model update in DxMI efficient by transforming the original problem into an optimal control formulation where value functions replace back-propagation in time. Our empirical studies show that diffusion models fine-tuned using DxMI can generate high-quality samples in as few as 4 and 10 steps. Additionally, DxMI enables the training of an EBM without MCMC, stabilizing EBM training dynamics and enhancing anomaly detection performance.
Submitted 30 June, 2024;
originally announced July 2024.
-
Generalized Contrastive Divergence: Joint Training of Energy-Based Model and Diffusion Model through Inverse Reinforcement Learning
Authors:
Sangwoong Yoon,
Dohyun Kwon,
Himchan Hwang,
Yung-Kyun Noh,
Frank C. Park
Abstract:
We present Generalized Contrastive Divergence (GCD), a novel objective function for training an energy-based model (EBM) and a sampler simultaneously. GCD generalizes Contrastive Divergence (Hinton, 2002), a celebrated algorithm for training EBM, by replacing Markov Chain Monte Carlo (MCMC) distribution with a trainable sampler, such as a diffusion model. In GCD, the joint training of EBM and a diffusion model is formulated as a minimax problem, which reaches an equilibrium when both models converge to the data distribution. The minimax learning with GCD bears interesting equivalence to inverse reinforcement learning, where the energy corresponds to a negative reward, the diffusion model is a policy, and the real data is expert demonstrations. We present preliminary yet promising results showing that joint training is beneficial for both EBM and a diffusion model. GCD enables EBM training without MCMC while improving the sample quality of a diffusion model.
Submitted 6 December, 2023;
originally announced December 2023.
-
Variational Weighting for Kernel Density Ratios
Authors:
Sangwoong Yoon,
Frank C. Park,
Gunsu S Yun,
Iljung Kim,
Yung-Kyun Noh
Abstract:
Kernel density estimation (KDE) is integral to a range of generative and discriminative tasks in machine learning. Drawing upon tools from the multidimensional calculus of variations, we derive an optimal weight function that reduces bias in standard kernel density estimates for density ratios, leading to improved estimates of prediction posteriors and information-theoretic measures. In the process, we shed light on some fundamental aspects of density estimation, particularly from the perspective of algorithms that employ KDEs as their main building blocks.
Submitted 6 November, 2023;
originally announced November 2023.
-
Energy-Based Models for Anomaly Detection: A Manifold Diffusion Recovery Approach
Authors:
Sangwoong Yoon,
Young-Uk **,
Yung-Kyun Noh,
Frank C. Park
Abstract:
We present a new method of training energy-based models (EBMs) for anomaly detection that leverages low-dimensional structures within data. The proposed algorithm, Manifold Projection-Diffusion Recovery (MPDR), first perturbs a data point along a low-dimensional manifold that approximates the training dataset. Then, EBM is trained to maximize the probability of recovering the original data. The training involves the generation of negative samples via MCMC, as in conventional EBM training, but from a different distribution concentrated near the manifold. The resulting near-manifold negative samples are highly informative, reflecting relevant modes of variation in data. An energy function of MPDR effectively learns accurate boundaries of the training data distribution and excels at detecting out-of-distribution samples. Experimental results show that MPDR exhibits strong performance across various anomaly detection tasks involving diverse data types, such as images, vectors, and acoustic signals.
Submitted 28 October, 2023;
originally announced October 2023.
-
On Explicit Curvature Regularization in Deep Generative Models
Authors:
Yonghyeon Lee,
Frank Chongwoo Park
Abstract:
We propose a family of curvature-based regularization terms for deep generative model learning. Explicit coordinate-invariant formulas for both intrinsic and extrinsic curvature measures are derived for the case of arbitrary data manifolds embedded in higher-dimensional Euclidean space. Because computing the curvature is a highly computation-intensive process involving the evaluation of second-order derivatives, efficient formulas are derived for approximately evaluating intrinsic and extrinsic curvatures. Comparative studies are conducted that compare the relative efficacy of intrinsic versus extrinsic curvature-based regularization measures, as well as performance comparisons against existing autoencoder training methods. Experiments involving noisy motion capture data confirm that curvature-based methods outperform existing autoencoder regularization methods, with intrinsic curvature measures slightly more effective than extrinsic curvature measures.
Submitted 18 September, 2023;
originally announced September 2023.
-
Evaluating Out-of-Distribution Detectors Through Adversarial Generation of Outliers
Authors:
Sangwoong Yoon,
**won Choi,
Yonghyeon Lee,
Yung-Kyun Noh,
Frank Chongwoo Park
Abstract:
A reliable evaluation method is essential for building a robust out-of-distribution (OOD) detector. Current robustness evaluation protocols for OOD detectors rely on injecting perturbations to outlier data. However, the perturbations are unlikely to occur naturally or are not relevant to the content of the data, providing a limited assessment of robustness. In this paper, we propose Evaluation-via-Generation for OOD detectors (EvG), a new protocol for investigating the robustness of OOD detectors under more realistic modes of variation in outliers. EvG utilizes a generative model to synthesize plausible outliers, and employs MCMC sampling to find outliers misclassified as in-distribution with the highest confidence by a detector. We perform a comprehensive benchmark comparison of the performance of state-of-the-art OOD detectors using EvG, uncovering previously overlooked weaknesses.
Submitted 20 August, 2022;
originally announced August 2022.
-
Autoencoding Under Normalization Constraints
Authors:
Sangwoong Yoon,
Yung-Kyun Noh,
Frank Chongwoo Park
Abstract:
Likelihood is a standard estimate for outlier detection. The specific role of the normalization constraint is to ensure that the out-of-distribution (OOD) regime has a small likelihood when samples are learned using maximum likelihood. Because autoencoders do not possess such a process of normalization, they often fail to recognize outliers even when they are obviously OOD. We propose the Normalized Autoencoder (NAE), a normalized probabilistic model constructed from an autoencoder. The probability density of NAE is defined using the reconstruction error of an autoencoder, which differs from the definition used in conventional energy-based models. In our model, normalization is enforced by suppressing the reconstruction of negative samples, significantly improving the outlier detection performance. Our experimental results confirm the efficacy of NAE, both in detecting outliers and in generating in-distribution samples.
Submitted 15 June, 2023; v1 submitted 12 May, 2021;
originally announced May 2021.
-
On the Existence and Computation of Minimum Attention Optimal Control Laws
Authors:
Pilhwa Lee,
F. C. Park
Abstract:
Brockett's minimum attention functional \cite{Brockett} has been proposed as one means of capturing the cost of control implementation--regarded here as the rate of change of the control with respect to both state and time--for general nonlinear control systems, with applications ranging from human motor control to robotics. The main challenge in forging the minimum attention paradigm into a practical control design methodology is that the existence of solutions is not always assured, and finding numerical solutions is also difficult. In this paper we prove that, under the assumption of a control that is the sum of a time-varying feedforward term and a time-varying feedback term linear in the state, existence of a solution can be guaranteed. Under these assumptions we appeal to the Liouville equation representation of a nonlinear control system and derive the associated first-order optimality conditions. The one-shot method is then used to prove the existence of a solution and also to iteratively compute a solution. Our methodology is illustrated with an example involving a two degree-of-freedom robot arm.
Submitted 31 July, 2021; v1 submitted 22 November, 2019;
originally announced November 2019.
-
A Linear-Time Variational Integrator for Multibody Systems
Authors:
Jeongseok Lee,
C. Karen Liu,
Frank C. Park,
Siddhartha S. Srinivasa
Abstract:
We present an efficient variational integrator for multibody systems. Variational integrators reformulate the equations of motion for multibody systems as discrete Euler-Lagrange (DEL) equations, transforming forward integration into a root-finding problem for the DEL equations. Variational integrators have been shown to be more robust and accurate in preserving fundamental properties of systems, such as momentum and energy, than many frequently used numerical integrators. However, state-of-the-art algorithms suffer from $O(n^3)$ complexity, which is prohibitive for articulated multibody systems with a large number of degrees of freedom, $n$, in generalized coordinates. Our key contribution is to derive a recursive algorithm that evaluates the DEL equations in $O(n)$, which scales up well for complex multibody systems such as humanoid robots. Inspired by the recursive Newton-Euler algorithm, our key insight is to formulate the DEL equation individually for each body rather than for the entire system. Furthermore, we introduce a new quasi-Newton method that exploits the impulse-based dynamics algorithm, which is also $O(n)$, to avoid the expensive Jacobian inversion in solving the DEL equations. We demonstrate scalability and efficiency, as well as extensibility to holonomic constraints, through several case studies.
Submitted 5 February, 2018; v1 submitted 9 September, 2016;
originally announced September 2016.
-
Optimising Credit Portfolio Using a Quadratic Nonlinear Projection Method
Authors:
Boguk Kim,
Chulwoo Han,
Frank Chongwoo Park
Abstract:
A novel optimisation framework through quadratic nonlinear projection is introduced for credit portfolios when the portfolio risk is measured by Conditional Value-at-Risk (CVaR). The whole optimisation procedure to search toward the optimal portfolio state is conducted by a series of single-step optimisations under the local constraints described in the multi-dimensional constraint parameter space as functions of the total amount of portfolio adjustment. Each single-step optimisation is approximated by the first-order variation of the weight increments with respect to the total amount of portfolio adjustment and is solved in the form of a locally exact formula formulated in the general Lagrange multiplier method. Our method can deal with optimisation for general nonlinear objective functions, such as the return-to-risk ratio maximisation or the diversification index, as well as the risk minimisation or the return maximisation.
Submitted 19 July, 2016; v1 submitted 10 November, 2014;
originally announced November 2014.