Search | arXiv e-print repository

Symmetric Equilibrium Learning of VAEs

Authors: Boris Flach, Dmitrij Schlesinger, Alexander Shekhovtsov

Abstract: We view variational autoencoders (VAE) as decoder-encoder pairs, which map distributions in the data space to distributions in the latent space and vice versa. The standard learning approach for VAEs is the maximisation of the evidence lower bound (ELBO). It is asymmetric in that it aims at learning a latent variable model while using the encoder as an auxiliary means only. Moreover, it requires a… ▽ More We view variational autoencoders (VAE) as decoder-encoder pairs, which map distributions in the data space to distributions in the latent space and vice versa. The standard learning approach for VAEs is the maximisation of the evidence lower bound (ELBO). It is asymmetric in that it aims at learning a latent variable model while using the encoder as an auxiliary means only. Moreover, it requires a closed form a-priori latent distribution. This limits its applicability in more complex scenarios, such as general semi-supervised learning and employing complex generative models as priors. We propose a Nash equilibrium learning approach, which is symmetric with respect to the encoder and decoder and allows learning VAEs in situations where both the data and the latent distributions are accessible only by sampling. The flexibility and simplicity of this approach allows its application to a wide range of learning scenarios and downstream tasks. △ Less

Submitted 12 March, 2024; v1 submitted 19 July, 2023; originally announced July 2023.

Comments: 13 pages, 6 figures, accepted for AISTATS 2024

ACM Class: I.2.6

arXiv:2207.05727 [pdf, other]

Enhancing Fairness of Visual Attribute Predictors

Authors: Tobias Hänel, Nishant Kumar, Dmitrij Schlesinger, Mengze Li, Erdem Ünal, Abouzar Eslami, Stefan Gumhold

Abstract: The performance of deep neural networks for image recognition tasks such as predicting a smiling face is known to degrade with under-represented classes of sensitive attributes. We address this problem by introducing fairness-aware regularization losses based on batch estimates of Demographic Parity, Equalized Odds, and a novel Intersection-over-Union measure. The experiments performed on facial a… ▽ More The performance of deep neural networks for image recognition tasks such as predicting a smiling face is known to degrade with under-represented classes of sensitive attributes. We address this problem by introducing fairness-aware regularization losses based on batch estimates of Demographic Parity, Equalized Odds, and a novel Intersection-over-Union measure. The experiments performed on facial and medical images from CelebA, UTKFace, and the SIIM-ISIC melanoma classification challenge show the effectiveness of our proposed fairness losses for bias mitigation as they improve model fairness while maintaining high classification performance. To the best of our knowledge, our work is the first attempt to incorporate these types of losses in an end-to-end training scheme for mitigating biases of visual attribute predictors. Our code is available at https://github.com/nish03/FVAP. △ Less

Submitted 1 October, 2022; v1 submitted 7 July, 2022; originally announced July 2022.

Comments: Camera Ready, ACCV 2022

arXiv:2102.09310 [pdf, other]

VAE Approximation Error: ELBO and Exponential Families

Authors: Alexander Shekhovtsov, Dmitrij Schlesinger, Boris Flach

Abstract: The importance of Variational Autoencoders reaches far beyond standalone generative models -- the approach is also used for learning latent representations and can be generalized to semi-supervised learning. This requires a thorough analysis of their commonly known shortcomings: posterior collapse and approximation errors. This paper analyzes VAE approximation errors caused by the combination of t… ▽ More The importance of Variational Autoencoders reaches far beyond standalone generative models -- the approach is also used for learning latent representations and can be generalized to semi-supervised learning. This requires a thorough analysis of their commonly known shortcomings: posterior collapse and approximation errors. This paper analyzes VAE approximation errors caused by the combination of the ELBO objective and encoder models from conditional exponential families, including, but not limited to, commonly used conditionally independent discrete and continuous models. We characterize subclasses of generative models consistent with these encoder families. We show that the ELBO optimizer is pulled away from the likelihood optimizer towards the consistent subset and study this effect experimentally. Importantly, this subset can not be enlarged, and the respective error cannot be decreased, by considering deeper encoder/decoder networks. △ Less

Submitted 11 April, 2022; v1 submitted 18 February, 2021; originally announced February 2021.

Comments: ICLR 2022 spotlight

arXiv:1710.11052 [pdf, other]

A Connection between Feed-Forward Neural Networks and Probabilistic Graphical Models

Authors: Dmitrij Schlesinger

Abstract: Two of the most popular modelling paradigms in computer vision are feed-forward neural networks (FFNs) and probabilistic graphical models (GMs). Various connections between the two have been studied in recent works, such as e.g. expressing mean-field based inference in a GM as an FFN. This paper establishes a new connection between FFNs and GMs. Our key observation is that any FFN implements a cer… ▽ More Two of the most popular modelling paradigms in computer vision are feed-forward neural networks (FFNs) and probabilistic graphical models (GMs). Various connections between the two have been studied in recent works, such as e.g. expressing mean-field based inference in a GM as an FFN. This paper establishes a new connection between FFNs and GMs. Our key observation is that any FFN implements a certain approximation of a corresponding Bayesian network (BN). We characterize various benefits of having this connection. In particular, it results in a new learning algorithm for BNs. We validate the proposed methods for a classification problem on CIFAR-10 dataset and for binary image segmentation on Weizmann Horse dataset. We show that statistically learned BNs improve performance, having at the same time essentially better generalization capability, than their FFN counterparts. △ Less

Submitted 30 October, 2017; originally announced October 2017.

arXiv:1702.06461 [pdf, other]

Crowd Sourcing Image Segmentation with iaSTAPLE

Authors: Dmitrij Schlesinger, Florian Jug, Gene Myers, Carsten Rother, Dagmar Kainmüller

Abstract: We propose a novel label fusion technique as well as a crowdsourcing protocol to efficiently obtain accurate epithelial cell segmentations from non-expert crowd workers. Our label fusion technique simultaneously estimates the true segmentation, the performance levels of individual crowd workers, and an image segmentation model in the form of a pairwise Markov random field. We term our approach ima… ▽ More We propose a novel label fusion technique as well as a crowdsourcing protocol to efficiently obtain accurate epithelial cell segmentations from non-expert crowd workers. Our label fusion technique simultaneously estimates the true segmentation, the performance levels of individual crowd workers, and an image segmentation model in the form of a pairwise Markov random field. We term our approach image-aware STAPLE (iaSTAPLE) since our image segmentation model seamlessly integrates into the well-known and widely used STAPLE approach. In an evaluation on a light microscopy dataset containing more than 5000 membrane labeled epithelial cells of a fly wing, we show that iaSTAPLE outperforms STAPLE in terms of segmentation accuracy as well as in terms of the accuracy of estimated crowd worker performance levels, and is able to correctly segment 99% of all cells when compared to expert segmentations. These results show that iaSTAPLE is a highly useful tool for crowd sourcing image segmentation. △ Less

Submitted 21 February, 2017; originally announced February 2017.

Comments: Accepted to ISBI2017

arXiv:1612.01397 [pdf, other]

Implicit Modeling -- A Generalization of Discriminative and Generative Approaches

Authors: Dmitrij Schlesinger, Carsten Rother

Abstract: We propose a new modeling approach that is a generalization of generative and discriminative models. The core idea is to use an implicit parameterization of a joint probability distribution by specifying only the conditional distributions. The proposed scheme combines the advantages of both worlds -- it can use powerful complex discriminative models as its parts, having at the same time better gen… ▽ More We propose a new modeling approach that is a generalization of generative and discriminative models. The core idea is to use an implicit parameterization of a joint probability distribution by specifying only the conditional distributions. The proposed scheme combines the advantages of both worlds -- it can use powerful complex discriminative models as its parts, having at the same time better generalization capabilities. We thoroughly evaluate the proposed method for a simple classification task with artificial data and illustrate its advantages for real-word scenarios on a semantic image segmentation problem. △ Less

Submitted 5 December, 2016; originally announced December 2016.

arXiv:1511.05067 [pdf, other]

Joint Training of Generic CNN-CRF Models with Stochastic Optimization

Authors: Alexander Kirillov, Dmitrij Schlesinger, Shuai Zheng, Bogdan Savchynskyy, Philip H. S. Torr, Carsten Rother

Abstract: We propose a new CNN-CRF end-to-end learning framework, which is based on joint stochastic optimization with respect to both Convolutional Neural Network (CNN) and Conditional Random Field (CRF) parameters. While stochastic gradient descent is a standard technique for CNN training, it was not used for joint models so far. We show that our learning method is (i) general, i.e. it applies to arbitrar… ▽ More We propose a new CNN-CRF end-to-end learning framework, which is based on joint stochastic optimization with respect to both Convolutional Neural Network (CNN) and Conditional Random Field (CRF) parameters. While stochastic gradient descent is a standard technique for CNN training, it was not used for joint models so far. We show that our learning method is (i) general, i.e. it applies to arbitrary CNN and CRF architectures and potential functions; (ii) scalable, i.e. it has a low memory footprint and straightforwardly parallelizes on GPUs; (iii) easy in implementation. Additionally, the unified CNN-CRF optimization approach simplifies a potential hardware implementation. We empirically evaluate our method on the task of semantic labeling of body parts in depth images and show that it compares favorably to competing techniques. △ Less

Submitted 14 September, 2016; v1 submitted 16 November, 2015; originally announced November 2015.

Comments: ACCV2016

arXiv:1107.2807 [pdf, other]

Modelling Distributed Shape Priors by Gibbs Random Fields of Second Order

Authors: Boris Flach, Dmitrij Schlesinger

Abstract: We analyse the potential of Gibbs Random Fields for shape prior modelling. We show that the expressive power of second order GRFs is already sufficient to express simple shapes and spatial relations between them simultaneously. This allows to model and recognise complex shapes as spatial compositions of simpler parts. We analyse the potential of Gibbs Random Fields for shape prior modelling. We show that the expressive power of second order GRFs is already sufficient to express simple shapes and spatial relations between them simultaneously. This allows to model and recognise complex shapes as spatial compositions of simpler parts. △ Less

Submitted 14 July, 2011; originally announced July 2011.

Comments: 17 pages, 8 figures

Journal ref: Control Systems and Computers, (2) 2011, pp 14-24

Showing 1–8 of 8 results for author: Schlesinger, D