Search | arXiv e-print repository

Disentangling Domain and Content

Authors: Dan Andrei Iliescu, Aliaksei Mikhailiuk, Damon Wischik, Rafal Mantiuk

Abstract: Many real-world datasets can be divided into groups according to certain salient features (e.g. grouping images by subject, grouping text by font, etc.). Often, machine learning tasks require that these features be represented separately from those manifesting independently of the grouping. For example, image translation entails changing the style of an image while preserving its content. We formalize these two kinds of attributes as two complementary generative factors called "domain" and "content", and address the problem of disentangling them in a fully unsupervised way. To achieve this, we propose a principled, generalizable probabilistic model inspired by the Variational Autoencoder. Our model exhibits state-of-the-art performance on the composite task of generating images by combining the domain of one input with the content of another. Distinctively, it can perform this task in a few-shot, unsupervised manner, without being provided with explicit labelling for either domain or content. The disentangled representations are learned through the combination of a group-wise encoder and a novel domain-confusion loss.

Submitted 15 February, 2022; originally announced February 2022.

arXiv:2103.14616 [pdf, other]

Training a Task-Specific Image Reconstruction Loss

Authors: Aamir Mustafa, Aliaksei Mikhailiuk, Dan Andrei Iliescu, Varun Babbar, Rafal K. Mantiuk

Abstract: The choice of a loss function is an important factor when training neural networks for image restoration problems, such as single image super resolution. The loss function should encourage natural and perceptually pleasing results. A popular choice for a loss is a pre-trained network, such as VGG, which is used as a feature extractor for computing the difference between restored and reference images. However, such an approach has multiple drawbacks: it is computationally expensive, requires regularization and hyper-parameter tuning, and involves a large network trained on an unrelated task. Furthermore, it has been observed that there is no single loss function that works best across all applications and across different datasets. In this work, we instead propose to train a set of loss functions that are application specific in nature. Our loss function comprises a series of discriminators that are trained to detect and penalize the presence of application-specific artifacts. We show that a single natural image and corresponding distortions are sufficient to train our feature extractor that outperforms state-of-the-art loss functions in applications like single image super resolution, denoising, and JPEG artifact removal. Finally, we conclude that an effective loss function does not have to be a good predictor of perceived image quality, but instead needs to be specialized in identifying the distortions for a given restoration method.

Submitted 17 October, 2021; v1 submitted 26 March, 2021; originally announced March 2021.

Comments: Accepted at WACV 2022

arXiv:2012.10758 [pdf, other]

doi 10.1109/TMM.2021.3076298

Consolidated Dataset and Metrics for High-Dynamic-Range Image Quality

Authors: Aliaksei Mikhailiuk, Maria Perez-Ortiz, Dingcheng Yue, Wilson Suen, Rafal K. Mantiuk

Abstract: Increasing popularity of high-dynamic-range (HDR) image and video content brings the need for metrics that could predict the severity of image impairments as seen on displays of different brightness levels and dynamic range. Such metrics should be trained and validated on a sufficiently large subjective image quality dataset to ensure robust performance. As the existing HDR quality datasets are limited in size, we created a Unified Photometric Image Quality dataset (UPIQ) with over 4,000 images by realigning and merging existing HDR and standard-dynamic-range (SDR) datasets. The realigned quality scores share the same unified quality scale across all datasets. Such realignment was achieved by collecting additional cross-dataset quality comparisons and re-scaling data with a psychometric scaling method. Images in the proposed dataset are represented in absolute photometric and colorimetric units, corresponding to light emitted from a display. We use the new dataset to retrain existing HDR metrics and show that the dataset is sufficiently large for training deep architectures. We show the utility of the dataset on brightness aware image compression.

Submitted 10 May, 2021; v1 submitted 19 December, 2020; originally announced December 2020.

arXiv:2004.05691 [pdf, other]

Active Sampling for Pairwise Comparisons via Approximate Message Passing and Information Gain Maximization

Authors: Aliaksei Mikhailiuk, Clifford Wilmot, Maria Perez-Ortiz, Dingcheng Yue, Rafal Mantiuk

Abstract: Pairwise comparison data arise in many domains with subjective assessment experiments, for example in image and video quality assessment. In these experiments observers are asked to express a preference between two conditions. However, many pairwise comparison protocols require a large number of comparisons to infer accurate scores, which may be unfeasible when each comparison is time-consuming (e.g. videos) or expensive (e.g. medical imaging). This motivates the use of an active sampling algorithm that chooses only the most informative pairs for comparison. In this paper we propose ASAP, an active sampling algorithm based on approximate message passing and expected information gain maximization. Unlike most existing methods, which rely on partial updates of the posterior distribution, we are able to perform full updates and therefore much improve the accuracy of the inferred scores. The algorithm relies on three techniques for reducing computational cost: inference based on approximate message passing, selective evaluations of the information gain, and selecting pairs in a batch that forms a minimum spanning tree of the inverse of information gain. We demonstrate, with real and synthetic data, that ASAP offers the highest accuracy of inferred scores compared to the existing methods. We also provide an open-source GPU implementation of ASAP for large-scale experiments.

Submitted 12 April, 2020; originally announced April 2020.

Showing 1–4 of 4 results for author: Mikhailiuk, A