-
Data is Overrated: Perceptual Metrics Can Lead Learning in the Absence of Training Data
Authors:
Tashi Namgyal,
Alexander Hepburn,
Raul Santos-Rodriguez,
Valero Laparra,
Jesus Malo
Abstract:
Perceptual metrics are traditionally used to evaluate the quality of natural signals, such as images and audio. They are designed to mimic the perceptual behaviour of human observers and usually reflect structures found in natural signals. This motivates their use as loss functions for training generative models such that models will learn to capture the structure held in the metric. We take this idea to the extreme in the audio domain by training a compressive autoencoder to reconstruct uniform noise, in lieu of natural data. We show that training with perceptual losses improves the reconstruction of spectrograms and re-synthesized audio at test time over models trained with a standard Euclidean loss. This demonstrates better generalisation to unseen natural signals when using perceptual metrics.
Submitted 6 December, 2023;
originally announced December 2023.
-
What You Hear Is What You See: Audio Quality Metrics From Image Quality Metrics
Authors:
Tashi Namgyal,
Alexander Hepburn,
Raul Santos-Rodriguez,
Valero Laparra,
Jesus Malo
Abstract:
In this study, we investigate the feasibility of utilizing state-of-the-art image perceptual metrics for evaluating audio signals by representing them as spectrograms. The encouraging outcome of the proposed approach is based on the similarity between the neural mechanisms in the auditory and visual pathways. Furthermore, we customise one of the metrics which has a psychoacoustically plausible architecture to account for the peculiarities of sound signals. We evaluate the effectiveness of our proposed metric and several baseline metrics using a music dataset, with promising results in terms of the correlation between the metrics and the perceived quality of audio as rated by human evaluators.
Submitted 30 August, 2023; v1 submitted 19 May, 2023;
originally announced May 2023.
-
On the relation between statistical learning and perceptual distances
Authors:
Alexander Hepburn,
Valero Laparra,
Raul Santos-Rodriguez,
Johannes Ballé,
Jesús Malo
Abstract:
It has been demonstrated many times that the behavior of the human visual system is connected to the statistics of natural images. Since machine learning relies on the statistics of training data as well, the above connection has interesting implications when using perceptual distances (which mimic the behavior of the human visual system) as a loss function. In this paper, we aim to unravel the non-trivial relationships between the probability distribution of the data, perceptual distances, and unsupervised machine learning. To this end, we show that perceptual sensitivity is correlated with the probability of an image in its close neighborhood. We also explore the relation between distances induced by autoencoders and the probability distribution of the training data, as well as how these induced distances are correlated with human perception. Finally, we find perceptual distances do not always lead to noticeable gains in performance over Euclidean distance in common image processing tasks, except when data is scarce and the perceptual distance provides regularization. We propose this may be due to a "double-counting" effect of the image statistics, once in the perceptual distance and once in the training procedure.
Submitted 16 March, 2022; v1 submitted 8 June, 2021;
originally announced June 2021.
-
Physics-Aware Gaussian Processes in Remote Sensing
Authors:
Gustau Camps-Valls,
Luca Martino,
Daniel H. Svendsen,
Manuel Campos-Taberner,
Jordi Muñoz-Marí,
Valero Laparra,
David Luengo,
Francisco Javier García-Haro
Abstract:
Earth observation from satellite sensory data poses challenging problems, where machine learning is currently a key player. In recent years, Gaussian Process (GP) regression has excelled in biophysical parameter estimation tasks from airborne and satellite observations. GP regression is based on solid Bayesian statistics and generally yields efficient and accurate parameter estimates. However, GPs are typically used for inverse modeling based on concurrent observations and in situ measurements only. Very often a forward model encoding the well-understood physical relations between the state vector and the radiance observations is available though and could be useful to improve predictions and understanding. In this work, we review three GP models that respect and learn the physics of the underlying processes in the context of both forward and inverse modeling. After reviewing the traditional application of GPs for parameter retrieval, we introduce a Joint GP (JGP) model that combines in situ measurements and simulated data in a single GP model. Then, we present a latent force model (LFM) for GP modeling that encodes ordinary differential equations to blend data-driven modeling and physical constraints of the system governing equations. The LFM performs multi-output regression, adapts to the signal characteristics, is able to cope with missing data in the time series, and provides explicit latent functions that allow system analysis and evaluation. Finally, we present an Automatic Gaussian Process Emulator (AGAPE) that approximates the forward physical model using concepts from Bayesian optimization and at the same time builds an optimally compact look-up-table for inversion. We give empirical evidence of the performance of these models through illustrative examples of vegetation monitoring and atmospheric modeling.
Submitted 7 December, 2020;
originally announced December 2020.
-
Spatial noise-aware temperature retrieval from infrared sounder data
Authors:
David Malmgren-Hansen,
Valero Laparra,
Allan Aasbjerg Nielsen,
Gustau Camps-Valls
Abstract:
In this paper we present a combined strategy for the retrieval of atmospheric profiles from infrared sounders. The approach considers the spatial information and a noise-dependent dimensionality reduction approach. The extracted features are fed into a canonical linear regression. We compare Principal Component Analysis (PCA) and Minimum Noise Fraction (MNF) for dimensionality reduction, and study the compactness and information content of the extracted features. Assessment of the results is done on a big dataset covering many spatial and temporal situations. PCA is widely used for these purposes but our analysis shows that one can gain significant improvements of the error rates when using MNF instead. In our analysis we also investigate the relationship between error rate improvements when including more spectral and spatial components in the regression model, aiming to uncover the trade-off between model complexity and error rates.
Submitted 9 December, 2020;
originally announced December 2020.
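The PCA-versus-MNF comparison described in the abstract above can be illustrated with a small NumPy sketch. This is not the authors' pipeline: the function names `pca_components` and `mnf_components` are hypothetical, and the sketch assumes noise samples are already available, whereas in practice the noise covariance would have to be estimated from the data. It uses the common formulation of MNF as noise whitening followed by PCA, which orders components by signal-to-noise ratio rather than by raw variance.

```python
import numpy as np

def pca_components(X, n):
    """Return the n directions of largest variance (columns)."""
    cov = np.cov(X, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:n]
    return vecs[:, order]

def mnf_components(X, noise, n):
    """Return the n directions of largest signal-to-noise ratio (columns).

    `noise` holds noise samples with the same columns as X; how they are
    obtained is an assumption left outside this sketch.
    """
    cov_x = np.cov(X, rowvar=False)
    cov_n = np.cov(noise, rowvar=False)
    vals_n, vecs_n = np.linalg.eigh(cov_n)
    vals_n = np.maximum(vals_n, 1e-12)         # guard against zero noise variance
    W = vecs_n / np.sqrt(vals_n)               # whitening: W.T @ cov_n @ W = I
    vals, vecs = np.linalg.eigh(W.T @ cov_x @ W)
    order = np.argsort(vals)[::-1][:n]
    return W @ vecs[:, order]                  # map back to the original space
```

On data where one channel is dominated by high-variance noise, `pca_components` tends to pick that noisy channel first, while `mnf_components` prefers the channel with the better signal-to-noise ratio.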
-
Generation of global vegetation products from EUMETSAT AVHRR/METOP satellites
Authors:
Francisco Javier García-Haro,
Manuel Campos-Taberner,
Beatriz Martínez,
Sergio Sánchez-Ruiz,
María Amparo Gilabert,
Gustau Camps-Valls,
Jordi Muñoz-Marí,
Valero Laparra,
Fernando Camacho,
Jorge Sanchez-Zapero,
Beatriz Fuster
Abstract:
We describe the methodology applied for the retrieval of global LAI, FAPAR and FVC from Advanced Very High Resolution Radiometer (AVHRR) onboard the Meteorological-Operational (MetOp) polar orbiting satellites also known as EUMETSAT Polar System (EPS). A novel approach has been developed for the joint retrieval of three parameters (LAI, FVC, and FAPAR) instead of training one model per parameter. The method relies on multi-output Gaussian Processes Regression (GPR) trained over PROSAIL EPS simulations. A sensitivity analysis is performed to assess several sources of uncertainties in retrievals and maximize the positive impact of modeling the noise in training simulations. We describe the main features of the operational processing chain along with the current status of the global EPS vegetation products, including details about its overall quality and preliminary assessment of the products based on intercomparison with equivalent (MODIS, PROBA-V) satellite vegetation products.
Submitted 7 December, 2020;
originally announced December 2020.
-
Derivation of global vegetation biophysical parameters from EUMETSAT Polar System
Authors:
Francisco Javier García-Haro,
Manuel Campos-Taberner,
Jordi Muñoz-Marí,
Valero Laparra,
Fernando Camacho,
Jorge Sanchez-Zapero,
Gustau Camps-Valls
Abstract:
This paper presents the algorithm developed in LSA-SAF (Satellite Application Facility for Land Surface Analysis) for the derivation of global vegetation parameters from the AVHRR (Advanced Very High-Resolution Radiometer) sensor onboard MetOp (Meteorological-Operational) satellites forming the EUMETSAT (European Organization for the Exploitation of Meteorological Satellites) Polar System (EPS). The suite of LSA-SAF EPS vegetation products includes the leaf area index (LAI), the fractional vegetation cover (FVC), and the fraction of absorbed photosynthetically active radiation (FAPAR). LAI, FAPAR, and FVC characterize the structure and the functioning of vegetation and are key parameters for a wide range of land-biosphere applications. The algorithm is based on a hybrid approach that blends the generalization capabilities offered by physical radiative transfer models with the accuracy and computational efficiency of machine learning methods. One major feature is the implementation of multi-output retrieval methods able to jointly and more consistently estimate all the biophysical parameters at the same time. We propose a multi-output Gaussian process regression (GPRmulti), which outperforms other considered methods over PROSAIL (coupling of PROSPECT and SAIL (Scattering by Arbitrary Inclined Leaves) radiative transfer models) EPS simulations. The global EPS products include uncertainty estimates taking into account the uncertainty captured by the retrieval method and input error propagation. The consistent generation and distribution of the EPS vegetation products will constitute a valuable tool for monitoring of earth surface dynamic processes.
Submitted 7 December, 2020;
originally announced December 2020.
-
Cross-Sensor Adversarial Domain Adaptation of Landsat-8 and Proba-V images for Cloud Detection
Authors:
Gonzalo Mateo-García,
Valero Laparra,
Dan López-Puigdollers,
Luis Gómez-Chova
Abstract:
The number of Earth observation satellites carrying optical sensors with similar characteristics is constantly growing. Despite their similarities and the potential synergies among them, derived satellite products are often developed for each sensor independently. Differences in retrieved radiances lead to significant drops in accuracy, which hampers knowledge and information sharing across sensors. This is particularly harmful for machine learning algorithms, since gathering new ground truth data to train models for each sensor is costly and requires experienced manpower. In this work, we propose a domain adaptation transformation to reduce the statistical differences between images of two satellite sensors in order to boost the performance of transfer learning models. The proposed methodology is based on the Cycle Consistent Generative Adversarial Domain Adaptation (CyCADA) framework that trains the transformation model in an unpaired manner. In particular, Landsat-8 and Proba-V satellites, which present different but compatible spatio-spectral characteristics, are used to illustrate the method. The obtained transformation significantly reduces differences between the image datasets while preserving the spatial and spectral information of adapted images, which is hence useful for any general purpose cross-sensor application. In addition, the training of the proposed adversarial domain adaptation model can be modified to improve the performance in a specific remote sensing application, such as cloud detection, by including a dedicated term in the cost function. Results show that, when the proposed transformation is applied, cloud detection models trained in Landsat-8 data increase cloud detection accuracy in Proba-V.
Submitted 10 June, 2020;
originally announced June 2020.
-
PerceptNet: A Human Visual System Inspired Neural Network for Estimating Perceptual Distance
Authors:
Alexander Hepburn,
Valero Laparra,
Jesús Malo,
Ryan McConville,
Raul Santos-Rodriguez
Abstract:
Traditionally, the vision community has devised algorithms to estimate the distance between an original image and images that have been subject to perturbations. Inspiration was usually taken from the human visual perceptual system and how the system processes different perturbations in order to replicate to what extent it determines our ability to judge image quality. While recent works have presented deep neural networks trained to predict human perceptual quality, very few borrow any intuitions from the human visual system. To address this, we present PerceptNet, a convolutional neural network where the architecture has been chosen to reflect the structure and various stages in the human visual system. We evaluate PerceptNet on various traditional perception datasets and note strong performance on a number of them as compared with traditional image quality metrics. We also show that including a nonlinearity inspired by the human visual system in classical deep neural networks architectures can increase their ability to judge perceptual similarity. Compared to similar deep learning methods, the performance is similar, although our network has a number of parameters that is several orders of magnitude less.
Submitted 17 November, 2020; v1 submitted 28 October, 2019;
originally announced October 2019.
-
Enforcing Perceptual Consistency on Generative Adversarial Networks by Using the Normalised Laplacian Pyramid Distance
Authors:
Alexander Hepburn,
Valero Laparra,
Ryan McConville,
Raul Santos-Rodriguez
Abstract:
In recent years there has been a growing interest in image generation through deep learning. While an important part of the evaluation of the generated images usually involves visual inspection, the inclusion of human perception as a factor in the training process is often overlooked. In this paper we propose an alternative perceptual regulariser for image-to-image translation using conditional generative adversarial networks (cGANs). To do so automatically (avoiding visual inspection), we use the Normalised Laplacian Pyramid Distance (NLPD) to measure the perceptual similarity between the generated image and the original image. The NLPD is based on the principle of normalising the value of coefficients with respect to a local estimate of mean energy at different scales and has already been successfully tested in different experiments involving human perception. We compare this regulariser with the originally proposed L1 distance and note that when using NLPD the generated images contain more realistic values for both local and global contrast. We found that using NLPD as a regulariser improves image segmentation accuracy on generated images as well as improving two no-reference image quality metrics.
Submitted 17 November, 2020; v1 submitted 9 August, 2019;
originally announced August 2019.
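The normalisation principle mentioned in the abstract above (dividing band-pass coefficients by a local estimate of mean energy at several scales) can be sketched roughly as follows. This is a simplified illustration, not the published NLPD: the box filter, the constant `sigma`, the number of scales, and the function names `box_blur`, `normalised_bands`, and `nlpd_like` are all assumptions chosen for brevity.

```python
import numpy as np

def box_blur(img, k=5):
    """Local mean with a k x k box filter and edge padding (stand-in low-pass filter)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def normalised_bands(img, n_scales=3, sigma=0.1):
    """Band-pass coefficients at each scale, divided by their local mean amplitude."""
    bands = []
    current = np.asarray(img, dtype=float)
    for _ in range(n_scales):
        low = box_blur(current)
        band = current - low                   # Laplacian-like band-pass residual
        energy = box_blur(np.abs(band))        # local estimate of mean amplitude
        bands.append(band / (sigma + energy))  # divisive normalisation
        current = low[::2, ::2]                # downsample for the next scale
    return bands

def nlpd_like(a, b, n_scales=3):
    """Average over scales of the RMS difference between normalised bands."""
    pairs = zip(normalised_bands(a, n_scales), normalised_bands(b, n_scales))
    return float(np.mean([np.sqrt(np.mean((x - y) ** 2)) for x, y in pairs]))
```

Used as a regulariser, a distance of this kind would be added to the generator loss in place of a plain L1 term; the sketch only covers the distance itself.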