-
Gap Filling of Biophysical Parameter Time Series with Multi-Output Gaussian Processes
Authors:
Anna Mateo-Sanchis,
Jordi Munoz-Mari,
Manuel Campos-Taberner,
Javier Garcia-Haro,
Gustau Camps-Valls
Abstract:
In this work we evaluate multi-output (MO) Gaussian Process (GP) models based on the linear model of coregionalization (LMC) for estimation of biophysical parameter variables under a gap filling setup. In particular, we focus on LAI and fAPAR over rice areas. We show how this problem cannot be solved with standard single-output (SO) GP models, and how the proposed MO-GP models are able to successfully predict these variables even in high missing data regimes, by implicitly performing an across-domain information transfer.
Submitted 11 December, 2020;
originally announced December 2020.
-
Synergistic Integration of Optical and Microwave Satellite Data for Crop Yield Estimation
Authors:
Anna Mateo-Sanchis,
Maria Piles,
Jordi Muñoz-Marí,
Jose E. Adsuara,
Adrián Pérez-Suay,
Gustau Camps-Valls
Abstract:
Developing accurate models of crop stress, phenology and productivity is of paramount importance, given the increasing need for food. Earth observation remote sensing data provides a unique source of information to monitor crops in a temporally resolved and spatially explicit way. In this study, we propose the combination of multisensor (optical and microwave) remote sensing data for crop yield estimation and forecasting using two novel approaches. We first propose the lag between the Enhanced Vegetation Index (EVI) derived from MODIS and the Vegetation Optical Depth (VOD) derived from SMAP as a new joint metric combining the information from the two satellite sensors in a single feature or descriptor. Our second approach avoids summarizing statistics and uses machine learning to combine full time series of EVI and VOD. This study considers two statistical methods, a regularized linear regression and its nonlinear extension, kernel ridge regression, to directly estimate the county-level surveyed total production, as well as individual yields of the major crops grown in the region: corn, soybean and wheat. The study area includes the US Corn Belt, and we use agricultural survey data from the National Agricultural Statistics Service (USDA-NASS) for the year 2015 for quantitative assessment.
Submitted 11 December, 2020;
originally announced December 2020.
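The abstract above names kernel ridge regression as the nonlinear extension of regularized linear regression. A minimal sketch of its closed-form solution follows. The RBF kernel, the hyperparameter values, and the synthetic one-dimensional data are assumptions chosen for illustration, not the authors' configuration or data.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def krr_fit(X, y, lam=1e-2, gamma=1.0):
    """Solve (K + lam * I) alpha = y for the dual weights alpha."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def krr_predict(X_train, alpha, X_new, gamma=1.0):
    """Prediction is a kernel-weighted sum over the training samples."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha

# Hypothetical toy problem: recover a sine curve from noisy samples.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(80, 1))
y = np.sin(X[:, 0]) + 0.05 * rng.standard_normal(80)

alpha = krr_fit(X, y, lam=1e-2, gamma=0.5)
X_test = np.linspace(-2.5, 2.5, 50)[:, None]
pred = krr_predict(X, alpha, X_test, gamma=0.5)
rmse = np.sqrt(np.mean((pred - np.sin(X_test[:, 0])) ** 2))
```

Setting the kernel to a plain inner product would reduce this to ridge regression in its dual form, which is the sense in which the kernel version extends the linear one.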
-
Spatial noise-aware temperature retrieval from infrared sounder data
Authors:
David Malmgren-Hansen,
Valero Laparra,
Allan Aasbjerg Nielsen,
Gustau Camps-Valls
Abstract:
In this paper we present a combined strategy for the retrieval of atmospheric profiles from infrared sounders. The approach considers the spatial information and a noise-dependent dimensionality reduction approach. The extracted features are fed into a canonical linear regression. We compare Principal Component Analysis (PCA) and Minimum Noise Fraction (MNF) for dimensionality reduction, and study the compactness and information content of the extracted features. Assessment of the results is done on a big dataset covering many spatial and temporal situations. PCA is widely used for these purposes but our analysis shows that one can gain significant improvements of the error rates when using MNF instead. In our analysis we also investigate the relationship between error rate improvements when including more spectral and spatial components in the regression model, aiming to uncover the trade-off between model complexity and error rates.
Submitted 9 December, 2020;
originally announced December 2020.
-
Efficient Nonlinear RX Anomaly Detectors
Authors:
José A. Padrón Hidalgo,
Adrián Pérez-Suay,
Fatih Nar,
Gustau Camps-Valls
Abstract:
Current anomaly detection algorithms are typically challenged by either accuracy or efficiency. More accurate nonlinear detectors are typically slow and not scalable. In this letter, we propose two families of techniques to improve the efficiency of the standard kernel Reed-Xiaoli (RX) method for anomaly detection by approximating the kernel function with either data-independent random Fourier features or a data-dependent basis with the Nyström approach. We compare all methods on both real multi- and hyperspectral images. We show that the proposed efficient methods have a lower computational cost and perform similarly to (or outperform) the standard kernel RX algorithm thanks to their implicit regularization effect. Last but not least, the Nyström approach has an improved power of detection.
Submitted 7 December, 2020;
originally announced December 2020.
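The random Fourier feature route mentioned in the abstract above can be sketched as follows: map each pixel to an explicit feature vector whose inner products approximate an RBF kernel, then apply the linear RX score (a Mahalanobis distance) in that feature space. This is an illustrative sketch only. The function names, the regularization term `reg`, the feature count, and the synthetic data are assumptions, and the authors' actual formulation may differ.

```python
import numpy as np

def random_fourier_features(X, n_features=300, sigma=1.0, seed=0):
    """Map X (n, d) to features whose inner products approximate the RBF
    kernel exp(-||x - y||^2 / (2 * sigma^2))."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_features)) / sigma
    b = rng.uniform(0.0, 2.0 * np.pi, n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def rx_scores(Z_background, Z_test, reg=1e-3):
    """RX score: regularized Mahalanobis distance to the background statistics."""
    mu = Z_background.mean(axis=0)
    Zb = Z_background - mu
    C = Zb.T @ Zb / len(Zb) + reg * np.eye(Zb.shape[1])
    Zt = Z_test - mu
    return np.sum(Zt * np.linalg.solve(C, Zt.T).T, axis=1)

# Synthetic stand-in for image data: pixels as rows, spectral bands as columns.
rng = np.random.default_rng(1)
background = rng.standard_normal((500, 4))
test_pixels = rng.standard_normal((20, 4))

# Both sets must use the same seed so they share one random feature map.
Z_bg = random_fourier_features(background, sigma=2.0, seed=0)
Z_te = random_fourier_features(test_pixels, sigma=2.0, seed=0)
scores = rx_scores(Z_bg, Z_te)  # larger score = more anomalous
```

The cost is driven by the number of random features rather than the number of pixels, since only a feature-by-feature covariance is inverted instead of a pixel-by-pixel kernel matrix.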
-
Machine Learning Information Fusion in Earth Observation: A Comprehensive Review of Methods, Applications and Data Sources
Authors:
S. Salcedo-Sanz,
P. Ghamisi,
M. Piles,
M. Werner,
L. Cuadra,
A. Moreno-Martínez,
E. Izquierdo-Verdiguier,
J. Muñoz-Marí,
Amirhosein Mosavi,
G. Camps-Valls
Abstract:
This paper reviews the most important information fusion data-driven algorithms based on Machine Learning (ML) techniques for problems in Earth observation. Nowadays we observe and model the Earth with a wealth of observations, from a plethora of different sensors, measuring states, fluxes, processes and variables, at unprecedented spatial and temporal resolutions. Earth observation is well equipped with remote sensing systems, mounted on satellites and airborne platforms, but it also involves in-situ observations, numerical models and social media data streams, among other data sources. Data-driven approaches, and ML techniques in particular, are the natural choice to extract significant information from this data deluge. This paper provides a thorough review of the latest work on information fusion for Earth observation with a practical intention, focusing not only on the most relevant previous works in the field but also on the most important Earth observation applications where ML information fusion has obtained significant results. We also review some of the most widely used data sets, models and sources for Earth observation problems, describing their importance and how to obtain the data when needed. Finally, we illustrate the application of ML data fusion with a representative set of case studies, and we discuss the outlook for the near future of the field.
Submitted 7 December, 2020;
originally announced December 2020.
-
Generation of global vegetation products from EUMETSAT AVHRR/METOP satellites
Authors:
Francisco Javier García-Haro,
Manuel Campos-Taberner,
Beatriz Martínez,
Sergio Sánchez-Ruiz,
María Amparo Gilabert,
Gustau Camps-Valls,
Jordi Muñoz-Marí,
Valero Laparra,
Fernando Camacho,
Jorge Sanchez-Zapero,
Beatriz Fuster
Abstract:
We describe the methodology applied for the retrieval of global LAI, FAPAR and FVC from Advanced Very High Resolution Radiometer (AVHRR) onboard the Meteorological-Operational (MetOp) polar orbiting satellites also known as EUMETSAT Polar System (EPS). A novel approach has been developed for the joint retrieval of three parameters (LAI, FVC, and FAPAR) instead of training one model per parameter. The method relies on multi-output Gaussian Processes Regression (GPR) trained over PROSAIL EPS simulations. A sensitivity analysis is performed to assess several sources of uncertainties in retrievals and maximize the positive impact of modeling the noise in training simulations. We describe the main features of the operational processing chain along with the current status of the global EPS vegetation products, including details about its overall quality and preliminary assessment of the products based on intercomparison with equivalent (MODIS, PROBA-V) satellite vegetation products.
Submitted 7 December, 2020;
originally announced December 2020.
-
Convolutional Neural Networks for Multispectral Image Cloud Masking
Authors:
Gonzalo Mateo-García,
Luis Gómez-Chova,
Gustau Camps-Valls
Abstract:
Convolutional neural networks (CNNs) have proven to be state-of-the-art methods for many image classification tasks, and their use is rapidly increasing in remote sensing problems. One of their major strengths is that, when enough data is available, CNNs perform end-to-end learning without the need for custom feature extraction methods. In this work, we study the use of different CNN architectures for cloud masking of Proba-V multispectral images. We compare such methods with the more classical machine learning approach based on feature extraction plus supervised classification. Experimental results suggest that CNNs are a promising alternative for solving cloud masking problems.
Submitted 9 December, 2020;
originally announced December 2020.
-
Derivation of global vegetation biophysical parameters from EUMETSAT Polar System
Authors:
Francisco Javier García-Haro,
Manuel Campos-Taberner,
Jordi Muñoz-Marí,
Valero Laparra,
Fernando Camacho,
Jorge Sanchez-Zapero,
Gustau Camps-Valls
Abstract:
This paper presents the algorithm developed in LSA-SAF (Satellite Application Facility for Land Surface Analysis) for the derivation of global vegetation parameters from the AVHRR (Advanced Very High-Resolution Radiometer) sensor onboard MetOp (Meteorological-Operational) satellites forming the EUMETSAT (European Organization for the Exploitation of Meteorological Satellites) Polar System (EPS). The suite of LSA-SAF EPS vegetation products includes the leaf area index (LAI), the fractional vegetation cover (FVC), and the fraction of absorbed photosynthetically active radiation (FAPAR). LAI, FAPAR, and FVC characterize the structure and the functioning of vegetation and are key parameters for a wide range of land-biosphere applications. The algorithm is based on a hybrid approach that blends the generalization capabilities offered by physical radiative transfer models with the accuracy and computational efficiency of machine learning methods. One major feature is the implementation of multi-output retrieval methods able to jointly and more consistently estimate all the biophysical parameters at the same time. We propose a multi-output Gaussian process regression (GPRmulti), which outperforms other considered methods over PROSAIL (coupling of PROSPECT and SAIL (Scattering by Arbitrary Inclined Leaves) radiative transfer models) EPS simulations. The global EPS products include uncertainty estimates taking into account the uncertainty captured by the retrieval method and input error propagation. The consistent generation and distribution of the EPS vegetation products will constitute a valuable tool for monitoring of earth surface dynamic processes.
Submitted 7 December, 2020;
originally announced December 2020.
-
Causal Inference in Geoscience and Remote Sensing from Observational Data
Authors:
Adrián Pérez-Suay,
Gustau Camps-Valls
Abstract:
Establishing causal relations between random variables from observational data is perhaps the most important challenge in today's science. In remote sensing and geosciences this is of special relevance to better understand the Earth's system and the complex interactions between the governing processes. In this paper, we focus on observational causal inference, thus we try to estimate the correct direction of causation using a finite set of empirical data. In addition, we focus on the more complex bivariate scenario, which requires strong assumptions and in which no conditional independence tests can be used. In particular, we explore the framework of (non-deterministic) additive noise models, which relies on the principle of independence between the cause and the generating mechanism. A practical algorithmic instantiation of such principle only requires 1) two regression models in the forward and backward directions, and 2) the estimation of statistical independence between the obtained residuals and the observations. The direction leading to more independent residuals is taken as the causal one. We instead propose a criterion that uses the sensitivity (derivative) of the dependence estimator. The sensitivity criterion allows identifying the samples that most affect the dependence measure, and hence the criterion is robust to spurious detections. We illustrate performance in a collection of 28 geoscience causal inference problems, in a database of radiative transfer model simulations and machine learning emulators in vegetation parameter modeling involving 182 problems, and in assessing the impact of different regression models in a carbon cycle problem. The criterion achieves state-of-the-art detection rates in all cases and is generally robust to noise sources and distortions.
Submitted 7 December, 2020;
originally announced December 2020.
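The two-step additive noise model recipe described in the abstract above (regress in both directions, then compare how independent the residuals are from the input) can be sketched as below. This illustrates only the baseline residual-independence criterion, not the sensitivity criterion the paper proposes. The polynomial regressor, the HSIC dependence estimate with a median-heuristic bandwidth, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def rbf_gram(v):
    """RBF Gram matrix of a 1-D sample, bandwidth set by the median heuristic."""
    d2 = (v[:, None] - v[None, :]) ** 2
    med = np.median(d2[d2 > 0])
    return np.exp(-d2 / med)

def hsic(a, b):
    """Biased empirical HSIC estimate: trace(K H L H) / n^2."""
    n = len(a)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(rbf_gram(a) @ H @ rbf_gram(b) @ H) / n**2

def residuals(x, y, degree=3):
    """Residuals of a polynomial least-squares fit of y on x (a stand-in regressor)."""
    coef = np.polyfit(x, y, degree)
    return y - np.polyval(coef, x)

def anm_direction(x, y):
    """Pick the direction whose residuals are less dependent on the input."""
    forward = hsic(x, residuals(x, y))   # dependence under the hypothesis x -> y
    backward = hsic(y, residuals(y, x))  # dependence under the hypothesis y -> x
    label = "x->y" if forward < backward else "y->x"
    return label, forward, backward

# Synthetic pair generated as y = f(x) + noise, for illustration only.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
y = x**3 + x + 0.1 * rng.standard_normal(200)
direction, forward, backward = anm_direction(x, y)
```

Any regressor and any dependence measure could be substituted here; the decision rule only compares the two dependence values.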
-
Statistical Learning for End-to-End Simulations
Authors:
J. Vicent,
J. Verrelst,
J. P. Rivera-Caicedo,
N. Sabater,
J. Muñoz-Marí,
G. Camps-Valls,
J. Moreno
Abstract:
End-to-end mission performance simulators (E2ES) are suitable tools to accelerate satellite mission development from concept to deployment. One core element of these E2ES is the generation of synthetic scenes that are observed by the various instruments of an Earth Observation mission. The generation of these scenes relies on Radiative Transfer Models (RTM) for the simulation of light interaction with the Earth surface and atmosphere. However, the execution of advanced RTMs is impractical due to their large computational burden. Classical interpolation and statistical emulation methods of pre-computed Look-Up Tables (LUT) are therefore common practice to generate synthetic scenes in a reasonable time. This work evaluates the accuracy and computation cost of interpolation and emulation methods to sample the input LUT variable space. The results on MODTRAN-based top-of-atmosphere radiance data show that Gaussian Process emulators produced more accurate output spectra than linear interpolation at a fraction of the time. It is concluded that emulation can function as a fast and more accurate alternative to interpolation for LUT parameter space sampling.
Submitted 7 December, 2020;
originally announced December 2020.
-
Gaussian Processes Retrieval of LAI from Sentinel-2 Top-of-Atmosphere Radiance Data
Authors:
Jose Estevez,
Jorge Vicent,
Juan Pablo Rivera-Caicedo,
Pablo Morcillo-Pallarés,
Francesco Vuolo,
Neus Sabater,
Gustau Camps-Valls,
José Moreno,
Jochem Verrelst
Abstract:
Retrieval of vegetation properties from satellite and airborne optical data usually takes place after atmospheric correction, yet it is also possible to develop retrieval algorithms directly from top-of-atmosphere (TOA) radiance data. One of the key vegetation variables that can be retrieved from at-sensor TOA radiance data is the leaf area index (LAI) if algorithms account for variability in the atmosphere. We demonstrate the feasibility of LAI retrieval from Sentinel-2 (S2) TOA radiance data (L1C product) in a hybrid machine learning framework. To achieve this, the coupled leaf-canopy-atmosphere radiative transfer models PROSAIL-6S were used to simulate a look-up table (LUT) of TOA radiance data and associated input variables. This LUT was then used to train the Bayesian machine learning algorithms Gaussian processes regression (GPR) and variational heteroscedastic GPR (VHGPR). PROSAIL simulations were also used to train GPR and VHGPR models for LAI retrieval from S2 images at bottom-of-atmosphere (BOA) level (L2A product) for comparison purposes. The VHGPR models led to consistent LAI maps at BOA and TOA scale. We demonstrated that hybrid LAI retrieval algorithms can be developed from TOA radiance data given a cloud-free sky, thus without the need for atmospheric correction.
Submitted 7 December, 2020;
originally announced December 2020.
-
Retrieval of aboveground crop nitrogen content with a hybrid machine learning method
Authors:
Katja Berger,
Jochem Verrelst,
Jean-Baptiste Féret,
Tobias Hank,
Matthias Wocher,
Wolfram Mauser,
Gustau Camps-Valls
Abstract:
Hyperspectral acquisitions have proven to be the most informative Earth observation data source for the estimation of nitrogen (N) content, which is the main limiting nutrient for plant growth and thus agricultural production. In the past, empirical algorithms have been widely employed to retrieve information on this biochemical plant component from canopy reflectance. However, these approaches do not seek a cause-effect relationship based on physical laws. Moreover, most studies solely relied on the correlation of chlorophyll content with nitrogen, and thus neglected the fact that most N is bound in proteins. Our study presents a hybrid retrieval method using a physically-based approach combined with machine learning regression to estimate crop N content. Within the workflow, the leaf optical properties model PROSPECT-PRO, including the newly calibrated specific absorption coefficients (SAC) of proteins, was coupled with the canopy reflectance model 4SAIL to form PROSAIL-PRO. The latter was then employed to generate a training database to be used for advanced probabilistic machine learning methods: a standard homoscedastic Gaussian process (GP) and a heteroscedastic GP regression that accounts for signal-to-noise relations. Both GP models have the property of providing confidence intervals for the estimates, which sets them apart from other machine learners. GP-based band analysis identified optimal spectral settings with ten bands mainly situated in the shortwave infrared (SWIR) spectral region. Use of well-known protein absorption bands from the literature showed comparable results. Finally, the heteroscedastic GP model was successfully applied to airborne hyperspectral data for N mapping. We conclude that GP algorithms, and in particular the heteroscedastic GP, should be implemented for global agricultural monitoring of aboveground N from future imaging spectroscopy data.
Submitted 7 December, 2020;
originally announced December 2020.
-
Nonlinear Complex PCA for spatio-temporal analysis of global soil moisture
Authors:
Diego Bueso,
Maria Piles,
Gustau Camps-Valls
Abstract:
Soil moisture (SM) is a key state variable of the hydrological cycle, needed to monitor the effects of a changing climate on natural resources. Soil moisture is highly variable in space and time, presenting seasonalities, anomalies and long-term trends, but also important nonlinear behaviours. Here, we introduce a novel fast and nonlinear complex PCA method to analyze the spatio-temporal patterns of the Earth's surface SM. We use global SM estimates acquired during the period 2010-2017 by ESA's SMOS mission. Our approach unveils both time and space modes, trends and periodicities, unlike standard PCA decompositions. Results show the distribution of the total SM variance among its different components, and indicate the dominant modes of temporal variability in surface soil moisture for different regions. The relationship of the derived SM spatio-temporal patterns with El Niño Southern Oscillation (ENSO) conditions is also explored.
Submitted 9 December, 2020;
originally announced December 2020.
-
Disentangling Derivatives, Uncertainty and Error in Gaussian Process Models
Authors:
Juan Emmanuel Johnson,
Valero Laparra,
Gustau Camps-Valls
Abstract:
Gaussian Processes (GPs) are a class of kernel methods that have been shown to be very useful in geoscience applications. They are widely used because they are simple, flexible and provide very accurate estimates for nonlinear problems, especially in parameter retrieval. In addition to a predictive mean function, GPs come equipped with a useful property: the predictive variance function, which provides confidence intervals for the predictions. The GP formulation usually assumes that there is no input noise in the training and testing points, only in the observations. However, this is often not the case in Earth observation problems, where an accurate assessment of the instrument error is usually available. In this paper, we showcase how the derivative of a GP model can be used to provide an analytical error propagation formulation, and we analyze the predictive variance and the propagated error terms in a temperature prediction problem from infrared sounding data.
Submitted 9 December, 2020;
originally announced December 2020.
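To make the idea in the abstract above concrete, the sketch below computes the analytical gradient of a GP predictive mean under an RBF kernel and uses it in a first-order propagation of input noise, g^T Sigma_x g, where g is the gradient at the test point and Sigma_x is the input error covariance. The first-order form, the fixed hyperparameters, and the synthetic data are assumptions for illustration and may not match the paper's exact formulation.

```python
import numpy as np

def rbf(A, B, ell):
    """RBF kernel exp(-||a - b||^2 / (2 * ell^2)) between rows of A and B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / ell**2)

def gp_fit(X, y, ell=1.0, noise=0.1):
    """Dual weights alpha = (K + noise^2 I)^{-1} y of the GP predictive mean."""
    K = rbf(X, X, ell) + noise**2 * np.eye(len(X))
    return np.linalg.solve(K, y)

def gp_mean(x_star, X, alpha, ell=1.0):
    """Predictive mean at a single test point x_star of shape (d,)."""
    return (rbf(x_star[None, :], X, ell) @ alpha)[0]

def gp_mean_grad(x_star, X, alpha, ell=1.0):
    """Analytical gradient of the predictive mean with respect to x_star."""
    k = rbf(x_star[None, :], X, ell)[0]   # kernel values, shape (n,)
    diff = x_star[None, :] - X            # differences, shape (n, d)
    return -(diff * (k * alpha)[:, None]).sum(axis=0) / ell**2

def propagated_variance(x_star, X, alpha, Sigma_x, ell=1.0):
    """First-order propagation of the input covariance through the mean."""
    g = gp_mean_grad(x_star, X, alpha, ell)
    return g @ Sigma_x @ g

# Hypothetical two-input regression problem.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(40, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.05 * rng.standard_normal(40)
alpha = gp_fit(X, y)

x_star = np.array([0.3, -0.4])
Sigma_x = np.diag([0.01, 0.04])  # assumed instrument error covariance on the inputs
grad = gp_mean_grad(x_star, X, alpha)
var_prop = propagated_variance(x_star, X, alpha, Sigma_x)
```

In this sketch the propagated term would be added to the usual GP predictive variance, so that regions where the mean changes quickly receive a larger total uncertainty.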
-
Consistent regression of biophysical parameters with kernel methods
Authors:
Emiliano Díaz,
Adrián Pérez-Suay,
Valero Laparra,
Gustau Camps-Valls
Abstract:
This paper introduces a novel statistical regression framework that allows the incorporation of consistency constraints. Linear and nonlinear (kernel-based) formulations are introduced, and both admit closed-form analytical solutions. The models exploit all the information from a set of drivers while being maximally independent of a set of auxiliary, protected variables. We successfully illustrate the performance in the estimation of chlorophyll content.
Submitted 9 December, 2020;
originally announced December 2020.
-
Kernel Anomalous Change Detection for Remote Sensing Imagery
Authors:
José A. Padrón-Hidalgo,
Valero Laparra,
Nathan Longbotham,
Gustau Camps-Valls
Abstract:
Anomalous change detection (ACD) is an important problem in remote sensing image processing. Detecting not only pervasive but also anomalous or extreme changes has many applications for which methodologies are available. This paper introduces a nonlinear extension of a full family of anomalous change detectors. In particular, we focus on algorithms that utilize Gaussian and elliptically contoured (EC) distributions and extend them to their nonlinear counterparts based on the theory of reproducing kernel Hilbert spaces. We illustrate the performance of the kernel methods introduced in both pervasive and ACD problems with real and simulated changes in multispectral and hyperspectral imagery with different resolutions (AVIRIS, Sentinel-2, WorldView-2, and Quickbird). A wide range of situations is studied in real examples, including droughts, wildfires, and urbanization. Excellent performance compared to linear formulations is achieved, resulting in improved detection accuracy and reduced false-alarm rates. Results also reveal that the EC assumption may still be valid in Hilbert spaces. We provide an implementation of the algorithms as well as a database of natural anomalous changes in real scenarios at http://isp.uv.es/kacd.html.
Submitted 9 December, 2020;
originally announced December 2020.
-
Gradient-based Automatic Look-Up Table Generator for Atmospheric Radiative Transfer Models
Authors:
Jorge Vicent,
Luis Alonso,
Luca Martino,
Neus Sabater,
Jochem Verrelst,
Gustau Camps-Valls
Abstract:
Atmospheric correction of Earth Observation data is one of the most critical steps in the data processing chain of a satellite mission for successful remote sensing applications. Atmospheric Radiative Transfer Models (RTM) inversion methods are typically preferred due to their high accuracy. However, the execution of RTMs on a pixel-per-pixel basis is impractical due to their high computation time, thus large multi-dimensional look-up tables (LUTs) are precomputed for their later interpolation. To further reduce the RTM computation burden and the error in LUT interpolation, we have developed a method to automatically select the minimum and optimal set of nodes to be included in a LUT. We present the gradient-based automatic LUT generator algorithm (GALGA), which relies on the notion of an acquisition function that incorporates (a) the Jacobian evaluation of an RTM, and (b) information about the multivariate distribution of the current nodes. We illustrate the capabilities of GALGA in the automatic construction and optimization of MODerate resolution atmospheric TRANsmission (MODTRAN) LUTs for several input dimensions. Our results indicate that, when compared to a pseudo-random homogeneous distribution of the LUT nodes, GALGA reduces (1) the LUT size by ~75% and (2) the maximum interpolation relative errors by 0.5%. It is concluded that automatic LUT design might benefit from the methodology proposed in GALGA to reduce computation time and interpolation errors.
Submitted 7 December, 2020;
originally announced December 2020.
-
Mapping Leaf Area Index with a Smartphone and Gaussian Processes
Authors:
Manuel Campos-Taberner,
Franciso Javier García-Haro,
Álvaro Moreno,
María Amparo Gilabert,
Sergio Sánchez-Ruiz,
Beatriz Martínez,
Gustau Camps-Valls
Abstract:
Leaf area index (LAI) is a key biophysical parameter used to determine foliage cover and crop growth in environmental studies. Smartphones are nowadays ubiquitous sensor devices with high computational power, moderate cost, and high-quality sensors. A smartphone app, called PocketLAI, was recently presented and tested for acquiring ground LAI estimates. In this letter, we explore the use of state-of-the-art nonlinear Gaussian process regression (GPR) to derive spatially explicit LAI estimates over rice using ground data from PocketLAI and Landsat 8 imagery. GPR has gained popularity in recent years because of its solid Bayesian foundations, which offer not only high accuracy but also confidence intervals for the retrievals. We show the first LAI maps obtained with ground data from a smartphone combined with advanced machine learning. This work compares LAI predictions and confidence intervals of the retrievals obtained with PocketLAI to those obtained with classical instruments, such as digital hemispheric photography (DHP) and LI-COR LAI-2000. This letter shows that all three instruments yield comparable results, but PocketLAI is far cheaper. The proposed methodology hence opens a wide range of possible applications at moderate cost.
△ Less
Submitted 7 December, 2020;
originally announced December 2020.
-
Quantifying vegetation biophysical variables from imaging spectroscopy data: a review on retrieval methods
Authors:
Jochem Verrelst,
Zbyněk Malenovský,
Christiaan Van der Tol,
Gustau Camps-Valls,
Jean-Philippe Gastellu-Etchegorry,
Philip Lewis,
Peter North,
José Moreno
Abstract:
An unprecedented spectroscopic data stream will soon become available with forthcoming Earth-observing satellite missions equipped with imaging spectroradiometers. This data stream will open up a vast array of opportunities to quantify a diversity of biochemical and structural vegetation properties. The processing requirements for such large data streams require reliable retrieval techniques enabling the spatiotemporally explicit quantification of biophysical variables. With the aim of preparing for this new era of Earth observation, this review summarizes the state-of-the-art retrieval methods that have been applied in experimental imaging spectroscopy studies inferring all kinds of vegetation biophysical variables. Identified retrieval methods are categorized into: (1) parametric regression, including vegetation indices, shape indices, and spectral transformations; (2) non-parametric regression, including linear and non-linear machine learning regression algorithms; (3) physically-based, including inversion of radiative transfer models (RTMs) using numerical optimization and look-up-table approaches; and (4) hybrid regression methods, which combine RTM simulations with machine learning regression methods. For each of these categories, an overview of widely applied methods with application to mapping vegetation properties is given. In view of processing imaging spectroscopy data, a critical aspect involves the challenge of dealing with spectral multicollinearity. The ability to provide robust estimates, retrieval uncertainties, and acceptable retrieval processing speed are other important aspects in view of operational processing. Recommendations towards new-generation spectroscopy-based processing chains for the operational production of biophysical variables are given.
Submitted 7 December, 2020;
originally announced December 2020.
-
Retrieval of Case 2 Water Quality Parameters with Machine Learning
Authors:
Ana B. Ruescas,
Gonzalo Mateo-Garcia,
Gustau Camps-Valls,
Martin Hieronymi
Abstract:
Water quality parameters are derived by applying several machine learning regression methods to the Case2eXtreme dataset (C2X). The data used are based on Hydrolight in-water radiative transfer simulations at Sentinel-3 OLCI wavebands, and the application is done exclusively for absorbing waters with high concentrations of coloured dissolved organic matter (CDOM). The regression approaches are: regularized linear, random forest, kernel ridge, Gaussian process and support vector regressors. The validation is made with an independent simulation dataset. A comparison with the OLCI Neural Network Swarm (ONSS) is made as well. The best approach is applied to a sample scene and compared with the standard OLCI product delivered by EUMETSAT/ESA.
Submitted 8 December, 2020;
originally announced December 2020.
-
Multi-temporal and multi-source remote sensing image classification by nonlinear relative normalization
Authors:
Devis Tuia,
Diego Marcos,
Gustau Camps-Valls
Abstract:
Remote sensing image classification exploiting multiple sensors is a very challenging problem: data from different modalities are affected by spectral distortions and mis-alignments of all kinds, and this hampers the successful reuse of models built for one image in other scenes. In order to adapt and transfer models across image acquisitions, one must be able to cope with datasets that are not co-registered, acquired under different illumination and atmospheric conditions, by different sensors, and with scarce ground references. Traditionally, methods based on histogram matching have been used. However, they fail when densities have very different shapes or when there is no corresponding band to be matched between the images. An alternative builds upon \emph{manifold alignment}. Manifold alignment performs a multidimensional relative normalization of the data prior to product generation that can cope with data of different dimensionality (e.g. different number of bands) and possibly unpaired examples. Aligning data distributions is an appealing strategy, since it provides data spaces that are more similar to each other, regardless of the subsequent use of the transformed data. In this paper, we study a methodology that aligns data from different domains in a nonlinear way through {\em kernelization}. We introduce the Kernel Manifold Alignment (KEMA) method, which provides a flexible and discriminative projection map, exploits only a few labeled samples (or semantic ties) in each domain, and reduces to solving a generalized eigenvalue problem. We successfully test KEMA in multi-temporal and multi-source very high resolution classification tasks, as well as on the task of making a model invariant to shadowing for hyperspectral imaging.
Submitted 7 December, 2020;
originally announced December 2020.
-
Active Learning Methods for Efficient Hybrid Biophysical Variable Retrieval
Authors:
Jochem Verrelst,
Sara Dethier,
Juan Pablo Rivera,
Jordi Muñoz-Marí,
Gustau Camps-Valls,
José Moreno
Abstract:
Kernel-based machine learning regression algorithms (MLRAs) are potentially powerful methods for implementation in operational biophysical variable retrieval schemes. However, they face difficulties in coping with large training datasets. With the increasing amount of optical remote sensing data made available for analysis and the possibility of using a large amount of simulated data from radiative transfer models (RTMs) to train kernel MLRAs, efficient data reduction techniques will need to be implemented. Active learning (AL) methods make it possible to select the most informative samples in a dataset. This letter introduces six AL methods for achieving optimized biophysical variable estimation with a manageable training dataset, and their implementation into a Matlab-based MLRA toolbox for semi-automatic use. The AL methods were analyzed on their efficiency in improving the estimation accuracy of leaf area index and chlorophyll content based on PROSAIL simulations. Each of the implemented methods outperformed random sampling, improving retrieval accuracy with lower sampling rates. Practically, AL methods open opportunities to feed advanced MLRAs with RTM-generated training data for development of operational retrieval models.
Submitted 7 December, 2020;
originally announced December 2020.
-
Randomized kernels for large scale Earth observation applications
Authors:
Adrián Pérez-Suay,
Julia Amorós-López,
Luis Gómez-Chova,
Valero Laparra,
Jordi Muñoz-Marí,
Gustau Camps-Valls
Abstract:
Dealing with land cover classification of the new image sources has turned out to be a complex problem requiring large amounts of memory and processing time. In order to cope with these problems, statistical learning has greatly helped in the last years to develop statistical retrieval and classification models that can ingest large amounts of Earth observation data. Kernel methods constitute a family of powerful machine learning algorithms, which have found wide use in remote sensing and geosciences. However, kernel methods are still not widely adopted because of the high computational cost when dealing with large scale problems, such as the inversion of radiative transfer models or the classification of high spatial-spectral-temporal resolution data. This paper introduces an efficient kernel method for fast statistical retrieval of bio-geo-physical parameters and image classification problems. The method approximates a kernel matrix with a set of projections on random bases sampled from the Fourier domain. The method is simple, computationally very efficient in both memory and processing costs, and easily parallelizable. We show that kernel regression and classification are now possible for datasets with millions of examples and high dimensionality. Examples on atmospheric parameter retrieval from hyperspectral infrared sounders like IASI/Metop; large scale emulation and inversion of the familiar PROSAIL radiative transfer model on Sentinel-2 data; and the identification of clouds over landmarks in time series of MSG/Seviri images show the efficiency and effectiveness of the proposed technique.
Submitted 7 December, 2020;
originally announced December 2020.
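The abstract above describes approximating a kernel matrix with projections on random bases sampled from the Fourier domain. A minimal sketch of that general idea, using standard random Fourier features for an RBF kernel, might look as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the kernel width `sigma`, the number of features and the synthetic data are all hypothetical choices made here.

```python
import numpy as np

def rbf_kernel(X, Y, sigma):
    """Exact RBF kernel exp(-||x - y||^2 / (2 sigma^2)), for reference."""
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def random_fourier_features(X, n_features, sigma, rng):
    """Map X (n, d) to Z (n, n_features) so that Z @ Z.T approximates the RBF kernel.

    Frequencies are drawn from the Fourier transform of the RBF kernel,
    a Gaussian with standard deviation 1 / sigma (hypothetical helper).
    """
    d = X.shape[1]
    W = rng.normal(scale=1.0 / sigma, size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Synthetic data, assumed only for this illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))

Z = random_fourier_features(X, n_features=5000, sigma=2.0, rng=rng)
K_exact = rbf_kernel(X, X, sigma=2.0)
K_approx = Z @ Z.T
max_err = np.max(np.abs(K_exact - K_approx))
print("max absolute error of the kernel approximation:", max_err)
```

With the explicit feature matrix `Z`, a linear model trained on `Z` stands in for the kernel machine, so memory and time scale with the number of random features rather than with the square of the number of samples.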
-
Understanding Climate Impacts on Vegetation with Gaussian Processes in Granger Causality
Authors:
Miguel Morata-Dolz,
Diego Bueso,
Maria Piles,
Gustau Camps-Valls
Abstract:
Global warming is leading to unprecedented changes on our planet, with great societal, economic and environmental implications, especially with the growing demand for biofuels and food. Assessing the impact of climate on vegetation is a pressing need. We approached the attribution problem with a novel nonlinear Granger causal (GC) methodology and used a large data archive of remote sensing satellite products, environmental and climatic variables spatio-temporally gridded over more than 30 years. We generalize kernel Granger causality by considering the variables' cross-relations explicitly in Hilbert spaces, and use the covariance in Gaussian processes. The method generalizes the linear and kernel GC methods, and comes with tighter bounds of performance based on Rademacher complexity. Spatially-explicit global Granger footprints of precipitation and soil moisture on vegetation greenness are identified more sharply than with previous GC methods.
Submitted 6 December, 2020;
originally announced December 2020.
-
Remote sensing of vegetation dynamics in agro-ecosystems using SMAP vegetation optical depth and optical vegetation indices
Authors:
M. Piles,
D. Chaparro,
D. Entekhabi,
A. G. Konings,
T. Jagdhuber,
G. Camps-Valls
Abstract:
The ESA's SMOS and the NASA's SMAP missions, launched in 2009 and 2015, respectively, are the first two missions having on-board L-band microwave sensors, which are very sensitive to the water content in soils and vegetation. Focusing on the vegetation signal at L-band, we have implemented an inversion approach for SMAP that allows deriving vegetation optical depth (VOD, a microwave parameter related to biomass and plant water content) alongside soil moisture, without reliance on ancillary optical information on vegetation. This work aims at using this new observational data to monitor the phenology of crops in major global agro-ecosystems and enhance present agricultural monitoring and prediction capabilities. Core agricultural regions have been selected worldwide covering major crops (corn, soybean, wheat, rice). The complementarity and synergies between the microwave vegetation signal, sensitive to biomass water-uptake dynamics, and optical indices, sensitive to canopy greenness, are explored. Results reveal the value of L-band VOD as an independent ecological indicator for global terrestrial biosphere studies.
Submitted 6 December, 2020;
originally announced December 2020.
-
Fusing Optical and SAR time series for LAI gap filling with multioutput Gaussian processes
Authors:
Luca Pipia,
Jordi Muñoz-Marí,
Eatidal Amin,
Santiago Belda,
Gustau Camps-Valls,
Jochem Verrelst
Abstract:
The availability of satellite optical information is often hampered by the natural presence of clouds, which can be problematic for many applications. Persistent clouds over agricultural fields can mask key stages of crop growth, leading to unreliable yield predictions. Synthetic Aperture Radar (SAR) provides all-weather imagery which can potentially overcome this limitation, but given its high and distinct sensitivity to different surface properties, the fusion of SAR and optical data still remains an open challenge. In this work, we propose the use of Multi-Output Gaussian Process (MOGP) regression, a machine learning technique that automatically learns the statistical relationships among multisensor time series, to detect vegetated areas over which the synergy between SAR and optical imagery is profitable. For this purpose, we use the Sentinel-1 Radar Vegetation Index (RVI) and Sentinel-2 Leaf Area Index (LAI) time series over a study area in the northwest of the Iberian Peninsula. Through a physical interpretation of MOGP trained models, we show its ability to provide estimations of LAI even over cloudy periods using the information shared with RVI, which guarantees that the solution always stays tied to real measurements. Results demonstrate the advantage of MOGP especially for long data gaps, where optical-based methods notoriously fail. The leave-one-image-out assessment technique applied to the whole vegetation cover shows MOGP predictions improve on standard GP estimations over short-time gaps (R$^2$ of 74\% vs 68\%, RMSE of 0.4 vs 0.44 $[m^2m^{-2}]$) and especially over long-time gaps (R$^2$ of 33\% vs 12\%, RMSE of 0.5 vs 1.09 $[m^2m^{-2}]$).
Submitted 5 December, 2020;
originally announced December 2020.
-
Information Theory in Density Destructors
Authors:
J. Emmanuel Johnson,
Valero Laparra,
Gustau Camps-Valls,
Raul Santos-Rodríguez,
Jesús Malo
Abstract:
Density destructors are differentiable and invertible transforms that map multivariate PDFs of arbitrary structure (low entropy) into non-structured PDFs (maximum entropy). Multivariate Gaussianization and multivariate equalization are specific examples of this family, which break down the complexity of the original PDF through a set of elementary transforms that progressively remove the structure of the data. We demonstrate how this property of density destructive flows is connected to classical information theory, and how density destructors can be used to get more accurate estimates of information theoretic quantities. Experiments with total correlation and mutual information in multivariate sets illustrate the ability of density destructors compared to competing methods. These results suggest that information theoretic measures may be an alternative optimization criterion when learning density destructive flows.
Submitted 2 December, 2020;
originally announced December 2020.
-
Explicit Granger causality in kernel Hilbert spaces
Authors:
Diego Bueso,
Maria Piles,
Gustau Camps-Valls
Abstract:
Granger causality (GC) is undoubtedly the most widely used method to infer cause-effect relations from observational time series. Several nonlinear alternatives to GC have been proposed based on kernel methods. We generalize kernel Granger causality by considering the variables' cross-relations explicitly in Hilbert spaces. The framework is shown to generalize the linear and kernel GC methods, and comes with tighter bounds of performance based on Rademacher complexity. We successfully evaluate its performance on standard dynamical systems and in identifying the arrow of time in coupled Rössler systems, and exploit it to disclose the footprints of the El Niño-Southern Oscillation (ENSO) phenomenon on soil moisture globally.
Submitted 29 November, 2020;
originally announced November 2020.
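For context on the linear baseline that the abstract above says is generalized, a minimal sketch of classical linear Granger causality follows. It compares the residual variance of an autoregressive model of y with and without lagged values of x. This is not the kernel method of the paper; the helper names, the lag order `p` and the simulated series are assumptions made only for illustration.

```python
import numpy as np

def lagged(series, p):
    """Columns series[t-1], ..., series[t-p] for t = p, ..., n-1 (hypothetical helper)."""
    n = len(series)
    return np.column_stack([series[p - k: n - k] for k in range(1, p + 1)])

def residual_variance(A, target):
    """Mean squared residual of a least-squares fit of target on A plus an intercept."""
    A1 = np.column_stack([np.ones(len(A)), A])
    coef, *_ = np.linalg.lstsq(A1, target, rcond=None)
    residual = target - A1 @ coef
    return np.mean(residual**2)

def granger_index(x, y, p=2):
    """Linear GC index for x -> y: log ratio of restricted to full residual variance."""
    target = y[p:]
    restricted = residual_variance(lagged(y, p), target)
    full = residual_variance(np.hstack([lagged(y, p), lagged(x, p)]), target)
    return np.log(restricted / full)

# Simulated pair in which x drives y with a one-step delay (assumed toy system).
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
y = np.zeros(n)
y[1:] = 0.8 * x[:-1] + 0.1 * rng.normal(size=n - 1)

gc_xy = granger_index(x, y)  # expected to be clearly positive
gc_yx = granger_index(y, x)  # expected to be close to zero
print("GC index x -> y:", gc_xy)
print("GC index y -> x:", gc_yx)
```

A large index in one direction and a near-zero index in the other is the asymmetry that Granger-type methods look for; kernel variants replace the linear regressions with nonlinear ones.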
-
Learning drivers of climate-induced human migrations with Gaussian processes
Authors:
Jose M. Tarraga,
Maria Piles,
Gustau Camps-Valls
Abstract:
In the current context of climate change, extreme heatwaves, droughts, and floods are not only impacting the biosphere and atmosphere but the anthroposphere too. Forcibly displaced human populations are now referred to as climate-induced migrants. In this work, we investigate which climate and structural factors forced major human displacements in the presence of floods and storms during the years 2017-2019. We built, curated, and harmonized a database of meteorological and remote sensing indicators along with structural factors of 27 developing countries worldwide. We show how we can use Gaussian Processes to learn what variables can explain the impact of floods and storms in the context of forced displacements and to develop models that reproduce migration flows. Our results at regional, global, and disaster-specific scales show the importance of structural factors in the determination of the magnitude of displacements. The study may have societal, political, and economic implications.
Submitted 17 November, 2020;
originally announced November 2020.
-
Deep Importance Sampling based on Regression for Model Inversion and Emulation
Authors:
F. Llorente,
L. Martino,
D. Delgado,
G. Camps-Valls
Abstract:
Understanding systems by forward and inverse modeling is a recurrent topic of research in many domains of science and engineering. In this context, Monte Carlo methods have been widely used as powerful tools for numerical inference and optimization. They require the choice of a suitable proposal density that is crucial for their performance. For this reason, several adaptive importance sampling (AIS) schemes have been proposed in the literature. We here present an AIS framework called Regression-based Adaptive Deep Importance Sampling (RADIS). In RADIS, the key idea is the adaptive construction via regression of a non-parametric proposal density (i.e., an emulator), which mimics the posterior distribution and hence minimizes the mismatch between proposal and target densities. RADIS is based on a deep architecture of two (or more) nested IS schemes, in order to draw samples from the constructed emulator. The algorithm is highly efficient since it employs the posterior approximation as proposal density, which can be improved by adding more support points. As a consequence, RADIS asymptotically converges to an exact sampler under mild conditions. Additionally, the emulator produced by RADIS can in turn be used as a cheap surrogate model for further studies. We introduce two specific RADIS implementations that use Gaussian Processes (GPs) and Nearest Neighbors (NN) for constructing the emulator. Several numerical experiments and comparisons show the benefits of the proposed schemes. A real-world application in remote sensing model inversion and emulation confirms the validity of the approach.
Submitted 27 February, 2021; v1 submitted 20 October, 2020;
originally announced October 2020.
-
Living in the Physics and Machine Learning Interplay for Earth Observation
Authors:
Gustau Camps-Valls,
Daniel H. Svendsen,
Jordi Cortés-Andrés,
Álvaro Moreno-Martínez,
Adrián Pérez-Suay,
Jose Adsuara,
Irene Martín,
Maria Piles,
Jordi Muñoz-Marí,
Luca Martino
Abstract:
Most problems in Earth sciences aim to make inferences about the system, where accurate predictions are just a tiny part of the whole problem. Inference means understanding relations between variables and deriving models that are physically interpretable, simple, parsimonious, and mathematically tractable. Machine learning models alone are excellent approximators, but very often do not respect the most elementary laws of physics, like mass or energy conservation, so consistency and confidence are compromised. In this paper, we describe the main challenges ahead in the field, and introduce several ways to live in the Physics and machine learning interplay: to encode differential equations from data, constrain data-driven models with physics-priors and dependence constraints, improve parameterizations, emulate physical models, and blend data-driven and process-based models. This is a collective long-term AI agenda towards developing and applying algorithms capable of discovering knowledge in the Earth system.
Submitted 18 October, 2020;
originally announced October 2020.
-
Gaussianizing the Earth: Multidimensional Information Measures for Earth Data Analysis
Authors:
J. Emmanuel Johnson,
Valero Laparra,
Maria Piles,
Gustau Camps-Valls
Abstract:
Information theory is an excellent framework for analyzing Earth system data because it allows us to characterize uncertainty and redundancy, and is universally interpretable. However, accurately estimating information content is challenging because spatio-temporal data is high-dimensional, heterogeneous and has non-linear characteristics. In this paper, we apply multivariate Gaussianization for probability density estimation which is robust to dimensionality, comes with statistical guarantees, and is easy to apply. In addition, this methodology allows us to estimate information-theoretic measures to characterize multivariate densities: information, entropy, total correlation, and mutual information. We demonstrate how information theory measures can be applied in various Earth system data analysis problems. First, we show how the method can be used to jointly Gaussianize radar backscattering intensities, synthesize hyperspectral data, and quantify the information content in aerial optical images. We also quantify the information content of several variables describing the soil-vegetation status in agro-ecosystems, and investigate the temporal scales that maximize their shared information under extreme events such as droughts. Finally, we measure the relative information content of space and time dimensions in remote sensing products and model simulations involving long records of key variables such as precipitation, sensible heat and evaporation. Results confirm the validity of the method, for which we anticipate a wide use and adoption. Code and demos of the implemented algorithms and information-theory measures are provided.
Submitted 25 November, 2020; v1 submitted 13 October, 2020;
originally announced October 2020.
-
Information Theory Measures via Multidimensional Gaussianization
Authors:
Valero Laparra,
J. Emmanuel Johnson,
Gustau Camps-Valls,
Raul Santos-Rodríguez,
Jesus Malo
Abstract:
Information theory is an outstanding framework to measure uncertainty, dependence and relevance in data and systems. It has several desirable properties for real world applications: it naturally deals with multivariate data, it can handle heterogeneous data types, and the measures can be interpreted in physical units. However, it has not been adopted by a wider audience because obtaining information from multidimensional data is a challenging problem due to the curse of dimensionality. Here we propose an indirect way of computing information based on a multivariate Gaussianization transform. Our proposal mitigates the difficulty of multivariate density estimation by reducing it to a composition of tractable (marginal) operations and simple linear transformations, which can be interpreted as a particular deep neural network. We introduce specific Gaussianization-based methodologies to estimate total correlation, entropy, mutual information and Kullback-Leibler divergence. We compare them to recent estimators showing the accuracy on synthetic data generated from different multivariate distributions. We made the tools and datasets publicly available to provide a test-bed to analyze future methodologies. Results show that our proposal is superior to previous estimators particularly in high-dimensional scenarios; and that it leads to interesting insights in neuroscience, geoscience, computer vision, and machine learning.
Submitted 25 November, 2020; v1 submitted 8 October, 2020;
originally announced October 2020.
-
Kernel Methods and their derivatives: Concept and perspectives for the Earth system sciences
Authors:
J. Emmanuel Johnson,
Valero Laparra,
Adrián Pérez-Suay,
Miguel D. Mahecha,
Gustau Camps-Valls
Abstract:
Kernel methods are powerful machine learning techniques which implement generic non-linear functions to solve complex tasks in a simple way. They have a solid mathematical background and exhibit excellent performance in practice. However, kernel machines are still considered black-box models as the feature mapping is not directly accessible and difficult to interpret. The aim of this work is to show that it is indeed possible to interpret the functions learned by various kernel methods intuitively, despite their complexity. Specifically, we show that derivatives of these functions have a simple mathematical formulation, are easy to compute, and can be applied to many different problems. We note that model function derivatives in kernel machines are proportional to the kernel function derivative. We provide the explicit analytic form of the first and second derivatives of the most common kernel functions with regard to the inputs as well as generic formulas to compute higher order derivatives. We use them to analyze the most used supervised and unsupervised kernel learning methods: Gaussian Processes for regression, Support Vector Machines for classification, Kernel Entropy Component Analysis for density estimation, and the Hilbert-Schmidt Independence Criterion for estimating the dependency between random variables. For all cases we expressed the derivative of the learned function as a linear combination of the kernel function derivative. Moreover, we provide intuitive explanations through illustrative toy examples and show how to improve the interpretation of real applications in the context of spatiotemporal Earth system data cubes. This work reflects on the observation that function derivatives may play a crucial role in kernel methods analysis and understanding.
Submitted 5 October, 2020; v1 submitted 29 July, 2020;
originally announced July 2020.
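The claim in the abstract above, that the derivative of a learned kernel function is a linear combination of kernel derivatives, can be illustrated with a minimal sketch. It is not the authors' code: it assumes an RBF kernel and a kernel ridge regression model (whose solution has the same form as a GP predictive mean), and the helper names `rbf_kernel`, `fit_krr`, `predict` and `gradient` are hypothetical. The analytic gradient is checked against central finite differences.

```python
import numpy as np

def rbf_kernel(X, Z, sigma):
    """k(x, z) = exp(-||x - z||^2 / (2 sigma^2)) for all pairs of rows."""
    sq_dists = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma**2))

def fit_krr(X, y, sigma, lam):
    """Kernel ridge regression weights: alpha = (K + lam I)^{-1} y."""
    K = rbf_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def predict(x, X, alpha, sigma):
    """f(x) = sum_i alpha_i k(x, x_i)."""
    return rbf_kernel(x[None, :], X, sigma)[0] @ alpha

def gradient(x, X, alpha, sigma):
    """df/dx = sum_i alpha_i dk(x, x_i)/dx, with
    dk(x, x_i)/dx = -(x - x_i) / sigma^2 * k(x, x_i) for the RBF kernel."""
    k = rbf_kernel(x[None, :], X, sigma)[0]          # shape (n,)
    dk = -(x[None, :] - X) / sigma**2 * k[:, None]   # shape (n, d)
    return dk.T @ alpha                              # shape (d,)

# Toy data (assumed for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(40, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]
sigma, lam = 1.0, 1e-2
alpha = fit_krr(X, y, sigma, lam)

x0 = np.array([0.3, -0.7])
analytic = gradient(x0, X, alpha, sigma)

# Central finite differences as an independent check.
eps = 1e-5
numeric = np.array([
    (predict(x0 + eps * e, X, alpha, sigma)
     - predict(x0 - eps * e, X, alpha, sigma)) / (2 * eps)
    for e in np.eye(2)
])
print(analytic, numeric)
```

The same pattern applies to other differentiable kernels: only the expression for `dk` changes, while the weights `alpha` are reused unchanged.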
-
A Perspective on Gaussian Processes for Earth Observation
Authors:
Gustau Camps-Valls,
Dino Sejdinovic,
Jakob Runge,
Markus Reichstein
Abstract:
Earth observation (EO) by airborne and satellite remote sensing and in-situ observations plays a fundamental role in monitoring our planet. In the last decade, machine learning, and Gaussian processes (GPs) in particular, have attained outstanding results in the estimation of bio-geo-physical variables from the acquired images at local and global scales in a time-resolved manner. GPs provide not only accurate estimates but also principled uncertainty estimates for the predictions, can easily accommodate multimodal data coming from different sensors and from multitemporal acquisitions, and allow the introduction of physical knowledge and a formal treatment of uncertainty quantification and error propagation. Despite great advances in forward and inverse modelling, GP models still have to face important challenges that are reviewed in this perspective paper. GP models should evolve towards data-driven physics-aware models that respect signal characteristics, are consistent with elementary laws of physics, and move from pure regression to observational causal inference.
Submitted 2 July, 2020;
originally announced July 2020.
-
Accounting for Input Noise in Gaussian Process Parameter Retrieval
Authors:
J. Emmanuel Johnson,
Valero Laparra,
Gustau Camps-Valls
Abstract:
Gaussian processes (GPs) are a class of kernel methods that have proven very useful in geoscience and remote sensing applications for parameter retrieval, model inversion, and emulation. They are widely used because they are simple, flexible, and provide accurate estimates. GPs are based on a Bayesian statistical framework which provides a posterior probability function for each estimation. Therefore, besides the usual prediction (given in this case by the mean function), GPs come equipped with the possibility to obtain a predictive variance (i.e., error bars, confidence intervals) for each prediction. Unfortunately, the GP formulation usually assumes that there is no noise in the inputs, only in the observations. However, this is often not the case in Earth observation problems where an accurate assessment of the measuring instrument error is typically available, and where there is huge interest in characterizing the error propagation through the processing pipeline. In this letter, we demonstrate how one can account for input noise estimates using a GP model formulation which propagates the error terms using the derivative of the predictive mean function. We analyze the resulting predictive variance term and show how it more accurately represents the model error in a temperature prediction problem from infrared sounding data.
Submitted 20 May, 2020;
originally announced May 2020.
-
Nonlinear PCA for Spatio-Temporal Analysis of Earth Observation Data
Authors:
Diego Bueso,
Maria Piles,
Gustau Camps-Valls
Abstract:
Remote sensing observations, products and simulations are fundamental sources of information to monitor our planet and its climate variability. Uncovering the main modes of spatial and temporal variability in Earth data is essential to analyze and understand the underlying physical dynamics and processes driving the Earth System. Dimensionality reduction methods can work with spatio-temporal datasets and decompose the information efficiently. Principal Component Analysis (PCA), also known as Empirical Orthogonal Functions (EOF) in geophysics, has been traditionally used to analyze climatic data. However, when nonlinear feature relations are present, PCA/EOF fails. In this work, we propose a nonlinear PCA method to deal with spatio-temporal Earth System data. The proposed method, called Rotated Complex Kernel PCA (ROCK-PCA for short), works in reproducing kernel Hilbert spaces to account for nonlinear processes, operates in the complex kernel domain to account for both space and time features, and adds an extra rotation for improved flexibility. The result is an explicitly resolved spatio-temporal decomposition of the Earth data cube. The method is unsupervised and computationally very efficient. We illustrate its ability to uncover spatio-temporal patterns using synthetic experiments and real data. Results of the decomposition of three essential climate variables are shown: satellite-based global Gross Primary Productivity (GPP) and Soil Moisture (SM), and reanalysis Sea Surface Temperature (SST) data. The ROCK-PCA method allows identifying their annual and seasonal oscillations, as well as their non-seasonal trends and spatial variability patterns.
Submitted 27 January, 2020;
originally announced February 2020.
-
Active emulation of computer codes with Gaussian processes -- Application to remote sensing
Authors:
Daniel Heestermans Svendsen,
Luca Martino,
Gustau Camps-Valls
Abstract:
Many fields of science and engineering rely on running simulations with complex and computationally expensive models to understand the involved processes in the system of interest. Nevertheless, the high cost involved hampers reliable and exhaustive simulations. Very often such codes incorporate heuristics that ironically make them less tractable and transparent. This paper introduces an active learning methodology for adaptively constructing surrogate models, i.e. emulators, of such costly computer codes in a multi-output setting. The proposed technique is sequential and adaptive, and is based on the optimization of a suitable acquisition function. It aims to achieve accurate approximations, model tractability, as well as compact and expressive simulated datasets. In order to achieve this, the proposed Active Multi-Output Gaussian Process Emulator (AMOGAPE) combines the predictive capacity of Gaussian Processes (GPs) with the design of an acquisition function that favors sampling in low density and fluctuating regions of the approximation functions. Comparing different acquisition functions, we illustrate the promising performance of the method for the construction of emulators with toy examples, as well as for a widely used remote sensing transfer code.
Submitted 13 December, 2019;
originally announced December 2019.
-
Kernel Dependence Regularizers and Gaussian Processes with Applications to Algorithmic Fairness
Authors:
Zhu Li,
Adrian Perez-Suay,
Gustau Camps-Valls,
Dino Sejdinovic
Abstract:
Current adoption of machine learning in industrial, societal and economical activities has raised concerns about the fairness, equity and ethics of automated decisions. Predictive models are often developed using biased datasets and thus retain or even exacerbate biases in their decisions and recommendations. Removing the sensitive covariates, such as gender or race, is insufficient to remedy this issue since the biases may be retained due to other related covariates. We present a regularization approach to this problem that trades off predictive accuracy of the learned models (with respect to biased labels) for the fairness in terms of statistical parity, i.e. independence of the decisions from the sensitive covariates. In particular, we consider a general framework of regularized empirical risk minimization over reproducing kernel Hilbert spaces and impose an additional regularizer of dependence between predictors and sensitive covariates using kernel-based measures of dependence, namely the Hilbert-Schmidt Independence Criterion (HSIC) and its normalized version. This approach leads to a closed-form solution in the case of squared loss, i.e. ridge regression. Moreover, we show that the dependence regularizer has an interpretation as modifying the corresponding Gaussian process (GP) prior. As a consequence, a GP model with a prior that encourages fairness to sensitive variables can be derived, allowing principled hyperparameter selection and studying of the relative relevance of covariates under fairness constraints. Experimental results in synthetic examples and in real problems of income and crime prediction illustrate the potential of the approach to improve fairness of automated decisions.
Submitted 11 November, 2019;
originally announced November 2019.
-
Joint Gaussian Processes for Biophysical Parameter Retrieval
Authors:
Daniel Heestermans Svendsen,
Luca Martino,
Manuel Campos-Taberner,
Francisco Javier García-Haro,
Gustau Camps-Valls
Abstract:
Solving inverse problems is central to geosciences and remote sensing. Radiative transfer models (RTMs) represent mathematically the physical laws which govern the phenomena in remote sensing applications (forward models). The numerical inversion of the RTM equations is a challenging and computationally demanding problem, and for this reason, often the application of a nonlinear statistical regression is preferred. In general, regression models predict the biophysical parameter of interest from the corresponding received radiance. However, this approach does not employ the physical information encoded in the RTMs. An alternative strategy, which attempts to include the physical knowledge, consists in learning a regression model trained using data simulated by an RTM code. In this work, we introduce a nonlinear nonparametric regression model which combines the benefits of the two aforementioned approaches. The inversion is performed taking into account jointly both real observations and RTM-simulated data. The proposed Joint Gaussian Process (JGP) provides a solid framework for exploiting the regularities between the two types of data. The JGP automatically detects the relative quality of the simulated and real data, and combines them accordingly. This occurs by learning an additional hyper-parameter w.r.t. a standard GP model, and fitting parameters through maximizing the pseudo-likelihood of the real observations. The resulting scheme is both simple and robust, i.e., capable of adapting to different scenarios. The advantages of the JGP method compared to benchmark strategies are shown considering RTM-simulated and real observations in different experiments. Specifically, we consider leaf area index (LAI) retrieval from Landsat data combined with simulated data generated by the PROSAIL model.
Submitted 14 November, 2017;
originally announced November 2017.
-
Fair Kernel Learning
Authors:
Adrián Pérez-Suay,
Valero Laparra,
Gonzalo Mateo-García,
Jordi Muñoz-Marí,
Luis Gómez-Chova,
Gustau Camps-Valls
Abstract:
New social and economic activities massively exploit big data and machine learning algorithms to do inference on people's lives. Applications include automatic curricula evaluation, wage determination, and risk assessment for credits and loans. Recently, many governments and institutions have raised concerns about the lack of fairness, equity and ethics in machine learning to treat these problems. It has been shown that not including sensitive features that bias fairness, such as gender or race, is not enough to mitigate the discrimination when other related features are included. Instead, including fairness in the objective function has been shown to be more efficient.
We present novel fair regression and dimensionality reduction methods built on a previously proposed fair classification framework. Both methods rely on using the Hilbert Schmidt independence criterion as the fairness term. Unlike previous approaches, this allows us to simplify the problem and to use multiple sensitive variables simultaneously. Replacing the linear formulation by kernel functions allows the methods to deal with nonlinear problems. For both linear and nonlinear formulations the solution reduces to solving simple matrix inversions or generalized eigenvalue problems. This simplifies the evaluation of the solutions for different trade-off values between the predictive error and fairness terms. We illustrate the usefulness of the proposed methods in toy examples, and evaluate their performance on real world datasets to predict income using gender and/or race discrimination as sensitive variables, and contraceptive method prediction under demographic and socio-economic sensitive descriptors.
Submitted 16 October, 2017;
originally announced October 2017.
-
Remote Sensing Image Classification with Large Scale Gaussian Processes
Authors:
Pablo Morales-Alvarez,
Adrian Perez-Suay,
Rafael Molina,
Gustau Camps-Valls
Abstract:
Current remote sensing image classification problems have to deal with an unprecedented amount of heterogeneous and complex data sources. Upcoming missions will soon provide large data streams that will make land cover/use classification difficult. Machine learning classifiers can help with this, and many methods are currently available. A popular kernel classifier is the Gaussian process classifier (GPC), since it approaches the classification problem with a solid probabilistic treatment, thus yielding confidence intervals for the predictions as well as results that are very competitive with state-of-the-art neural networks and support vector machines. However, its computational cost is prohibitive for large scale applications, and constitutes the main obstacle precluding wide adoption. This paper tackles this problem by introducing two novel efficient methodologies for Gaussian Process (GP) classification. We first include the standard random Fourier features approximation into GPC, which largely decreases its computational cost and permits large scale remote sensing image classification. In addition, we propose a model which avoids randomly sampling a number of Fourier frequencies, and alternatively learns the optimal ones within a variational Bayes approach. The performance of the proposed methods is illustrated in complex problems of cloud detection from multispectral imagery and infrared sounding data. Excellent empirical results support the proposal in both computational cost and accuracy.
Submitted 3 October, 2017; v1 submitted 2 October, 2017;
originally announced October 2017.
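The standard random Fourier features approximation mentioned in the abstract above can be sketched as follows. This is only an illustration of the kernel approximation step under assumed settings (an RBF kernel, toy Gaussian data, hypothetical helper names); it does not reproduce the paper's GP classifier or its variational learning of frequencies.

```python
import numpy as np

def rbf_kernel(X, Z, sigma):
    """Exact RBF kernel k(x, z) = exp(-||x - z||^2 / (2 sigma^2))."""
    sq_dists = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma**2))

def random_fourier_features(X, W, b):
    """Map X (n, d) to features z(x) = sqrt(2/D) cos(W x + b), shape (n, D)."""
    D = W.shape[0]
    return np.sqrt(2.0 / D) * np.cos(X @ W.T + b)

rng = np.random.default_rng(0)
n, d, D, sigma = 50, 5, 10000, 2.0   # toy sizes, assumed for illustration
X = rng.normal(size=(n, d))

# Frequencies drawn from the spectral density of the RBF kernel, N(0, I / sigma^2),
# and phases drawn uniformly from [0, 2 pi).
W = rng.normal(scale=1.0 / sigma, size=(D, d))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

Z = random_fourier_features(X, W, b)
K_exact = rbf_kernel(X, X, sigma)
K_approx = Z @ Z.T                    # inner products of features approximate the kernel
max_abs_error = np.abs(K_exact - K_approx).max()
print(max_abs_error)
```

Working with the explicit `(n, D)` feature matrix instead of the `(n, n)` kernel matrix is what lets linear-in-`n` solvers be used when `n` is much larger than `D`.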
-
Group Importance Sampling for Particle Filtering and MCMC
Authors:
L. Martino,
V. Elvira,
G. Camps-Valls
Abstract:
Bayesian methods and their implementations by means of sophisticated Monte Carlo techniques have become very popular in signal processing in recent years. Importance Sampling (IS) is a well-known Monte Carlo technique that approximates integrals involving a posterior distribution by means of weighted samples. In this work, we study the assignment of a single weighted sample which compresses the information contained in a population of weighted samples. Part of the theory that we present as Group Importance Sampling (GIS) has been employed implicitly in different works in the literature. The provided analysis yields several theoretical and practical consequences. For instance, we discuss the application of GIS in the Sequential Importance Resampling framework and show that Independent Multiple Try Metropolis schemes can be interpreted as a standard Metropolis-Hastings algorithm, following the GIS approach. We also introduce two novel Markov Chain Monte Carlo (MCMC) techniques based on GIS. The first one, named Group Metropolis Sampling method, produces a Markov chain of sets of weighted samples. All these sets are then employed for obtaining a unique global estimator. The second one is the Distributed Particle Metropolis-Hastings technique, where different parallel particle filters are jointly used to drive an MCMC algorithm. Different resampled trajectories are compared and then tested with a proper acceptance probability. The novel schemes are tested in different numerical experiments such as learning the hyperparameters of Gaussian Processes, two localization problems in a wireless sensor network (with synthetic and real data) and the tracking of vegetation parameters given satellite observations, where they are compared with several benchmark Monte Carlo techniques. Three illustrative Matlab demos are also provided.
Submitted 4 August, 2018; v1 submitted 10 April, 2017;
originally announced April 2017.
-
The Recycling Gibbs Sampler for Efficient Learning
Authors:
Luca Martino,
Victor Elvira,
Gustau Camps-Valls
Abstract:
Monte Carlo methods are essential tools for Bayesian inference. Gibbs sampling is a well-known Markov chain Monte Carlo (MCMC) algorithm, extensively used in signal processing, machine learning, and statistics, employed to draw samples from complicated high-dimensional posterior distributions. The key point for the successful application of the Gibbs sampler is the ability to draw efficiently samples from the full-conditional probability density functions. Since in the general case this is not possible, in order to speed up the convergence of the chain, it is required to generate auxiliary samples whose information is eventually disregarded. In this work, we show that these auxiliary samples can be recycled within the Gibbs estimators, improving their efficiency with no extra cost. This novel scheme arises naturally after pointing out the relationship between the standard Gibbs sampler and the chain rule used for sampling purposes. Numerical simulations involving simple and real inference problems confirm the excellent performance of the proposed scheme in terms of accuracy and computational efficiency. In particular we give empirical evidence of performance in a toy example, inference of Gaussian processes hyperparameters, and learning dependence graphs through regression.
Submitted 20 December, 2017; v1 submitted 21 November, 2016;
originally announced November 2016.
-
Sensitivity Maps of the Hilbert-Schmidt Independence Criterion
Authors:
Adrián Pérez-Suay,
Gustau Camps-Valls
Abstract:
Kernel dependence measures yield accurate estimates of nonlinear relations between random variables, and they are also endowed with solid theoretical properties and convergence rates. Besides, the empirical estimates are easy to compute in closed form just involving linear algebra operations. However, they are hampered by two important problems: the high computational cost involved, as two kernel matrices of the sample size have to be computed and stored, and the interpretability of the measure, which remains hidden behind the implicit feature map. We here address these two issues. We introduce the Sensitivity Maps (SMs) for the Hilbert-Schmidt independence criterion (HSIC). Sensitivity maps allow us to explicitly analyze and visualize the relative relevance of both examples and features on the dependence measure. We also present the randomized HSIC (RHSIC) and its corresponding sensitivity maps to cope with large scale problems. We build upon the framework of random features and Bochner's theorem to approximate the involved kernels in the canonical HSIC. The power of the RHSIC measure scales favourably with the number of samples, and it approximates HSIC and the sensitivity maps efficiently. Convergence bounds of both the measure and the sensitivity map are also provided. Our proposal is illustrated in synthetic examples, and challenging real problems of dependence estimation, feature selection, and causal inference from empirical data.
Submitted 2 November, 2016;
originally announced November 2016.
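The closed-form empirical HSIC estimate referred to in the abstract above, which needs two sample-size kernel matrices and only linear algebra, can be sketched as below. The sketch assumes RBF kernels, the biased estimator with a 1/n^2 normalization (other conventions exist), and toy data; the names `rbf_gram` and `hsic` are hypothetical. It does not implement the paper's sensitivity maps or the randomized RHSIC.

```python
import numpy as np

def rbf_gram(X, sigma):
    """Gram matrix of the RBF kernel on the rows of X, shape (n, n)."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma**2))

def hsic(X, Y, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC: trace(K H L H) / n^2, with H the centering matrix."""
    n = X.shape[0]
    K = rbf_gram(X, sigma_x)
    L = rbf_gram(Y, sigma_y)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / n**2

rng = np.random.default_rng(0)
n = 300
x = rng.normal(size=(n, 1))

# Nonlinearly dependent pair: y is a noisy function of x.
y_dep = x**2 + 0.1 * rng.normal(size=(n, 1))

# Independent pair: same marginal construction, but built from a fresh draw.
z = rng.normal(size=(n, 1))
y_ind = z**2 + 0.1 * rng.normal(size=(n, 1))

hsic_dep = hsic(x, y_dep)
hsic_ind = hsic(x, y_ind)
print(hsic_dep, hsic_ind)
```

Both Gram matrices are `n` by `n`, which is the memory and compute cost the abstract identifies as a bottleneck for large samples.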
-
Optimized Kernel Entropy Components
Authors:
Emma Izquierdo-Verdiguier,
Valero Laparra,
Robert Jenssen,
Luis Gómez-Chova,
Gustau Camps-Valls
Abstract:
This work addresses two main issues of the standard Kernel Entropy Component Analysis (KECA) algorithm: the optimization of the kernel decomposition and the optimization of the Gaussian kernel parameter. KECA roughly reduces to a sorting of the importance of kernel eigenvectors by entropy instead of by variance as in Kernel Principal Components Analysis. In this work, we propose an extension of the KECA method, named Optimized KECA (OKECA), that directly extracts the optimal features retaining most of the data entropy by means of compacting the information in very few features (often in just one or two). The proposed method produces features which have higher expressive power. In particular, it is based on the Independent Component Analysis (ICA) framework, and introduces an extra rotation to the eigen-decomposition, which is optimized via gradient ascent search. This maximum entropy preservation suggests that OKECA features are more efficient than KECA features for density estimation. In addition, a critical issue in both methods is the selection of the kernel parameter since it critically affects the resulting performance. Here we analyze the most common kernel length-scale selection criteria. Results of both methods are illustrated in different synthetic and real problems. Results show that 1) OKECA returns projections with more expressive power than KECA, 2) the most successful rule for estimating the kernel parameter is based on maximum likelihood, and 3) OKECA is more robust to the selection of the length-scale parameter in kernel density estimation.
Submitted 9 March, 2016;
originally announced March 2016.
-
Nonlinearities and Adaptation of Color Vision from Sequential Principal Curves Analysis
Authors:
Valero Laparra,
Sandra Jiménez,
Gustavo Camps-Valls,
Jesús Malo
Abstract:
Mechanisms of human color vision are characterized by two phenomenological aspects: the system is nonlinear and adaptive to changing environments. Conventional attempts to derive these features from statistics use separate arguments for each aspect. The few statistical approaches that do consider both phenomena simultaneously follow parametric formulations based on empirical models. Therefore, it may be argued that the behavior does not come directly from the color statistics but from the convenient functional form adopted. In addition, many times the whole statistical analysis is based on simplified databases that disregard relevant physical effects in the input signal, as for instance by assuming flat Lambertian surfaces. Here we address the simultaneous statistical explanation of (i) the nonlinear behavior of achromatic and chromatic mechanisms in a fixed adaptation state, and (ii) the change of such behavior. Both phenomena emerge directly from the samples through a single data-driven method: the Sequential Principal Curves Analysis (SPCA) with local metric. SPCA is a new manifold learning technique to derive a set of sensors adapted to the manifold using different optimality criteria. A new database of colorimetrically calibrated images of natural objects under these illuminants was collected. The results obtained by applying SPCA show that the psychophysical behavior on color discrimination thresholds, discount of the illuminant and corresponding pairs in asymmetric color matching, emerge directly from realistic data regularities assuming no a priori functional form. These results provide stronger evidence for the hypothesis of a statistically driven organization of color sensors. Moreover, the obtained results suggest that color perception at this low abstraction level may be guided by an error minimization strategy rather than by the information maximization principle.
Submitted 31 January, 2016;
originally announced February 2016.
-
Iterative Gaussianization: from ICA to Random Rotations
Authors:
Valero Laparra,
Gustavo Camps-Valls,
Jesús Malo
Abstract:
Most signal processing problems involve the challenging task of multidimensional probability density function (PDF) estimation. In this work, we propose a solution to this problem by using a family of Rotation-based Iterative Gaussianization (RBIG) transforms. The general framework consists of the sequential application of a univariate marginal Gaussianization transform followed by an orthonormal transform. The proposed procedure looks for differentiable transforms to a known PDF so that the unknown PDF can be estimated at any point of the original domain. In particular, we aim at a zero mean unit covariance Gaussian for convenience. RBIG is formally similar to classical iterative Projection Pursuit (PP) algorithms. However, we show that, unlike in PP methods, the particular class of rotations used has no special qualitative relevance in this context, since looking for interestingness is not a critical issue for PDF estimation. The key difference is that our approach focuses on the univariate part (marginal Gaussianization) of the problem rather than on the multivariate part (rotation). This difference implies that one may select the most convenient rotation suited to each practical application. The differentiability, invertibility and convergence of RBIG are theoretically and experimentally analyzed. Relation to other methods, such as Radial Gaussianization (RG), one-class support vector domain description (SVDD), and deep neural networks (DNN) is also pointed out. The practical performance of RBIG is successfully illustrated in a number of multidimensional problems such as image synthesis, classification, denoising, and multi-information estimation.
Submitted 31 January, 2016;
originally announced February 2016.
-
Principal Polynomial Analysis
Authors:
Valero Laparra,
Sandra Jiménez,
Devis Tuia,
Gustau Camps-Valls,
Jesús Malo
Abstract:
This paper presents a new framework for manifold learning based on a sequence of principal polynomials that capture the possibly nonlinear nature of the data. The proposed Principal Polynomial Analysis (PPA) generalizes PCA by modeling the directions of maximal variance by means of curves, instead of straight lines. Contrary to previous approaches, PPA reduces to performing simple univariate regressions, which makes it computationally feasible and robust. Moreover, PPA shows a number of interesting analytical properties. First, PPA is a volume-preserving map, which in turn guarantees the existence of the inverse. Second, such an inverse can be obtained in closed form. Invertibility is an important advantage over other learning methods, because it permits understanding the identified features in the input domain where the data has physical meaning. Moreover, it allows evaluating the performance of dimensionality reduction in sensible (input-domain) units. Volume preservation also allows an easy computation of information theoretic quantities, such as the reduction in multi-information after the transform. Third, the analytical nature of PPA leads to a clear geometrical interpretation of the manifold: it allows the computation of Frenet-Serret frames (local features) and of generalized curvatures at any point of the space. And fourth, the analytical Jacobian allows the computation of the metric induced by the data, thus generalizing the Mahalanobis distance. These properties are demonstrated theoretically and illustrated experimentally. The performance of PPA is evaluated in dimensionality and redundancy reduction, in both synthetic and real datasets from the UCI repository.
Submitted 31 January, 2016;
originally announced February 2016.
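The abstract above describes PPA as replacing straight principal directions with curves fitted by univariate regressions, with an inverse available in closed form. One way to read that description is sketched below for a single step. This is a minimal illustration, not the authors' implementation: the function names `ppa_step` and `ppa_step_inverse`, the polynomial degree, and the toy parabola data are all assumptions made for the example.

```python
import numpy as np

def ppa_step(X, degree=2):
    """One illustrative PPA-style step (hypothetical helper, not the paper's code).

    Projects the data onto its leading PCA direction, then fits a polynomial
    that predicts the remaining coordinates from that projection and removes it.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Orthonormal PCA basis, leading direction first.
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    E = eigvecs[:, np.argsort(eigvals)[::-1]]
    Z = Xc @ E
    alpha, rest = Z[:, 0], Z[:, 1:]
    # Univariate polynomial regression: columns alpha^degree, ..., alpha^0.
    V = np.vander(alpha, degree + 1)
    W, *_ = np.linalg.lstsq(V, rest, rcond=None)
    residual = rest - V @ W
    return alpha, residual, (mean, E, W)

def ppa_step_inverse(alpha, residual, params):
    """Closed-form inverse of ppa_step: add the curve back and undo the rotation."""
    mean, E, W = params
    V = np.vander(alpha, W.shape[0])
    Z = np.column_stack([alpha, residual + V @ W])
    return Z @ E.T + mean

# Toy data (assumed for illustration): a noisy parabola in 2-D.
rng = np.random.default_rng(0)
t = rng.uniform(-1.0, 1.0, size=500)
X = np.column_stack([t, t**2 + 0.01 * rng.normal(size=t.size)])

alpha, residual, params = ppa_step(X, degree=2)
X_rec = ppa_step_inverse(alpha, residual, params)
```

In this sketch the step is a rotation followed by a shift of the remaining coordinates that depends only on `alpha`, so it can be undone exactly. On the toy data, `residual` should have much lower variance than the second PCA component, because the curvature has been absorbed by the fitted polynomial.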
-
Image Denoising with Kernels based on Natural Image Relations
Authors:
Valero Laparra,
Juan Gutiérrez,
Gustavo Camps-Valls,
Jesús Malo
Abstract:
A successful class of image denoising methods is based on Bayesian approaches working in wavelet representations. However, analytical estimates can be obtained only for particular combinations of analytical models of signal and noise, thus precluding their straightforward extension to other, arbitrary noise sources. In this paper, we propose an alternative non-explicit way to take into account the relations among natural image wavelet coefficients for denoising: we use support vector regression (SVR) in the wavelet domain to enforce these relations in the estimated signal. Since relations among the coefficients are specific to the signal, the regularization property of SVR is exploited to remove the noise, which does not share this feature. The specific signal relations are encoded in an anisotropic kernel obtained from mutual information measures computed on a representative image database. Training minimizes the Kullback-Leibler divergence (KLD) between the estimated and actual probability functions of signal and noise in order to enforce similarity. Due to its non-parametric nature, the method can potentially cope with different noise sources without the need for an explicit reformulation, as is strictly necessary under parametric Bayesian formalisms. Results under several noise levels and noise sources show that: (1) the proposed method outperforms conventional wavelet methods that assume coefficient independence, (2) it performs similarly to state-of-the-art methods that do explicitly include these relations when the noise source is Gaussian, and (3) it gives better numerical and visual performance when more complex, realistic noise sources are considered. Therefore, the proposed machine learning approach can be seen as a more flexible (model-free) alternative to the explicit description of wavelet coefficient relations for image denoising.
Submitted 31 January, 2016;
originally announced February 2016.