A Classification of Artificial Intelligence Systems for Mathematics Education

Authors: Steven Van Vaerenbergh, Adrián Pérez-Suay

Abstract: This chapter provides an overview of the different Artificial Intelligence (AI) systems that are being used in contemporary digital tools for Mathematics Education (ME). It is aimed at researchers in AI and Machine Learning (ML), for whom we shed some light on the specific technologies that are being used in educational applications; and at researchers in ME, for whom we clarify: i) what the possibilities of the current AI technologies are, ii) what is still out of reach and iii) what is to be expected in the near future. We start our analysis by establishing a high-level taxonomy of AI tools that are found as components in digital ME applications. Then, we describe in detail how these AI tools, and in particular ML, are being used in two key applications, specifically AI-based calculators and intelligent tutoring systems. We finish the chapter with a discussion about student modeling systems and their relationship to artificial general intelligence.

Submitted 20 October, 2021; v1 submitted 13 July, 2021; originally announced July 2021.

Comments: Chapter in the upcoming book "Mathematics Education in the Age of Artificial Intelligence: How Artificial Intelligence can serve Mathematical Human Learning", Springer Nature, edited by P. R. Richard, P. Vélez, and S. Van Vaerenbergh

arXiv:2012.14303 [pdf, other]

Causal Inference in Geosciences with Kernel Sensitivity Maps

Authors: Adrián Pérez-Suay, Gustau Camps-Valls

Abstract: Establishing causal relations between random variables from observational data is perhaps the most important challenge in today's Science. In remote sensing and geosciences this is of special relevance to better understand the Earth's system and the complex and elusive interactions between processes. In this paper we explore a framework to derive cause-effect relations from pairs of variables via regression and dependence estimation. We propose to focus on the sensitivity (curvature) of the dependence estimator to account for the asymmetry of the forward and inverse densities of approximation residuals. Results in a large collection of 28 geoscience causal inference problems demonstrate the good capabilities of the method.

Submitted 7 December, 2020; originally announced December 2020.

Comments: arXiv admin note: substantial text overlap with arXiv:1611.00555, arXiv:2012.05150

arXiv:2012.12308 [pdf, other]

Randomized RX for target detection

Authors: Fatih Nar, Adrián Pérez-Suay, José Antonio Padrón, Gustau Camps-Valls

Abstract: This work tackles the target detection problem through the well-known global RX method. The RX method models the clutter as a multivariate Gaussian distribution, and has been extended to nonlinear distributions using kernel methods. While the kernel RX can cope with complex clutters, it requires a considerable amount of computational resources as the number of clutter pixels gets larger. Here we propose random Fourier features to approximate the Gaussian kernel in kernel RX; consequently, our development keeps the accuracy of the nonlinearity while reducing the computational cost, which is now controlled by a hyperparameter. Results over both synthetic and real-world image target detection problems show the space and time efficiency of the proposed method while providing high detection performance.

Submitted 8 December, 2020; originally announced December 2020.

arXiv:2012.12307 [pdf, other]

Nonlinear Cook distance for Anomalous Change Detection

Authors: José A. Padrón Hidalgo, Adrián Pérez-Suay, Fatih Nar, Gustau Camps-Valls

Abstract: In this work we propose a method to find anomalous changes in remote sensing images based on the chronochrome approach. A regressor between images is used to discover the most influential points in the observed data. Typically, the pixels with largest residuals are decided to be anomalous changes. In order to find the anomalous pixels we consider the Cook distance and propose its nonlinear extension using random Fourier features as an efficient nonlinear measure of impact. Good empirical performance is shown over different multispectral images both visually and quantitatively evaluated with ROC curves.

Submitted 8 December, 2020; originally announced December 2020.

arXiv:2012.12306 [pdf, other]

Pattern Recognition Scheme for Large-Scale Cloud Detection over Landmarks

Authors: Adrián Pérez-Suay, Julia Amorós-López, Luis Gómez-Chova, Jordi Muñoz-Marí, Dieter Just, Gustau Camps-Valls

Abstract: Landmark recognition and matching is a critical step in many Image Navigation and Registration (INR) models for geostationary satellite services, as well as to maintain the geometric quality assessment (GQA) in the instrument data processing chain of Earth observation satellites. Matching the landmark accurately is of paramount relevance, and the process can be strongly impacted by the cloud contamination of a given landmark. This paper introduces a complete pattern recognition methodology able to detect the presence of clouds over landmarks using Meteosat Second Generation (MSG) data. The methodology is based on the ensemble combination of dedicated support vector machines (SVMs) dependent on the particular landmark and illumination conditions. This divide-and-conquer strategy is motivated by the data complexity and follows a physically-based strategy that considers variability both in seasonality and illumination conditions along the day to split observations. In addition, it allows training the classification scheme with millions of samples at an affordable computational cost. The image archive was composed of 200 landmark test sites with nearly 7 million multispectral images that correspond to MSG acquisitions during 2010. Results are analyzed in terms of cloud detection accuracy and computational cost. We provide illustrative source code and a portion of the huge training data to the community.

Submitted 8 December, 2020; originally announced December 2020.

arXiv:2012.12105 [pdf, other]

Warped Gaussian Processes in Remote Sensing Parameter Estimation and Causal Inference

Authors: Anna Mateo-Sanchis, Jordi Muñoz-Marí, Adrián Pérez-Suay, Gustau Camps-Valls

Abstract: This paper introduces warped Gaussian processes (WGP) regression in remote sensing applications. WGP models output observations as a parametric nonlinear transformation of a GP. The parameters of such prior model are then learned via standard maximum likelihood. We show the good performance of the proposed model for the estimation of oceanic chlorophyll content from multispectral data, vegetation parameters (chlorophyll, leaf area index, and fractional vegetation cover) from hyperspectral data, and in the detection of the causal direction in a collection of 28 bivariate geoscience and remote sensing causal problems. The model consistently performs better than the standard GP and the more advanced heteroscedastic GP model, both in terms of accuracy and more sensible confidence intervals.

Submitted 9 December, 2020; originally announced December 2020.

arXiv:2012.10393 [pdf, other]

A deep network approach to multitemporal cloud detection

Authors: Devis Tuia, Benjamin Kellenberger, Adrian Pérez-Suay, Gustau Camps-Valls

Abstract: We present a deep learning model with temporal memory to detect clouds in image time series acquired by the Seviri imager mounted on the Meteosat Second Generation (MSG) satellite. The model provides pixel-level cloud maps with related confidence and propagates information in time via a recurrent neural network structure. With a single model, we are able to outline clouds throughout the year and during day and night with high accuracy.

Submitted 9 December, 2020; originally announced December 2020.

arXiv:2012.06377 [pdf, other]

Nonlinear Distribution Regression for Remote Sensing Applications

Authors: Jose E. Adsuara, Adrián Pérez-Suay, Jordi Muñoz-Marí, Anna Mateo-Sanchis, Maria Piles, Gustau Camps-Valls

Abstract: In many remote sensing applications one wants to estimate variables or parameters of interest from observations. When the target variable is available at a resolution that matches the remote sensing observations, standard algorithms such as neural networks, random forests or Gaussian processes are readily available to relate the two. However, we often encounter situations where the target variable is only available at the group level, i.e. collectively associated to a number of remotely sensed observations. This problem setting is known in statistics and machine learning as multiple instance learning or distribution regression. This paper introduces a nonlinear (kernel-based) method for distribution regression that solves the previous problems without making any assumption on the statistics of the grouped data. The presented formulation considers distribution embeddings in reproducing kernel Hilbert spaces, and performs standard least squares regression with the empirical means therein. A flexible version to deal with multisource data of different dimensionality and sample sizes is also presented and evaluated. It allows working with the native spatial resolution of each sensor, avoiding the need of match-up procedures. Noting the large computational cost of the approach, we introduce an efficient version via random Fourier features to cope with millions of points and groups.

Submitted 7 December, 2020; originally announced December 2020.

arXiv:2012.05799 [pdf, ps, other]

doi 10.1109/LGRS.2020.2970582

Efficient Nonlinear RX Anomaly Detectors

Authors: José A. Padrón Hidalgo, Adrián Pérez-Suay, Fatih Nar, Gustau Camps-Valls

Abstract: Current anomaly detection algorithms are typically challenged by either accuracy or efficiency. More accurate nonlinear detectors are typically slow and not scalable. In this letter, we propose two families of techniques to improve the efficiency of the standard kernel Reed-Xiaoli (RX) method for anomaly detection by approximating the kernel function with either data-independent random Fourier features or a data-dependent basis with the Nyström approach. We compare all methods for both real multi- and hyperspectral images. We show that the proposed efficient methods have a lower computational cost and that they perform similarly to (or outperform) the standard kernel RX algorithm thanks to their implicit regularization effect. Last but not least, the Nyström approach has an improved power of detection.

Submitted 7 December, 2020; originally announced December 2020.

arXiv:2012.05150 [pdf, other]

Causal Inference in Geoscience and Remote Sensing from Observational Data

Authors: Adrián Pérez-Suay, Gustau Camps-Valls

Abstract: Establishing causal relations between random variables from observational data is perhaps the most important challenge in today's science. In remote sensing and geosciences this is of special relevance to better understand the Earth's system and the complex interactions between the governing processes. In this paper, we focus on observational causal inference, thus we try to estimate the correct direction of causation using a finite set of empirical data. In addition, we focus on the more complex bivariate scenario, which requires strong assumptions and in which no conditional independence tests can be used. In particular, we explore the framework of (non-deterministic) additive noise models, which relies on the principle of independence between the cause and the generating mechanism. A practical algorithmic instantiation of such principle only requires 1) two regression models in the forward and backward directions, and 2) the estimation of statistical independence between the obtained residuals and the observations. The direction leading to more independent residuals is decided to be the cause. We instead propose a criterion that uses the sensitivity (derivative) of the dependence estimator. The sensitivity criterion allows identifying the samples that most affect the dependence measure, and hence the criterion is robust to spurious detections.
We illustrate performance in a collection of 28 geoscience causal inference problems, in a database of radiative transfer models simulations and machine learning emulators in vegetation parameter modeling involving 182 problems, and in assessing the impact of different regression models in a carbon cycle problem. The criterion achieves state-of-the-art detection rates in all cases and is generally robust to noise sources and distortions.

Submitted 7 December, 2020; originally announced December 2020.

arXiv:2012.04922 [pdf, other]

Consistent regression of biophysical parameters with kernel methods

Authors: Emiliano Díaz, Adrián Pérez-Suay, Valero Laparra, Gustau Camps-Valls

Abstract: This paper introduces a novel statistical regression framework that allows the incorporation of consistency constraints. A linear and nonlinear (kernel-based) formulation are introduced, and both imply closed-form analytical solutions. The models exploit all the information from a set of drivers while being maximally independent of a set of auxiliary, protected variables. We successfully illustrate the performance in the estimation of chlorophyll content.

Submitted 9 December, 2020; originally announced December 2020.

Comments: arXiv admin note: substantial text overlap with arXiv:1710.05578

arXiv:2012.03630 [pdf, other]

Randomized kernels for large scale Earth observation applications

Authors: Adrián Pérez-Suay, Julia Amorós-López, Luis Gómez-Chova, Valero Laparra, Jordi Muñoz-Marí, Gustau Camps-Valls

Abstract: Dealing with land cover classification of the new image sources has also turned out to be a complex problem requiring a large amount of memory and processing time. In order to cope with these problems, statistical learning has greatly helped in the last years to develop statistical retrieval and classification models that can ingest large amounts of Earth observation data. Kernel methods constitute a family of powerful machine learning algorithms, which have found wide use in remote sensing and geosciences. However, kernel methods are still not widely adopted because of the high computational cost when dealing with large scale problems, such as the inversion of radiative transfer models or the classification of high spatial-spectral-temporal resolution data. This paper introduces an efficient kernel method for fast statistical retrieval of bio-geo-physical parameters and image classification problems. The method approximates a kernel matrix with a set of projections on random bases sampled from the Fourier domain. The method is simple, computationally very efficient in both memory and processing costs, and easily parallelizable. We show that kernel regression and classification are now possible for datasets with millions of examples and high dimensionality.
Examples on atmospheric parameter retrieval from hyperspectral infrared sounders like IASI/Metop; large scale emulation and inversion of the familiar PROSAIL radiative transfer model on Sentinel-2 data; and the identification of clouds over landmarks in time series of MSG/Seviri images show the efficiency and effectiveness of the proposed technique.

Submitted 7 December, 2020; originally announced December 2020.

arXiv:2010.09031 [pdf, other]

Living in the Physics and Machine Learning Interplay for Earth Observation

Authors: Gustau Camps-Valls, Daniel H. Svendsen, Jordi Cortés-Andrés, Álvaro Moreno-Martínez, Adrián Pérez-Suay, Jose Adsuara, Irene Martín, Maria Piles, Jordi Muñoz-Marí, Luca Martino

Abstract: Most problems in Earth sciences aim to make inferences about the system, where accurate predictions are just a tiny part of the whole problem. Inferences mean understanding the relations between variables and deriving models that are physically interpretable, simple, parsimonious, and mathematically tractable. Machine learning models alone are excellent approximators, but very often do not respect the most elementary laws of physics, like mass or energy conservation, so consistency and confidence are compromised. In this paper, we describe the main challenges ahead in the field, and introduce several ways to live in the Physics and machine learning interplay: to encode differential equations from data, constrain data-driven models with physics-priors and dependence constraints, improve parameterizations, emulate physical models, and blend data-driven and process-based models. This is a collective long-term AI agenda towards developing and applying algorithms capable of discovering knowledge in the Earth system.

Submitted 18 October, 2020; originally announced October 2020.

Comments: 24 pages, 10 figures, 3 tables, expanded AAAI PGAI 2020 Symposium

arXiv:2007.14706 [pdf, other]

doi 10.1371/journal.pone.0235885

Kernel Methods and their derivatives: Concept and perspectives for the Earth system sciences

Authors: J. Emmanuel Johnson, Valero Laparra, Adrián Pérez-Suay, Miguel D. Mahecha, Gustau Camps-Valls

Abstract: Kernel methods are powerful machine learning techniques which implement generic non-linear functions to solve complex tasks in a simple way. They have a solid mathematical background and exhibit excellent performance in practice. However, kernel machines are still considered black-box models as the feature mapping is not directly accessible and difficult to interpret. The aim of this work is to show that it is indeed possible to interpret the functions learned by various kernel methods despite their complexity. Specifically, we show that derivatives of these functions have a simple mathematical formulation, are easy to compute, and can be applied to many different problems. We note that model function derivatives in kernel machines are proportional to the kernel function derivative. We provide the explicit analytic form of the first and second derivatives of the most common kernel functions with regard to the inputs as well as generic formulas to compute higher order derivatives. We use them to analyze the most used supervised and unsupervised kernel learning methods: Gaussian Processes for regression, Support Vector Machines for classification, Kernel Entropy Component Analysis for density estimation, and the Hilbert-Schmidt Independence Criterion for estimating the dependency between random variables. For all cases we express the derivative of the learned function as a linear combination of the kernel function derivative.
Moreover we provide intuitive explanations through illustrative toy examples and show how to improve the interpretation of real applications in the context of spatiotemporal Earth system data cubes. This work reflects on the observation that function derivatives may play a crucial role in kernel methods analysis and understanding.

Submitted 5 October, 2020; v1 submitted 29 July, 2020; originally announced July 2020.

Comments: 21 pages, 10 figures, PLOS One Journal

arXiv:1911.04322 [pdf, other]

Kernel Dependence Regularizers and Gaussian Processes with Applications to Algorithmic Fairness

Authors: Zhu Li, Adrian Perez-Suay, Gustau Camps-Valls, Dino Sejdinovic

Abstract: Current adoption of machine learning in industrial, societal and economical activities has raised concerns about the fairness, equity and ethics of automated decisions. Predictive models are often developed using biased datasets and thus retain or even exacerbate biases in their decisions and recommendations. Removing the sensitive covariates, such as gender or race, is insufficient to remedy this issue since the biases may be retained due to other related covariates. We present a regularization approach to this problem that trades off predictive accuracy of the learned models (with respect to biased labels) for the fairness in terms of statistical parity, i.e. independence of the decisions from the sensitive covariates. In particular, we consider a general framework of regularized empirical risk minimization over reproducing kernel Hilbert spaces and impose an additional regularizer of dependence between predictors and sensitive covariates using kernel-based measures of dependence, namely the Hilbert-Schmidt Independence Criterion (HSIC) and its normalized version. This approach leads to a closed-form solution in the case of squared loss, i.e. ridge regression. Moreover, we show that the dependence regularizer has an interpretation as modifying the corresponding Gaussian process (GP) prior. As a consequence, a GP model with a prior that encourages fairness to sensitive variables can be derived, allowing principled hyperparameter selection and studying of the relative relevance of covariates under fairness constraints.
Experimental results in synthetic examples and in real problems of income and crime prediction illustrate the potential of the approach to improve fairness of automated decisions.

Submitted 11 November, 2019; originally announced November 2019.

arXiv:1710.00575 [pdf, other]

doi 10.1109/TGRS.2017.2758922

Remote Sensing Image Classification with Large Scale Gaussian Processes

Authors: Pablo Morales-Alvarez, Adrian Perez-Suay, Rafael Molina, Gustau Camps-Valls

Abstract: Current remote sensing image classification problems have to deal with an unprecedented amount of heterogeneous and complex data sources. Upcoming missions will soon provide large data streams that will make land cover/use classification difficult. Machine learning classifiers can help with this, and many methods are currently available. A popular kernel classifier is the Gaussian process classifier (GPC), since it approaches the classification problem with a solid probabilistic treatment, thus yielding confidence intervals for the predictions as well as results that are very competitive with state-of-the-art neural networks and support vector machines. However, its computational cost is prohibitive for large scale applications, and constitutes the main obstacle precluding wide adoption. This paper tackles this problem by introducing two novel efficient methodologies for Gaussian Process (GP) classification. We first include the standard random Fourier features approximation into GPC, which largely decreases its computational cost and permits large scale remote sensing image classification. In addition, we propose a model which avoids randomly sampling a number of Fourier frequencies, and alternatively learns the optimal ones within a variational Bayes approach. The performance of the proposed methods is illustrated in complex problems of cloud detection from multispectral imagery and infrared sounding data. Excellent empirical results support the proposal in both computational cost and accuracy.

Submitted 3 October, 2017; v1 submitted 2 October, 2017; originally announced October 2017.

Comments: 11 pages, 6 figures, Accepted for publication in IEEE Transactions on Geoscience and Remote Sensing; added the IEEE copyright statement

Showing 1–16 of 16 results for author: Perez-Suay, A