-
Cross-sensor super-resolution of irregularly sampled Sentinel-2 time series
Authors:
Aimi Okabayashi,
Nicolas Audebert,
Simon Donike,
Charlotte Pelletier
Abstract:
Satellite imaging generally presents a trade-off between the frequency of acquisitions and the spatial resolution of the images. Super-resolution is often advanced as a way to get the best of both worlds. In this work, we investigate multi-image super-resolution of satellite image time series, i.e. how multiple images of the same area acquired at different dates can help reconstruct a higher resol…
▽ More
Satellite imaging generally presents a trade-off between the frequency of acquisitions and the spatial resolution of the images. Super-resolution is often advanced as a way to get the best of both worlds. In this work, we investigate multi-image super-resolution of satellite image time series, i.e. how multiple images of the same area acquired at different dates can help reconstruct a higher resolution observation. In particular, we extend state-of-the-art deep single and multi-image super-resolution algorithms, such as SRDiff and HighRes-net, to deal with irregularly sampled Sentinel-2 time series. We introduce BreizhSR, a new dataset for 4x super-resolution of Sentinel-2 time series using very high-resolution SPOT-6 imagery of Brittany, a French region. We show that using multiple images significantly improves super-resolution performance, and that a well-designed temporal positional encoding allows us to perform super-resolution for different times of the series. In addition, we observe a trade-off between spectral fidelity and perceptual quality of the reconstructed HR images, questioning future directions for super-resolution of Earth Observation data.
△ Less
Submitted 25 April, 2024;
originally announced April 2024.
-
Deep Learning for Satellite Image Time Series Analysis: A Review
Authors:
Lynn Miller,
Charlotte Pelletier,
Geoffrey I. Webb
Abstract:
Earth observation (EO) satellite missions have been providing detailed images about the state of the Earth and its land cover for over 50 years. Long term missions, such as NASA's Landsat, Terra, and Aqua satellites, and more recently, the ESA's Sentinel missions, record images of the entire world every few days. Although single images provide point-in-time data, repeated images of the same area,…
▽ More
Earth observation (EO) satellite missions have been providing detailed images about the state of the Earth and its land cover for over 50 years. Long term missions, such as NASA's Landsat, Terra, and Aqua satellites, and more recently, the ESA's Sentinel missions, record images of the entire world every few days. Although single images provide point-in-time data, repeated images of the same area, or satellite image time series (SITS) provide information about the changing state of vegetation and land use. These SITS are useful for modeling dynamic processes and seasonal changes such as plant phenology. They have potential benefits for many aspects of land and natural resource management, including applications in agricultural, forest, water, and disaster management, urban planning, and mining. However, the resulting satellite image time series (SITS) are complex, incorporating information from the temporal, spatial, and spectral dimensions. Therefore, deep learning methods are often deployed as they can analyze these complex relationships. This review presents a summary of the state-of-the-art methods of modelling environmental, agricultural, and other Earth observation variables from SITS data using deep learning methods. We aim to provide a resource for remote sensing experts interested in using deep learning techniques to enhance Earth observation models with temporal information.
△ Less
Submitted 11 April, 2024; v1 submitted 5 April, 2024;
originally announced April 2024.
-
Subtle Misogyny Detection and Mitigation: An Expert-Annotated Dataset
Authors:
Brooklyn Sheppard,
Anna Richter,
Allison Cohen,
Elizabeth Allyn Smith,
Tamara Kneese,
Carolyne Pelletier,
Ioana Baldini,
Yue Dong
Abstract:
Using novel approaches to dataset development, the Biasly dataset captures the nuance and subtlety of misogyny in ways that are unique within the literature. Built in collaboration with multi-disciplinary experts and annotators themselves, the dataset contains annotations of movie subtitles, capturing colloquial expressions of misogyny in North American film. The dataset can be used for a range of…
▽ More
Using novel approaches to dataset development, the Biasly dataset captures the nuance and subtlety of misogyny in ways that are unique within the literature. Built in collaboration with multi-disciplinary experts and annotators themselves, the dataset contains annotations of movie subtitles, capturing colloquial expressions of misogyny in North American film. The dataset can be used for a range of NLP tasks, including classification, severity score regression, and text generation for rewrites. In this paper, we discuss the methodology used, analyze the annotations obtained, and provide baselines using common NLP algorithms in the context of misogyny detection and mitigation. We hope this work will promote AI for social good in NLP for bias detection, explanation, and removal.
△ Less
Submitted 15 November, 2023;
originally announced November 2023.
-
Match-And-Deform: Time Series Domain Adaptation through Optimal Transport and Temporal Alignment
Authors:
François Painblanc,
Laetitia Chapel,
Nicolas Courty,
Chloé Friguet,
Charlotte Pelletier,
Romain Tavenard
Abstract:
While large volumes of unlabeled data are usually available, associated labels are often scarce. The unsupervised domain adaptation problem aims at exploiting labels from a source domain to classify data from a related, yet different, target domain. When time series are at stake, new difficulties arise as temporal shifts may appear in addition to the standard feature distribution shift. In this pa…
▽ More
While large volumes of unlabeled data are usually available, associated labels are often scarce. The unsupervised domain adaptation problem aims at exploiting labels from a source domain to classify data from a related, yet different, target domain. When time series are at stake, new difficulties arise as temporal shifts may appear in addition to the standard feature distribution shift. In this paper, we introduce the Match-And-Deform (MAD) approach that aims at finding correspondences between the source and target time series while allowing temporal distortions. The associated optimization problem simultaneously aligns the series thanks to an optimal transport loss and the time stamps through dynamic time war**. When embedded into a deep neural network, MAD helps learning new representations of time series that both align the domains and maximize the discriminative power of the network. Empirical studies on benchmark datasets and remote sensing data demonstrate that MAD makes meaningful sample-to-sample pairing and time shift estimation, reaching similar or better classification performance than state-of-the-art deep time series domain adaptation strategies.
△ Less
Submitted 25 August, 2023; v1 submitted 24 August, 2023;
originally announced August 2023.
-
Generalized Classification of Satellite Image Time Series with Thermal Positional Encoding
Authors:
Joachim Nyborg,
Charlotte Pelletier,
Ira Assent
Abstract:
Large-scale crop type classification is a task at the core of remote sensing efforts with applications of both economic and ecological importance. Current state-of-the-art deep learning methods are based on self-attention and use satellite image time series (SITS) to discriminate crop types based on their unique growth patterns. However, existing methods generalize poorly to regions not seen durin…
▽ More
Large-scale crop type classification is a task at the core of remote sensing efforts with applications of both economic and ecological importance. Current state-of-the-art deep learning methods are based on self-attention and use satellite image time series (SITS) to discriminate crop types based on their unique growth patterns. However, existing methods generalize poorly to regions not seen during training mainly due to not being robust to temporal shifts of the growing season caused by variations in climate. To this end, we propose Thermal Positional Encoding (TPE) for attention-based crop classifiers. Unlike previous positional encoding based on calendar time (e.g. day-of-year), TPE is based on thermal time, which is obtained by accumulating daily average temperatures over the growing season. Since crop growth is directly related to thermal time, but not calendar time, TPE addresses the temporal shifts between different regions to improve generalization. We propose multiple TPE strategies, including learnable methods, to further improve results compared to the common fixed positional encodings. We demonstrate our approach on a crop classification task across four different European regions, where we obtain state-of-the-art generalization results.
△ Less
Submitted 14 June, 2022; v1 submitted 17 March, 2022;
originally announced March 2022.
-
TimeMatch: Unsupervised Cross-Region Adaptation by Temporal Shift Estimation
Authors:
Joachim Nyborg,
Charlotte Pelletier,
Sébastien Lefèvre,
Ira Assent
Abstract:
The recent developments of deep learning models that capture complex temporal patterns of crop phenology have greatly advanced crop classification from Satellite Image Time Series (SITS). However, when applied to target regions spatially different from the training region, these models perform poorly without any target labels due to the temporal shift of crop phenology between regions. Although va…
▽ More
The recent developments of deep learning models that capture complex temporal patterns of crop phenology have greatly advanced crop classification from Satellite Image Time Series (SITS). However, when applied to target regions spatially different from the training region, these models perform poorly without any target labels due to the temporal shift of crop phenology between regions. Although various unsupervised domain adaptation techniques have been proposed in recent years, no method explicitly learns the temporal shift of SITS and thus provides only limited benefits for crop classification. To address this, we propose TimeMatch, which explicitly accounts for the temporal shift for improved SITS-based domain adaptation. In TimeMatch, we first estimate the temporal shift from the target to the source region using the predictions of a source-trained model. Then, we re-train the model for the target region by an iterative algorithm where the estimated shift is used to generate accurate target pseudo-labels. Additionally, we introduce an open-access dataset for cross-region adaptation from SITS in four different regions in Europe. On our dataset, we demonstrate that TimeMatch outperforms all competing methods by 11% in average F1-score across five different adaptation scenarios, setting a new state-of-the-art in cross-region adaptation.
△ Less
Submitted 9 May, 2022; v1 submitted 4 November, 2021;
originally announced November 2021.
-
Elastic Similarity and Distance Measures for Multivariate Time Series
Authors:
Ahmed Shifaz,
Charlotte Pelletier,
Francois Petitjean,
Geoffrey I. Webb
Abstract:
This paper contributes multivariate versions of seven commonly used elastic similarity and distance measures for time series data analytics. Elastic similarity and distance measures are a class of similarity measures that can compensate for misalignments in the time axis of time series data. We adapt two existing strategies used in a multivariate version of the well-known Dynamic Time War** (DTW…
▽ More
This paper contributes multivariate versions of seven commonly used elastic similarity and distance measures for time series data analytics. Elastic similarity and distance measures are a class of similarity measures that can compensate for misalignments in the time axis of time series data. We adapt two existing strategies used in a multivariate version of the well-known Dynamic Time War** (DTW), namely, Independent and Dependent DTW, to these seven measures.
While these measures can be applied to various time series analysis tasks, we demonstrate their utility on multivariate time series classification using the nearest neighbor classifier. On 23 well-known datasets, we demonstrate that each of the measures but one achieves the highest accuracy relative to others on at least one dataset, supporting the value of develo** a suite of multivariate similarity and distance measures. We also demonstrate that there are datasets for which either the dependent versions of all measures are more accurate than their independent counterparts or vice versa. In addition, we also construct a nearest neighbor-based ensemble of the measures and show that it is competitive to other state-of-the-art single-strategy multivariate time series classifiers.
△ Less
Submitted 17 January, 2023; v1 submitted 19 February, 2021;
originally announced February 2021.
-
A Bayesian-inspired, deep learning-based, semi-supervised domain adaptation technique for land cover map**
Authors:
Benjamin Lucas,
Charlotte Pelletier,
Daniel Schmidt,
Geoffrey I. Webb,
François Petitjean
Abstract:
Land cover maps are a vital input variable to many types of environmental research and management. While they can be produced automatically by machine learning techniques, these techniques require substantial training data to achieve high levels of accuracy, which are not always available. One technique researchers use when labelled training data are scarce is domain adaptation (DA) -- where data…
▽ More
Land cover maps are a vital input variable to many types of environmental research and management. While they can be produced automatically by machine learning techniques, these techniques require substantial training data to achieve high levels of accuracy, which are not always available. One technique researchers use when labelled training data are scarce is domain adaptation (DA) -- where data from an alternate region, known as the source domain, are used to train a classifier and this model is adapted to map the study region, or target domain. The scenario we address in this paper is known as semi-supervised DA, where some labelled samples are available in the target domain. In this paper we present Sourcerer, a Bayesian-inspired, deep learning-based, semi-supervised DA technique for producing land cover maps from SITS data. The technique takes a convolutional neural network trained on a source domain and then trains further on the available target domain with a novel regularizer applied to the model weights. The regularizer adjusts the degree to which the model is modified to fit the target data, limiting the degree of change when the target data are few in number and increasing it as target data quantity increases. Our experiments on Sentinel-2 time series images compare Sourcerer with two state-of-the-art semi-supervised domain adaptation techniques and four baseline models. We show that on two different source-target domain pairings Sourcerer outperforms all other methods for any quantity of labelled target data available. In fact, the results on the more difficult target domain show that the starting accuracy of Sourcerer (when no labelled target data are available), 74.2%, is greater than the next-best state-of-the-art method trained on 20,000 labelled target instances.
△ Less
Submitted 10 March, 2021; v1 submitted 25 May, 2020;
originally announced May 2020.
-
InceptionTime: Finding AlexNet for Time Series Classification
Authors:
Hassan Ismail Fawaz,
Benjamin Lucas,
Germain Forestier,
Charlotte Pelletier,
Daniel F. Schmidt,
Jonathan Weber,
Geoffrey I. Webb,
Lhassane Idoumghar,
Pierre-Alain Muller,
François Petitjean
Abstract:
This paper brings deep learning at the forefront of research into Time Series Classification (TSC). TSC is the area of machine learning tasked with the categorization (or labelling) of time series. The last few decades of work in this area have led to significant progress in the accuracy of classifiers, with the state of the art now represented by the HIVE-COTE algorithm. While extremely accurate,…
▽ More
This paper brings deep learning at the forefront of research into Time Series Classification (TSC). TSC is the area of machine learning tasked with the categorization (or labelling) of time series. The last few decades of work in this area have led to significant progress in the accuracy of classifiers, with the state of the art now represented by the HIVE-COTE algorithm. While extremely accurate, HIVE-COTE cannot be applied to many real-world datasets because of its high training time complexity in O(N2 * T4) for a dataset with N time series of length T. For example, it takes HIVE-COTE more than 8 days to learn from a small dataset with N = 1500 time series of short length T = 46. Meanwhile deep learning has received enormous attention because of its high accuracy and scalability. Recent approaches to deep learning for TSC have been scalable, but less accurate than HIVE-COTE. We introduce InceptionTime - an ensemble of deep Convolutional Neural Network (CNN) models, inspired by the Inception-v4 architecture. Our experiments show that InceptionTime is on par with HIVE-COTE in terms of accuracy while being much more scalable: not only can it learn from 1,500 time series in one hour but it can also learn from 8M time series in 13 hours, a quantity of data that is fully out of reach of HIVE-COTE.
△ Less
Submitted 5 December, 2020; v1 submitted 11 September, 2019;
originally announced September 2019.
-
TS-CHIEF: A Scalable and Accurate Forest Algorithm for Time Series Classification
Authors:
Ahmed Shifaz,
Charlotte Pelletier,
Francois Petitjean,
Geoffrey I. Webb
Abstract:
Time Series Classification (TSC) has seen enormous progress over the last two decades. HIVE-COTE (Hierarchical Vote Collective of Transformation-based Ensembles) is the current state of the art in terms of classification accuracy. HIVE-COTE recognizes that time series data are a specific data type for which the traditional attribute-value representation, used predominantly in machine learning, fai…
▽ More
Time Series Classification (TSC) has seen enormous progress over the last two decades. HIVE-COTE (Hierarchical Vote Collective of Transformation-based Ensembles) is the current state of the art in terms of classification accuracy. HIVE-COTE recognizes that time series data are a specific data type for which the traditional attribute-value representation, used predominantly in machine learning, fails to provide a relevant representation. HIVE-COTE combines multiple types of classifiers: each extracting information about a specific aspect of a time series, be it in the time domain, frequency domain or summarization of intervals within the series. However, HIVE-COTE (and its predecessor, FLAT-COTE) is often infeasible to run on even modest amounts of data. For instance, training HIVE-COTE on a dataset with only 1,500 time series can require 8 days of CPU time. It has polynomial runtime with respect to the training set size, so this problem compounds as data quantity increases. We propose a novel TSC algorithm, TS-CHIEF (Time Series Combination of Heterogeneous and Integrated Embedding Forest), which rivals HIVE-COTE in accuracy but requires only a fraction of the runtime. TS-CHIEF constructs an ensemble classifier that integrates the most effective embeddings of time series that research has developed in the last decade. It uses tree-structured classifiers to do so efficiently. We assess TS-CHIEF on 85 datasets of the University of California Riverside (UCR) archive, where it achieves state-of-the-art accuracy with scalability and efficiency. We demonstrate that TS-CHIEF can be trained on 130k time series in 2 days, a data quantity that is beyond the reach of any TSC algorithm with comparable accuracy.
△ Less
Submitted 13 February, 2020; v1 submitted 25 June, 2019;
originally announced June 2019.
-
BreizhCrops: A Time Series Dataset for Crop Type Map**
Authors:
Marc Rußwurm,
Charlotte Pelletier,
Maximilian Zollner,
Sébastien Lefèvre,
Marco Körner
Abstract:
We present Breizhcrops, a novel benchmark dataset for the supervised classification of field crops from satellite time series. We aggregated label data and Sentinel-2 top-of-atmosphere as well as bottom-of-atmosphere time series in the region of Brittany (Breizh in local language), north-east France. We compare seven recently proposed deep neural networks along with a Random Forest baseline. The d…
▽ More
We present Breizhcrops, a novel benchmark dataset for the supervised classification of field crops from satellite time series. We aggregated label data and Sentinel-2 top-of-atmosphere as well as bottom-of-atmosphere time series in the region of Brittany (Breizh in local language), north-east France. We compare seven recently proposed deep neural networks along with a Random Forest baseline. The dataset, model (re-)implementations and pre-trained model weights are available at the associated GitHub repository (https://github.com/dl4sits/BreizhCrops) that has been designed with applicability for practitioners in mind. We plan to maintain the repository with additional data and welcome contributions of novel methods to build a state-of-the-art benchmark on methods for crop type map**.
△ Less
Submitted 10 May, 2020; v1 submitted 28 May, 2019;
originally announced May 2019.
-
Temporal Convolutional Neural Network for the Classification of Satellite Image Time Series
Authors:
Charlotte Pelletier,
Geoffrey I. Webb,
Francois Petitjean
Abstract:
New remote sensing sensors now acquire high spatial and spectral Satellite Image Time Series (SITS) of the world. These series of images are a key component of classification systems that aim at obtaining up-to-date and accurate land cover maps of the Earth's surfaces. More specifically, the combination of the temporal, spectral and spatial resolutions of new SITS makes possible to monitor vegetat…
▽ More
New remote sensing sensors now acquire high spatial and spectral Satellite Image Time Series (SITS) of the world. These series of images are a key component of classification systems that aim at obtaining up-to-date and accurate land cover maps of the Earth's surfaces. More specifically, the combination of the temporal, spectral and spatial resolutions of new SITS makes possible to monitor vegetation dynamics. Although traditional classification algorithms, such as Random Forest (RF), have been successfully applied for SITS classification, these algorithms do not make the most of the temporal domain. Conversely, some approaches that take into account the temporal dimension have recently been tested, especially Recurrent Neural Networks (RNNs). This paper proposes an exhaustive study of another deep learning approaches, namely Temporal Convolutional Neural Networks (TempCNNs) where convolutions are applied in the temporal dimension. The goal is to quantitatively and qualitatively evaluate the contribution of TempCNNs for SITS classification. This paper proposes a set of experiments performed on one million time series extracted from 46 Formosat-2 images. The experimental results show that TempCNNs are more accurate than RF and RNNs, that are the current state of the art for SITS classification. We also highlight some differences with results obtained in computer vision, e.g. about pooling layers. Moreover, we provide some general guidelines on the network architecture, common regularization mechanisms, and hyper-parameter values such as batch size. Finally, we assess the visual quality of the land cover maps produced by TempCNNs.
△ Less
Submitted 30 January, 2019; v1 submitted 25 November, 2018;
originally announced November 2018.
-
Proximity Forest: An effective and scalable distance-based classifier for time series
Authors:
Benjamin Lucas,
Ahmed Shifaz,
Charlotte Pelletier,
Lachlan O'Neill,
Nayyar Zaidi,
Bart Goethals,
Francois Petitjean,
Geoffrey I. Webb
Abstract:
Research into the classification of time series has made enormous progress in the last decade. The UCR time series archive has played a significant role in challenging and guiding the development of new learners for time series classification. The largest dataset in the UCR archive holds 10 thousand time series only; which may explain why the primary research focus has been in creating algorithms…
▽ More
Research into the classification of time series has made enormous progress in the last decade. The UCR time series archive has played a significant role in challenging and guiding the development of new learners for time series classification. The largest dataset in the UCR archive holds 10 thousand time series only; which may explain why the primary research focus has been in creating algorithms that have high accuracy on relatively small datasets.
This paper introduces Proximity Forest, an algorithm that learns accurate models from datasets with millions of time series, and classifies a time series in milliseconds. The models are ensembles of highly randomized Proximity Trees. Whereas conventional decision trees branch on attribute values (and usually perform poorly on time series), Proximity Trees branch on the proximity of time series to one exemplar time series or another; allowing us to leverage the decades of work into develo** relevant measures for time series. Proximity Forest gains both efficiency and accuracy by stochastic selection of both exemplars and similarity measures.
Our work is motivated by recent time series applications that provide orders of magnitude more time series than the UCR benchmarks. Our experiments demonstrate that Proximity Forest is highly competitive on the UCR archive: it ranks among the most accurate classifiers while being significantly faster. We demonstrate on a 1M time series Earth observation dataset that Proximity Forest retains this accuracy on datasets that are many orders of magnitude greater than those in the UCR repository, while learning its models at least 100,000 times faster than current state of the art models Elastic Ensemble and COTE.
△ Less
Submitted 12 December, 2018; v1 submitted 31 August, 2018;
originally announced August 2018.