ASSESSMENT OF SENTINEL-2 SPATIAL AND TEMPORAL COVERAGE BASED ON THE SCENE CLASSIFICATION LAYER

Abstract

Since the launch of the Sentinel-2 (S2) satellites, many machine learning (ML) models have used the data for diverse applications. The scene classification layer (SCL) inside the S2 product provides rich information for training, for example to filter out images with high cloud coverage. However, the SCL has more potential than this. We propose a technique to assess the clean optical coverage of a region, expressed by a satellite image time series (SITS) and calculated with the S2-based SCL data. With a manual threshold and specific labels in the SCL, the proposed technique assigns a percentage of spatial and temporal coverage across the time series and a high/low assessment. By evaluating the AI4EO challenge for Enhanced Agriculture, we show that the assessment is correlated with the predictive results of ML models. The classification results in a region with low spatial and temporal coverage are worse than in a region with high coverage. Finally, we apply the technique across all continents of the global dataset LandCoverNet.

Index Terms— Sentinel-2, Optical Coverage, Satellite Image Time Series, Machine Learning

Copyright 2024 IEEE. Published in the 2024 IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2024), scheduled for 7 - 12 July, 2024 in Athens, Greece. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works, must be obtained from the IEEE. Contact: Manager, Copyrights and Permissions / IEEE Service Center / 445 Hoes Lane / P.O. Box 1331 / Piscataway, NJ 08855-1331, USA. Telephone: + Intl. 908-562-3966.

1 INTRODUCTION

Optical observations are the fuel of many remote sensing (RS)-based applications. These images rely on passive observation that is affected by distinct factors, such as clouds, haze, cloud shadow, and snow [1].

S2-based optical data has played a key role in different research fields related to land-cover and land-use mapping with machine learning (ML) over the last decade. The inclusion of the SCL has been crucial for filtering out images with a high presence of clouds [2, 3]. The SCL is an S2 product that provides an estimated scene class for each pixel in the paired S2 image, at a 10-meter pixel resolution. In this work, we present an assessment based on the information contained in the SCL data. The research question that drives us is: how much clean (e.g. cloud-free) data in a time series are we actually feeding into ML models? Our first idea was to inspect the cloud coverage of each satellite image in a time series. However, we decided to present a more general assessment based on user-defined labels in the SCL. The proposed assessment calculates the spatial and temporal coverage of a sample region expressed by a satellite image time series (SITS), concretely from the SCL inside the S2 data.

We evaluate the relation between the spatial and temporal coverage and the classification results of a random forest (RF) trained on the AI4EO Enhanced Agriculture dataset. The RF model is trained in a pixel-wise manner with neighborhood information [3]. We found that the classification results are worse in sample regions with cloudy conditions. In addition, we calculated the spatial and temporal coverage on the recent LandCoverNet global dataset [4] and obtained a distribution of clean coverage per continent in the year 2018 based on the S2 data. This coverage calculation could be useful for researchers interested in assessing the quality of SITS and understanding prediction differences between regions. We provide an evaluation with the SCL inside the S2 data, but it could be reproduced with any other scene classification mask. The code and the resulting assessment can be found at https://github.com/fmenat/SITS_S2Coverage.

This paper is organized as follows. The background is presented in Sec. 2, followed by the proposed assessment in Sec. 3. In Sec. 4, we show the results of the assessment in two datasets. Finally, Sec. 5 provides the conclusion of our work.

Table 1: Possible labels in the S2-based SCL data.

Tag	Name
0	No Data
1	Saturated or Defective
2	Dark Area Pixels
3	Cloud Shadows
4	Vegetation
5	Not Vegetated
6	Water
7	Unclassified
8	Cloud Medium
9	Cloud High
10	Thin Cirrus
11	Snow

2 BACKGROUND AND RELATED WORK

Thanks to the European Space Agency and the Copernicus S2 mission, it is possible to access optical satellite image data. S2 is a constellation of two sun-synchronous satellites that are placed along the same orbit and separated by 180 degrees. This allows access to sunlit images of the same location on Earth approximately every five days. S2-based optical images are composed of 13 bands with a spatial resolution varying from 10 to 60 meters. Alongside the optical images, an SCL generated by the Sen2Cor algorithm [5] is provided. The main purpose of this layer is to deliver a close estimation of what is present in each pixel of the paired optical image. The available classes are shown in Table 1.

Different factors affect the clean observation of optical images [1], such as clouds, haze, snow, anomalies, and errors. Indeed, some works have shown that ML models trained on optical data achieve low predictive performance in cloudy conditions [6, 7]. Ferrari et al. [7] categorize regions into three different cloud coverage conditions (low, medium, and high) for deforestation prediction with optical and radar SITS. In this manuscript, we generalize these previous analyses beyond cloud presence.

Consequently, the SCL has been used in multiple studies to filter out data belonging to cloud-related labels [2], water and snow [8], or defective pixels [9]. Another usage is the selection of pixels belonging to certain classes. In the AI4Boundaries work [10], the tags 2, 4, 5, 6, and 7 in the SCL (see Table 1) are considered clean input data for ML training.

3 ASSESSMENT OF COVERAGE AVAILABILITY

Regardless of the spatial and temporal resolutions of the satellite images, we use the following terminology. A sample region is a single sample zone that is used to define a region of interest. The data in a sample region can come from a single satellite image (SI) or from a SITS, i.e. an ordered collection of images at different times. Clean coverage refers to the spatio-temporal availability of data pixels belonging to specific classes. See Figure 2 for an illustration. In addition, consider for each sample region $i$ an optical SITS $\mathcal{X}^{(i)}$ (with $B$ bands) and its corresponding SCL data (also a SITS) $\mathcal{L}^{(i)}$, both with a size of $W\times H$ pixels and $T_{i}$ time steps:

$$\mathcal{X}^{(i)}=\left\{X_{1}^{(i)},X_{2}^{(i)},\ldots,X_{T_{i}}^{(i)}\right\},\ \text{where }X_{t,w,h}^{(i)}\in\mathbb{R}_{+}^{B}$$
$$\mathcal{L}^{(i)}=\left\{L_{1}^{(i)},L_{2}^{(i)},\ldots,L_{T_{i}}^{(i)}\right\},\ \text{where }L_{t,w,h}^{(i)}\in\left[0,11\right]$$

Given a set of labels $\mathcal{K}$, we define the spatial coverage $SC_{t}^{(i)}$ for an SI at time step $t$ in the sample region $i$ as the fraction of pixels in $L_{t}^{(i)}$ belonging to the set $\mathcal{K}$. In addition, we define the spatial coverage $SC^{(i)}$ for the whole SITS in a sample region $i$ as the average across the time series:

$$SC_{t}^{(i)}=\frac{1}{W\cdot H}\sum_{w=1}^{W}\sum_{h=1}^{H}\mathbbm{1}\left(L_{t,w,h}^{(i)}\in\mathcal{K}\right)\tag{1}$$
$$SC^{(i)}=\frac{1}{T_{i}}\sum_{t=1}^{T_{i}}SC_{t}^{(i)}\ ,\tag{2}$$

with $\mathbbm{1}$ the indicator function, which is $1$ when the condition inside holds and $0$ otherwise. Then, considering a threshold $SC_{\text{thresh}}$, we define a spatial coverage assessment (SCA) label for each sample region $i$,

$$\text{SCA}^{(i)}=\begin{cases}\text{high}&\text{if}\ SC^{(i)}\geq SC_{\text{thresh}}\\ \text{low}&\text{otherwise.}\end{cases}\tag{3}$$
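Equations (1) to (3) can be sketched in a few lines of Python. The following is a minimal illustration under stated assumptions, not the released implementation: the function names, the nested-list representation of the SCL series, and the toy values are hypothetical and chosen only for the example.

```python
from typing import Sequence, Set

# One SCL image is a grid of integer tags (Table 1); a SITS is a list of grids.
SCLImage = Sequence[Sequence[int]]


def spatial_coverage_per_image(scl_image: SCLImage, labels: Set[int]) -> float:
    """Eq. (1): fraction of pixels in one SCL image whose tag is in `labels`."""
    total = sum(len(row) for row in scl_image)
    clean = sum(1 for row in scl_image for tag in row if tag in labels)
    return clean / total


def spatial_coverage(scl_sits: Sequence[SCLImage], labels: Set[int]) -> float:
    """Eq. (2): average of the per-image spatial coverage over the time series."""
    per_image = [spatial_coverage_per_image(img, labels) for img in scl_sits]
    return sum(per_image) / len(per_image)


def spatial_coverage_assessment(sc: float, sc_thresh: float) -> str:
    """Eq. (3): 'high' if the coverage reaches the threshold, 'low' otherwise."""
    return "high" if sc >= sc_thresh else "low"


# Toy 2x2 sample region with three time steps (hypothetical values).
toy_sits = [
    [[4, 4], [5, 9]],  # 3 of 4 pixels belong to the label set
    [[9, 9], [9, 8]],  # fully cloudy
    [[4, 5], [4, 4]],  # fully clean
]
K = {4, 5}  # label set: vegetation and not vegetated
sc = spatial_coverage(toy_sits, K)
sca = spatial_coverage_assessment(sc, sc_thresh=0.5)
```

Nested lists keep the sketch dependency-free; for full-size tiles an array library would be the more natural choice.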

In addition, we define the temporal coverage $TC^{(i)}$ in a sample region $i$ as the fraction of time steps whose spatial coverage $SC_{t}^{(i)}$ reaches a threshold. Based on a threshold $TC_{\text{thresh}}$, we define a temporal coverage assessment (TCA) label for each sample region $i$,

$$TC^{(i)}=\frac{1}{T_{i}}\sum_{t=1}^{T_{i}}\mathbbm{1}\left(SC_{t}^{(i)}\geq TC_{\text{thresh}}\right)\tag{4}$$
$$\text{TCA}^{(i)}=\begin{cases}\text{high}&\text{if}\ TC^{(i)}\geq TC_{\text{thresh}}\\ \text{low}&\text{otherwise.}\end{cases}\tag{5}$$

Therefore, given the SCL-based SITS as input data, $\mathcal{L}^{(i)}$ , our technique obtains: $SC^{(i)}$ , $TC^{(i)}$ , $\text{SCA}^{(i)}$ , and $\text{TCA}^{(i)}$ . The final goal of the proposed technique is to rapidly get an assessment of the spatio-temporal availability in a region.
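The full assessment can be sketched as a single function that takes an SCL series and returns the four quantities. This is an illustrative sketch only: the name `assess_coverage`, the dictionary output, and the toy data are assumptions, and the temporal coverage follows Eq. (4) as written, i.e. the per-image spatial coverage is compared against `tc_thresh`.

```python
from typing import Dict, List, Sequence, Set, Union

SCLImage = Sequence[Sequence[int]]


def assess_coverage(
    scl_sits: Sequence[SCLImage],
    labels: Set[int],
    sc_thresh: float,
    tc_thresh: float,
) -> Dict[str, Union[float, str]]:
    """Return SC, TC, SCA and TCA for one sample region (hypothetical helper)."""
    # Per-image spatial coverage: fraction of pixels with a tag in `labels`.
    per_image: List[float] = []
    for image in scl_sits:
        pixels = [tag for row in image for tag in row]
        per_image.append(sum(1 for tag in pixels if tag in labels) / len(pixels))

    # Spatial coverage: average over the time series.
    sc = sum(per_image) / len(per_image)
    # Temporal coverage: fraction of time steps whose coverage reaches
    # the threshold, following Eq. (4) as written.
    tc = sum(1 for c in per_image if c >= tc_thresh) / len(per_image)

    return {
        "SC": sc,
        "TC": tc,
        "SCA": "high" if sc >= sc_thresh else "low",
        "TCA": "high" if tc >= tc_thresh else "low",
    }


# Toy 2x2 sample region with three time steps (hypothetical values).
toy_sits = [
    [[4, 4], [5, 9]],
    [[9, 9], [9, 8]],
    [[4, 5], [4, 4]],
]
result = assess_coverage(toy_sits, labels={4, 5}, sc_thresh=0.5, tc_thresh=0.5)
```

In this toy case two of the three images reach the threshold, so the temporal coverage is 2/3 and both assessments come out as high.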

4 EVALUATION AND APPLICATION

4.1 Evaluation: AI4EO Enhanced Agriculture

We evaluated the assessment on the AI4EO challenge on Enhanced Agriculture (platform.ai4eo.eu/enhanced-sentinel2-agriculture). The input data consists of S2-based optical SITS across the growing season in 2019 for Slovenia (March to September). The target data consists of binary masks (cultivated or not) at a higher spatial resolution (2.5 m) than S2. The 100 sample regions are SITS of size $500\times 500$ and variable length. First, we calculate the spatial and temporal coverage of the sample regions by considering two types of filters. The first, L-all-but-cloud, represents a cloud-removal filter, where all classes are selected for coverage except the cloud-related ones (3, 8, and 9). The second, L-veg-non-veg, represents a vegetation-related filter, where only classes 4 and 5 are selected for coverage. Figure 1 shows the coverage for these two filters. It can be seen that the L-veg-non-veg coverage is lower than the L-all-but-cloud coverage, since it excludes more label types from its calculation.
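The two filters are just different label sets $\mathcal{K}$. The sketch below writes them out with the tags of Table 1 and shows why the L-veg-non-veg coverage can never exceed the L-all-but-cloud coverage: the first set is contained in the second. The constant names, the helper `coverage`, and the toy tags are assumptions for illustration.

```python
from typing import FrozenSet, Iterable

ALL_LABELS: FrozenSet[int] = frozenset(range(12))        # tags 0..11 of Table 1
CLOUD_RELATED: FrozenSet[int] = frozenset({3, 8, 9})     # as listed in the text

# Cloud-removal filter: every tag except the cloud-related ones.
L_ALL_BUT_CLOUD: FrozenSet[int] = ALL_LABELS - CLOUD_RELATED
# Vegetation-related filter: vegetation (4) and not vegetated (5) only.
L_VEG_NON_VEG: FrozenSet[int] = frozenset({4, 5})


def coverage(tags: Iterable[int], labels: FrozenSet[int]) -> float:
    """Fraction of tags that belong to the selected label set."""
    tags = list(tags)
    return sum(1 for tag in tags if tag in labels) / len(tags)


# Flattened toy SCL image with eight pixels (hypothetical values).
toy_tags = [4, 5, 6, 2, 8, 9, 4, 11]
cov_all_but_cloud = coverage(toy_tags, L_ALL_BUT_CLOUD)
cov_veg_non_veg = coverage(toy_tags, L_VEG_NON_VEG)
```

Because `L_VEG_NON_VEG` is a subset of `L_ALL_BUT_CLOUD`, every pixel counted by the former is also counted by the latter.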

For prediction, we used the approach by Tarasiewicz et al. [3]. First, a bilinear interpolation is carried out to match the input data to the target data. Then, statistics are generated for each input pixel over a neighborhood of $25\times 25$. Finally, an RF model is used to predict the binary label of the central pixel based on the time series of the pixel neighborhood. The training is performed only on the central pixels of the original 10-meter-resolution image [3]. We use 80 sample regions for training and 20 for validation, and evaluate with the same metrics used in the challenge: Matthews Correlation Coefficient (MCC) and Accuracy (ACC).
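For reference, both metrics can be computed from the binary confusion counts. The sketch below uses the standard definitions of ACC and MCC; the function names and the toy label vectors are hypothetical, and returning 0 when the MCC denominator vanishes is a common convention adopted here as an assumption.

```python
import math
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, TN, FP, FN) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """ACC = (TP + TN) / number of pixels."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0 when the denominator is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy cultivated / not cultivated pixel labels (hypothetical values).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
acc = accuracy(y_true, y_pred)
mcc = matthews_corrcoef(y_true, y_pred)
```

Unlike ACC, MCC accounts for all four confusion counts, which makes it more informative when the cultivated and non-cultivated classes are imbalanced.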

Figure 2 shows the classification results of the sample regions grouped by the coverage assessment with the L-veg-non-veg filter (with the L-all-but-cloud filter, only one field was categorized as low). In regions with low spatial and temporal coverage, the classification results are clearly worse than in regions with high coverage.

Furthermore, Figure 3 illustrates the correlation between the accuracy and the spatio-temporal coverage with the L-all-but-cloud filter. Sample regions with a higher temporal coverage tend to be classified better.

4.2 Application: LandCoverNet

In addition, we applied the technique to the recent global dataset LandCoverNet [4]. There are sample regions from all continents: 1980 in Africa, 2753 in Asia, 600 in Australia, 840 in Europe, 1561 in North America, and 1200 in South America. An S2-based SITS (optical and SCL data) is available across 2018 for each sample region. Each sample region corresponds to a SITS of size $256\times 256$ and variable length.

Figure 4 shows the spatial and temporal coverage with the L-all-but-cloud filter on each continent of the LandCoverNet dataset. As expected, each continent has different clean coverage patterns, with Australia having on average the highest cloudless coverage in 2018. In contrast, Europe and Asia are the regions with the cloudiest conditions on average. In addition, the coverage distribution across sample regions within each continent is quite different. With a threshold of 50%, the numbers of sample regions categorized with low TCA are: 230 in Africa, 435 in Asia, 38 in Australia, 255 in Europe, 46 in North America, and 124 in South America. However, as each continent has a different total number of sample regions, the percentage with low TCA is 12% in Africa, 16% in Asia, 6% in Australia, 30% in Europe, 3% in North America, and 10% in South America. While Asia has almost twice as many low-TCA sample regions as Africa or Europe, this count is small relative to Asia's total number of sample regions. Figure 4 also shows some outlier sample regions with very low coverage in different continents, with Asia and South America being the clearest cases. Surprisingly, Australia and Europe do not have outlier sample regions regarding the spatial and temporal coverage.
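The relative share per continent is the number of low-TCA sample regions divided by the total number of sample regions in that continent. A short sketch with the counts quoted in this section (the variable names are illustrative):

```python
# Total number of LandCoverNet sample regions per continent (Sec. 4.2).
totals = {
    "Africa": 1980,
    "Asia": 2753,
    "Australia": 600,
    "Europe": 840,
    "North America": 1561,
    "South America": 1200,
}

# Sample regions categorized with low TCA at a 50% threshold.
low_tca = {
    "Africa": 230,
    "Asia": 435,
    "Australia": 38,
    "Europe": 255,
    "North America": 46,
    "South America": 124,
}

# Share of low-TCA sample regions per continent, in percent.
low_tca_share = {
    continent: 100.0 * low_tca[continent] / totals[continent]
    for continent in totals
}

for continent, share in low_tca_share.items():
    print(f"{continent}: {share:.1f}% low TCA")
```

Normalizing by the continent totals is what makes the continents comparable, since the absolute counts mostly reflect how many sample regions each continent contributes.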

5 CONCLUSION

We proposed a technique to assess the amount of clean coverage in sample regions from S2-based SITS. The purpose is to assess the spatio-temporal clean coverage in a region of interest based on the SCL-based SITS contained in the S2 product. Our evaluation shows a positive correlation between the clean coverage and the predictive results of ML models trained with S2-based SITS. A potential direction for future research is curriculum learning, for instance by presenting the sample regions during training in order from the highest to the lowest coverage, or vice versa.
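As a rough illustration of the curriculum idea, the sample regions could be ordered by a coverage score before training. This is a speculative sketch rather than an evaluated method: the function name, the region identifiers, and the coverage values are hypothetical.

```python
from typing import Dict, List


def curriculum_order(coverage: Dict[str, float], highest_first: bool = True) -> List[str]:
    """Order sample-region identifiers by coverage (e.g. SC or TC)."""
    return sorted(coverage, key=lambda region: coverage[region], reverse=highest_first)


# Hypothetical temporal coverage values for three sample regions.
toy_coverage = {"region_a": 0.90, "region_b": 0.35, "region_c": 0.60}

clean_to_cloudy = curriculum_order(toy_coverage)
cloudy_to_clean = curriculum_order(toy_coverage, highest_first=False)
```

Which direction helps, if any, is an open question that would have to be tested empirically.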

Acknowledgement. F. Mena acknowledges the financial support from the chair of Prof. A. Dengel with RPTU.

References

[1] H Shen, X Li, Q Cheng, C Zeng, G Yang, H Li, and L Zhang, “Missing information reconstruction of remote sensing data: A technical review,” IEEE Geoscience and Remote Sensing Magazine, vol. 3, no. 3, pp. 61–85, 2015.
[2] T Hardy, L Kooistra, M Domingues, S Richter, E Vonk, G van den Eertwegh, and D Van Deijl, “Sen2grass: A cloud-based solution to generate field-specific grassland information derived from sentinel-2 imagery,” AgriEngineering, vol. 3, no. 1, pp. 118–137, 2021.
[3] T Tarasiewicz, L Tulczyjew, M Myller, M Kawulok, N Longépé, and J Nalepa, “Extracting high-resolution cultivated land maps from sentinel-2 image series,” in IGARSS 2022-2022 IEEE International Geoscience and Remote Sensing Symposium. IEEE, 2022, pp. 175–178.
[4] H Alemohammad and K Booth, “Landcovernet: A global benchmark land cover classification training dataset,” arXiv preprint arXiv:2012.03111, 2020.
[5] M Main-Knorn, B Pflug, J Louis, V Debaecker, U Müller-Wilm, and F Gascon, “Sen2cor for sentinel-2,” in Image and Signal Processing for Remote Sensing XXIII. SPIE, 2017, vol. 10427, pp. 37–48.
[6] VSF Garnot, L Landrieu, and N Chehata, “Multi-modal temporal attention models for crop mapping from satellite time series,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 187, pp. 294–305, 2022.
[7] F Ferrari, MP Ferreira, CA Almeida, and RQ Feitosa, “Fusing sentinel-1 and sentinel-2 images for deforestation detection under diverse cloud conditions,” IEEE Geoscience and Remote Sensing Letters, 2023.
[8] E Roteta, A Bastarrika, M Padilla, T Storm, and E Chuvieco, “Development of a sentinel-2 burned area algorithm: Generation of a small fire database for sub-saharan africa,” Remote sensing of environment, vol. 222, pp. 1–17, 2019.
[9] N Johnson, W Treible, and D Crispell, “Opensentinelmap: A large-scale land use dataset using openstreetmap and sentinel-2 imagery,” in Proceedings of the IEEE/CVF CVPR, 2022, pp. 1333–1341.
[10] R d’Andrimont, M Claverie, P Kempeneers, D Muraro, M Yordanov, D Peressutti, M Batič, and F Waldner, “Ai4boundaries: an open ai-ready dataset to map field boundaries with sentinel-2 and aerial photography,” Earth System Science Data, vol. 15, no. 1, pp. 317–329, 2023.