Search | arXiv e-print repository

Semi Supervised Heterogeneous Domain Adaptation via Disentanglement and Pseudo-Labelling

Authors: Cassio F. Dantas, Raffaele Gaetano, Dino Ienco

Abstract: Semi-supervised domain adaptation methods leverage information from a source labelled domain with the goal of generalizing over a scarcely labelled target domain. While this setting already poses challenges due to potential distribution shifts between domains, an even more complex scenario arises when source and target data differs in modality representation (e.g. they are acquired by sensors with… ▽ More Semi-supervised domain adaptation methods leverage information from a source labelled domain with the goal of generalizing over a scarcely labelled target domain. While this setting already poses challenges due to potential distribution shifts between domains, an even more complex scenario arises when source and target data differs in modality representation (e.g. they are acquired by sensors with different characteristics). For instance, in remote sensing, images may be collected via various acquisition modes (e.g. optical or radar), different spectral characteristics (e.g. RGB or multi-spectral) and spatial resolutions. Such a setting is denoted as Semi-Supervised Heterogeneous Domain Adaptation (SSHDA) and it exhibits an even more severe distribution shift due to modality heterogeneity across domains.To cope with the challenging SSHDA setting, here we introduce SHeDD (Semi-supervised Heterogeneous Domain Adaptation via Disentanglement) an end-to-end neural framework tailored to learning a target domain classifier by leveraging both labelled and unlabelled data from heterogeneous data sources. SHeDD is designed to effectively disentangle domain-invariant representations, relevant for the downstream task, from domain-specific information, that can hinder the cross-modality transfer. Additionally, SHeDD adopts an augmentation-based consistency regularization mechanism that takes advantages of reliable pseudo-labels on the unlabelled target samples to further boost its generalization ability on the target domain. Empirical evaluations on two remote sensing benchmarks, encompassing heterogeneous data in terms of acquisition modes and spectral/spatial resolutions, demonstrate the quality of SHeDD compared to both baseline and state-of-the-art competing approaches. Our code is publicly available here: https://github.com/tanodino/SSHDA/ △ Less

Submitted 20 June, 2024; originally announced June 2024.

Journal ref: ECML PKDD: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Sep 2024, Vilnius, Lithuania

arXiv:2406.13424 [pdf, other]

Towards a multimodal framework for remote sensing image change retrieval and captioning

Authors: Roger Ferrod, Luigi Di Caro, Dino Ienco

Abstract: Recently, there has been increasing interest in multimodal applications that integrate text with other modalities, such as images, audio and video, to facilitate natural language interactions with multimodal AI systems. While applications involving standard modalities have been extensively explored, there is still a lack of investigation into specific data modalities such as remote sensing (RS) da… ▽ More Recently, there has been increasing interest in multimodal applications that integrate text with other modalities, such as images, audio and video, to facilitate natural language interactions with multimodal AI systems. While applications involving standard modalities have been extensively explored, there is still a lack of investigation into specific data modalities such as remote sensing (RS) data. Despite the numerous potential applications of RS data, including environmental protection, disaster monitoring and land planning, available solutions are predominantly focused on specific tasks like classification, captioning and retrieval. These solutions often overlook the unique characteristics of RS data, such as its capability to systematically provide information on the same geographical areas over time. This ability enables continuous monitoring of changes in the underlying landscape. To address this gap, we propose a novel foundation model for bi-temporal RS image pairs, in the context of change detection analysis, leveraging Contrastive Learning and the LEVIR-CC dataset for both captioning and text-image retrieval. By jointly training a contrastive encoder and captioning decoder, our model add text-image retrieval capabilities, in the context of bi-temporal change detection, while maintaining captioning performances that are comparable to the state of the art. We release the source code and pretrained weights at: https://github.com/rogerferrod/RSICRC. △ Less

Submitted 19 June, 2024; originally announced June 2024.

arXiv:2404.11114 [pdf, other]

Reuse out-of-year data to enhance land cover map** via feature disentanglement and contrastive learning

Authors: Cassio F. Dantas, Raffaele Gaetano, Claudia Paris, Dino Ienco

Abstract: Timely up-to-date land use/land cover (LULC) maps play a pivotal role in supporting agricultural territory management, environmental monitoring and facilitating well-informed and sustainable decision-making. Typically, when creating a land cover (LC) map, precise ground truth data is collected through time-consuming and expensive field campaigns. This data is then utilized in conjunction with sate… ▽ More Timely up-to-date land use/land cover (LULC) maps play a pivotal role in supporting agricultural territory management, environmental monitoring and facilitating well-informed and sustainable decision-making. Typically, when creating a land cover (LC) map, precise ground truth data is collected through time-consuming and expensive field campaigns. This data is then utilized in conjunction with satellite image time series (SITS) through advanced machine learning algorithms to get the final map. Unfortunately, each time this process is repeated (e.g., annually over a region to estimate agricultural production or potential biodiversity loss), new ground truth data must be collected, leading to the complete disregard of previously gathered reference data despite the substantial financial and time investment they have required. How to make value of historical data, from the same or similar study sites, to enhance the current LULC map** process constitutes a significant challenge that could enable the financial and human-resource efforts invested in previous data campaigns to be valued again. Aiming to tackle this important challenge, we here propose a deep learning framework based on recent advances in domain adaptation and generalization to combine remote sensing and reference data coming from two different domains (e.g. historical data and fresh ones) to ameliorate the current LC map** process. Our approach, namely REFeD (data Reuse with Effective Feature Disentanglement for land cover map**), leverages a disentanglement strategy, based on contrastive learning, where invariant and specific per-domain features are derived to recover the intrinsic information related to the downstream LC map** task and alleviate possible distribution shifts between domains. Additionally, REFeD is equipped with an effective supervision scheme where feature disentanglement is further enforced via multiple levels of supervision at different granularities. The experimental assessment over two study areas covering extremely diverse and contrasted landscapes, namely Koumbia (located in the West-Africa region, in Burkina Faso) and Centre Val de Loire (located in centre Europe, France), underlines the quality of our framework and the obtained findings demonstrate that out-of-year information coming from the same (or similar) study site, at different periods of time, can constitute a valuable additional source of information to enhance the LC map** process. △ Less

Submitted 17 April, 2024; originally announced April 2024.

arXiv:2310.02116 [pdf, other]

Coarse-to-Fine Concept Bottleneck Models

Authors: Konstantinos P. Panousis, Dino Ienco, Diego Marcos

Abstract: Deep learning algorithms have recently gained significant attention due to their impressive performance. However, their high complexity and un-interpretable mode of operation hinders their confident deployment in real-world safety-critical tasks. This work targets ante hoc interpretability, and specifically Concept Bottleneck Models (CBMs). Our goal is to design a framework that admits a highly in… ▽ More Deep learning algorithms have recently gained significant attention due to their impressive performance. However, their high complexity and un-interpretable mode of operation hinders their confident deployment in real-world safety-critical tasks. This work targets ante hoc interpretability, and specifically Concept Bottleneck Models (CBMs). Our goal is to design a framework that admits a highly interpretable decision making process with respect to human understandable concepts, on two levels of granularity. To this end, we propose a novel two-level concept discovery formulation leveraging: (i) recent advances in vision-language models, and (ii) an innovative formulation for coarse-to-fine concept selection via data-driven and sparsity-inducing Bayesian arguments. Within this framework, concept information does not solely rely on the similarity between the whole image and general unstructured concepts; instead, we introduce the notion of concept hierarchy to uncover and exploit more granular concept information residing in patch-specific regions of the image scene. As we experimentally show, the proposed construction not only outperforms recent CBM approaches, but also yields a principled framework towards interpetability. △ Less

Submitted 27 June, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

arXiv:2309.03173 [pdf, other]

PDiscoNet: Semantically consistent part discovery for fine-grained recognition

Authors: Robert van der Klis, Stephan Alaniz, Massimiliano Mancini, Cassio F. Dantas, Dino Ienco, Zeynep Akata, Diego Marcos

Abstract: Fine-grained classification often requires recognizing specific object parts, such as beak shape and wing patterns for birds. Encouraging a fine-grained classification model to first detect such parts and then using them to infer the class could help us gauge whether the model is indeed looking at the right details better than with interpretability methods that provide a single attribution map. We… ▽ More Fine-grained classification often requires recognizing specific object parts, such as beak shape and wing patterns for birds. Encouraging a fine-grained classification model to first detect such parts and then using them to infer the class could help us gauge whether the model is indeed looking at the right details better than with interpretability methods that provide a single attribution map. We propose PDiscoNet to discover object parts by using only image-level class labels along with priors encouraging the parts to be: discriminative, compact, distinct from each other, equivariant to rigid transforms, and active in at least some of the images. In addition to using the appropriate losses to encode these priors, we propose to use part-dropout, where full part feature vectors are dropped at once to prevent a single part from dominating in the classification, and part feature vector modulation, which makes the information coming from each part distinct from the perspective of the classifier. Our results on CUB, CelebA, and PartImageNet show that the proposed method provides substantially better part discovery performance than previous methods while not requiring any additional hyper-parameter tuning and without penalizing the classification performance. The code is available at https://github.com/robertdvdk/part_detection. △ Less

Submitted 6 September, 2023; originally announced September 2023.

Comments: 9 pages, 8 figures, ICCV

arXiv:2308.12127 [pdf, other]

Masking Strategies for Background Bias Removal in Computer Vision Models

Authors: Ananthu Aniraj, Cassio F. Dantas, Dino Ienco, Diego Marcos

Abstract: Models for fine-grained image classification tasks, where the difference between some classes can be extremely subtle and the number of samples per class tends to be low, are particularly prone to picking up background-related biases and demand robust methods to handle potential examples with out-of-distribution (OOD) backgrounds. To gain deeper insights into this critical problem, our research in… ▽ More Models for fine-grained image classification tasks, where the difference between some classes can be extremely subtle and the number of samples per class tends to be low, are particularly prone to picking up background-related biases and demand robust methods to handle potential examples with out-of-distribution (OOD) backgrounds. To gain deeper insights into this critical problem, our research investigates the impact of background-induced bias on fine-grained image classification, evaluating standard backbone models such as Convolutional Neural Network (CNN) and Vision Transformers (ViT). We explore two masking strategies to mitigate background-induced bias: Early masking, which removes background information at the (input) image level, and late masking, which selectively masks high-level spatial features corresponding to the background. Extensive experiments assess the behavior of CNN and ViT models under different masking strategies, with a focus on their generalization to OOD backgrounds. The obtained findings demonstrate that both proposed strategies enhance OOD performance compared to the baseline models, with early masking consistently exhibiting the best OOD performance. Notably, a ViT variant employing GAP-Pooled Patch token-based classification combined with early masking achieves the highest OOD robustness. △ Less

Submitted 23 August, 2023; originally announced August 2023.

Comments: Accepted at the 2023 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW) on Out Of Distribution Generalization in Computer Vision (OOD-CV)

arXiv:2308.10782 [pdf, other]

Sparse Linear Concept Discovery Models

Authors: Konstantinos P. Panousis, Dino Ienco, Diego Marcos

Abstract: The recent mass adoption of DNNs, even in safety-critical scenarios, has shifted the focus of the research community towards the creation of inherently intrepretable models. Concept Bottleneck Models (CBMs) constitute a popular approach where hidden layers are tied to human understandable concepts allowing for investigation and correction of the network's decisions. However, CBMs usually suffer fr… ▽ More The recent mass adoption of DNNs, even in safety-critical scenarios, has shifted the focus of the research community towards the creation of inherently intrepretable models. Concept Bottleneck Models (CBMs) constitute a popular approach where hidden layers are tied to human understandable concepts allowing for investigation and correction of the network's decisions. However, CBMs usually suffer from: (i) performance degradation and (ii) lower interpretability than intended due to the sheer amount of concepts contributing to each decision. In this work, we propose a simple yet highly intuitive interpretable framework based on Contrastive Language Image models and a single sparse linear layer. In stark contrast to related approaches, the sparsity in our framework is achieved via principled Bayesian arguments by inferring concept presence via a data-driven Bernoulli distribution. As we experimentally show, our framework not only outperforms recent CBM approaches accuracy-wise, but it also yields high per example concept sparsity, facilitating the individual investigation of the emerging concepts. △ Less

Submitted 21 August, 2023; originally announced August 2023.

Comments: Accepted @ ICCVW CLVL 2023

arXiv:2301.01520 [pdf, other]

Towards Explainable Land Cover Map**: a Counterfactual-based Strategy

Authors: Cassio F. Dantas, Diego Marcos, Dino Ienco

Abstract: Counterfactual explanations are an emerging tool to enhance interpretability of deep learning models. Given a sample, these methods seek to find and display to the user similar samples across the decision boundary. In this paper, we propose a generative adversarial counterfactual approach for satellite image time series in a multi-class setting for the land cover classification task. One of the di… ▽ More Counterfactual explanations are an emerging tool to enhance interpretability of deep learning models. Given a sample, these methods seek to find and display to the user similar samples across the decision boundary. In this paper, we propose a generative adversarial counterfactual approach for satellite image time series in a multi-class setting for the land cover classification task. One of the distinctive features of the proposed approach is the lack of prior assumption on the targeted class for a given counterfactual explanation. This inherent flexibility allows for the discovery of interesting information on the relationship between land cover classes. The other feature consists of encouraging the counterfactual to differ from the original sample only in a small and compact temporal segment. These time-contiguous perturbations allow for a much sparser and, thus, interpretable solution. Furthermore, plausibility/realism of the generated counterfactual explanations is enforced via the proposed adversarial learning strategy. △ Less

Submitted 20 August, 2023; v1 submitted 4 January, 2023; originally announced January 2023.

arXiv:2103.16836 [pdf, other]

Channel-Based Attention for LCC Using Sentinel-2 Time Series

Authors: Hermann Courteille, A. Benoît, N Méger, A Atto, D. Ienco

Abstract: Deep Neural Networks (DNNs) are getting increasing attention to deal with Land Cover Classification (LCC) relying on Satellite Image Time Series (SITS). Though high performances can be achieved, the rationale of a prediction yielded by a DNN often remains unclear. An architecture expressing predictions with respect to input channels is thus proposed in this paper. It relies on convolutional layers… ▽ More Deep Neural Networks (DNNs) are getting increasing attention to deal with Land Cover Classification (LCC) relying on Satellite Image Time Series (SITS). Though high performances can be achieved, the rationale of a prediction yielded by a DNN often remains unclear. An architecture expressing predictions with respect to input channels is thus proposed in this paper. It relies on convolutional layers and an attention mechanism weighting the importance of each channel in the final classification decision. The correlation between channels is taken into account to set up shared kernels and lower model complexity. Experiments based on a Sentinel-2 SITS show promising results. △ Less

Submitted 31 March, 2021; originally announced March 2021.

Journal ref: International Geoscience and Remote Sensing Symposium (IGARSS), Jul 2021, Brussels, Belgium

arXiv:2004.14672 [pdf, other]

Attentive Weakly Supervised land cover map** for object-based satellite image time series data with spatial interpretation

Authors: Dino Ienco, Yawogan Jean Eudes Gbodjo, Roberto Interdonato, Raffaele Gaetano

Abstract: Nowadays, modern Earth Observation systems continuously collect massive amounts of satellite information. The unprecedented possibility to acquire high resolution Satellite Image Time Series (SITS) data (series of images with high revisit time period on the same geographical area) is opening new opportunities to monitor the different aspects of the Earth Surface but, at the same time, it is raisin… ▽ More Nowadays, modern Earth Observation systems continuously collect massive amounts of satellite information. The unprecedented possibility to acquire high resolution Satellite Image Time Series (SITS) data (series of images with high revisit time period on the same geographical area) is opening new opportunities to monitor the different aspects of the Earth Surface but, at the same time, it is raising up new challenges in term of suitable methods to analyze and exploit such huge amount of rich and complex image data. One of the main task associated to SITS data analysis is related to land cover map** where satellite data are exploited via learning methods to recover the Earth Surface status aka the corresponding land cover classes. Due to operational constraints, the collected label information, on which machine learning strategies are trained, is often limited in volume and obtained at coarse granularity carrying out inexact and weak knowledge that can affect the whole process. To cope with such issues, in the context of object-based SITS land cover map**, we propose a new deep learning framework, named TASSEL (aTtentive weAkly Supervised Satellite image time sEries cLassifier), that is able to intelligently exploit the weak supervision provided by the coarse granularity labels. Furthermore, our framework also produces an additional side-information that supports the model interpretability with the aim to make the black box gray. Such side-information allows to associate spatial interpretation to the model decision via visual inspection. △ Less

Submitted 30 April, 2020; originally announced April 2020.

Comments: Under submission to Elsevier journal

arXiv:2004.01963 [pdf, other]

Fine grained classification for multi-source land cover map**

Authors: Yawogan Jean Eudes Gbodjo, Dino Ienco, Louise Leroux, Roberto Interdonato, Raffaelle Gaetano

Abstract: Nowadays, there is a general agreement on the need to better characterize agricultural monitoring systems in response to the global changes. Timely and accurate land use/land cover map** can support this vision by providing useful information at fine scale. Here, a deep learning approach is proposed to deal with multi-source land cover map** at object level. The approach is based on an extensi… ▽ More Nowadays, there is a general agreement on the need to better characterize agricultural monitoring systems in response to the global changes. Timely and accurate land use/land cover map** can support this vision by providing useful information at fine scale. Here, a deep learning approach is proposed to deal with multi-source land cover map** at object level. The approach is based on an extension of Recurrent Neural Network enriched via an attention mechanism dedicated to multi-temporal data context. Moreover, a new hierarchical pretraining strategy designed to exploit specific domain knowledge available under hierarchical relationships within land cover classes is introduced. Experiments carried out on the Reunion island - a french overseas department - demonstrate the significance of the proposal compared to remote sensing standard approaches for land cover map**. △ Less

Submitted 4 April, 2020; originally announced April 2020.

Comments: Paper presented at the ICLR 2020 Workshop on Computer Vision for Agriculture (CV4A)

arXiv:1911.08815 [pdf, other]

Object-based multi-temporal and multi-source land cover map** leveraging hierarchical class relationships

Authors: Yawogan Jean Eudes Gbodjo, Dino Ienco, Louise Leroux, Roberto Interdonato, Raffaele Gaetano, Babacar Ndao, Stephane Dupuy

Abstract: European satellite missions Sentinel-1 (S1) and Sentinel-2 (S2) provide at highspatial resolution and high revisit time, respectively, radar and optical imagesthat support a wide range of Earth surface monitoring tasks such as LandUse/Land Cover map**. A long-standing challenge in the remote sensingcommunity is about how to efficiently exploit multiple sources of information and leverage their c… ▽ More European satellite missions Sentinel-1 (S1) and Sentinel-2 (S2) provide at highspatial resolution and high revisit time, respectively, radar and optical imagesthat support a wide range of Earth surface monitoring tasks such as LandUse/Land Cover map**. A long-standing challenge in the remote sensingcommunity is about how to efficiently exploit multiple sources of information and leverage their complementary. In this particular case, get the most out ofradar and optical satellite image time series (SITS). Here, we propose to dealwith land cover map** through a deep learning framework especially tailoredto leverage the multi-source complementarity provided by radar and opticalSITS. The proposed architecture is based on an extension of Recurrent NeuralNetwork (RNN) enriched via a customized attention mechanism capable to fitthe specificity of SITS data. In addition, we propose a new pretraining strategythat exploits domain expert knowledge to guide the model parameter initial-ization. Thorough experimental evaluations involving several machine learningcompetitors, on two contrasted study sites, have demonstrated the suitabilityof our new attention mechanism combined with the extend RNN model as wellas the benefit/limit to inject domain expert knowledge in the neural networktraining process. △ Less

Submitted 20 November, 2019; originally announced November 2019.

arXiv:1911.01071 [pdf, other]

Supervised level-wise pretraining for recurrent neural network initialization in multi-class classification

Authors: Dino Ienco, Roberto Interdonato, Raffaele Gaetano

Abstract: Recurrent Neural Networks (RNNs) can be seriously impacted by the initial parameters assignment, which may result in poor generalization performances on new unseen data. With the objective to tackle this crucial issue, in the context of RNN based classification, we propose a new supervised layer-wise pretraining strategy to initialize network parameters. The proposed approach leverages a data-awar… ▽ More Recurrent Neural Networks (RNNs) can be seriously impacted by the initial parameters assignment, which may result in poor generalization performances on new unseen data. With the objective to tackle this crucial issue, in the context of RNN based classification, we propose a new supervised layer-wise pretraining strategy to initialize network parameters. The proposed approach leverages a data-aware strategy that sets up a taxonomy of classification problems automatically derived by the model behavior. To the best of our knowledge, despite the great interest in RNN-based classification, this is the first data-aware strategy dealing with the initialization of such models. The proposed strategy has been tested on four benchmarks coming from two different domains, i.e., Speech Recognition and Remote Sensing. Results underline the significance of our approach and point out that data-aware strategies positively support the initialization of Recurrent Neural Network based classification models. △ Less

Submitted 4 November, 2019; originally announced November 2019.

arXiv:1812.05530 [pdf, other]

Combining Sentinel-1 and Sentinel-2 Time Series via RNN for object-based land cover classification

Authors: Dino Ienco, Raffaele Gaetano, Roberto Interdonato, Kenji Ose, Dinh Ho Tong Minh

Abstract: Radar and Optical Satellite Image Time Series (SITS) are sources of information that are commonly employed to monitor earth surfaces for tasks related to ecology, agriculture, mobility, land management planning and land cover monitoring. Many studies have been conducted using one of the two sources, but how to smartly combine the complementary information provided by radar and optical SITS is stil… ▽ More Radar and Optical Satellite Image Time Series (SITS) are sources of information that are commonly employed to monitor earth surfaces for tasks related to ecology, agriculture, mobility, land management planning and land cover monitoring. Many studies have been conducted using one of the two sources, but how to smartly combine the complementary information provided by radar and optical SITS is still an open challenge. In this context, we propose a new neural architecture for the combination of Sentinel-1 (S1) and Sentinel-2 (S2) imagery at object level, applied to a real-world land cover classification task. Experiments carried out on the Reunion Island, a overseas department of France in the Indian Ocean, demonstrate the significance of our proposal. △ Less

Submitted 13 December, 2018; originally announced December 2018.

arXiv:1809.07589 [pdf, other]

DuPLO: A DUal view Point deep Learning architecture for time series classificatiOn

Authors: Roberto Interdonato, Dino Ienco, Raffaele Gaetano, Kenji Ose

Abstract: Nowadays, modern Earth Observation systems continuously generate huge amounts of data. A notable example is represented by the Sentinel-2 mission, which provides images at high spatial resolution (up to 10m) with high temporal revisit period (every 5 days), which can be organized in Satellite Image Time Series (SITS). While the use of SITS has been proved to be beneficial in the context of Land Us… ▽ More Nowadays, modern Earth Observation systems continuously generate huge amounts of data. A notable example is represented by the Sentinel-2 mission, which provides images at high spatial resolution (up to 10m) with high temporal revisit period (every 5 days), which can be organized in Satellite Image Time Series (SITS). While the use of SITS has been proved to be beneficial in the context of Land Use/Land Cover (LULC) map generation, unfortunately, machine learning approaches commonly leveraged in remote sensing field fail to take advantage of spatio-temporal dependencies present in such data. Recently, new generation deep learning methods allowed to significantly advance research in this field. These approaches have generally focused on a single type of neural network, i.e., Convolutional Neural Networks (CNNs) or Recurrent Neural Networks (RNNs), which model different but complementary information: spatial autocorrelation (CNNs) and temporal dependencies (RNNs). In this work, we propose the first deep learning architecture for the analysis of SITS data, namely \method{} (DUal view Point deep Learning architecture for time series classificatiOn), that combines Convolutional and Recurrent neural networks to exploit their complementarity. Our hypothesis is that, since CNNs and RNNs capture different aspects of the data, a combination of both models would produce a more diverse and complete representation of the information for the underlying land cover classification task. Experiments carried out on two study sites characterized by different land cover characteristics (i.e., the \textit{Gard} site in France and the \textit{Reunion Island} in the Indian Ocean), demonstrate the significance of our proposal. △ Less

Submitted 20 September, 2018; originally announced September 2018.

arXiv:1806.11452 [pdf, other]

MRFusion: A Deep Learning architecture to fuse PAN and MS imagery for land cover map**

Authors: Raffaele Gaetano, Dino Ienco, Kenji Ose, Remi Cresson

Abstract: Nowadays, Earth Observation systems provide a multitude of heterogeneous remote sensing data. How to manage such richness leveraging its complementarity is a crucial chal- lenge in modern remote sensing analysis. Data Fusion techniques deal with this point proposing method to combine and exploit complementarity among the different data sensors. Considering optical Very High Spatial Resolution (VHS… ▽ More Nowadays, Earth Observation systems provide a multitude of heterogeneous remote sensing data. How to manage such richness leveraging its complementarity is a crucial chal- lenge in modern remote sensing analysis. Data Fusion techniques deal with this point proposing method to combine and exploit complementarity among the different data sensors. Considering optical Very High Spatial Resolution (VHSR) images, satellites obtain both Multi Spectral (MS) and panchro- matic (PAN) images at different spatial resolution. VHSR images are extensively exploited to produce land cover maps to deal with agricultural, ecological, and socioeconomic issues as well as assessing ecosystem status, monitoring biodiversity and provid- ing inputs to conceive food risk monitoring systems. Common techniques to produce land cover maps from such VHSR images typically opt for a prior pansharpening of the multi-resolution source for a full resolution processing. Here, we propose a new deep learning architecture to jointly use PAN and MS imagery for a direct classification without any prior image fusion or resampling process. By managing the spectral information at its native spatial resolution, our method, named MRFusion, aims at avoiding the possible infor- mation loss induced by pansharpening or any other hand-crafted preprocessing. Moreover, the proposed architecture is suitably designed to learn non-linear transformations of the sources with the explicit aim of taking as much as possible advantage of the complementarity of PAN and MS imagery. Experiments are carried out on two-real world scenarios depicting large areas with different land cover characteristics. The characteristics of the proposed scenarios underline the applicability and the generality of our method in operational settings. △ Less

Submitted 29 June, 2018; originally announced June 2018.

arXiv:1803.01945 [pdf, other]

M3Fusion: A Deep Learning Architecture for Multi-{Scale/Modal/Temporal} satellite data fusion

Authors: P. Benedetti, D. Ienco, R. Gaetano, K. Osé, R. Pensa, S. Dupuy

Abstract: Modern Earth Observation systems provide sensing data at different temporal and spatial resolutions. Among optical sensors, today the Sentinel-2 program supplies high-resolution temporal (every 5 days) and high spatial resolution (10m) images that can be useful to monitor land cover dynamics. On the other hand, Very High Spatial Resolution images (VHSR) are still an essential tool to figure out la… ▽ More Modern Earth Observation systems provide sensing data at different temporal and spatial resolutions. Among optical sensors, today the Sentinel-2 program supplies high-resolution temporal (every 5 days) and high spatial resolution (10m) images that can be useful to monitor land cover dynamics. On the other hand, Very High Spatial Resolution images (VHSR) are still an essential tool to figure out land cover map** characterized by fine spatial patterns. Understand how to efficiently leverage these complementary sources of information together to deal with land cover map** is still challenging. With the aim to tackle land cover map** through the fusion of multi-temporal High Spatial Resolution and Very High Spatial Resolution satellite images, we propose an End-to-End Deep Learning framework, named M3Fusion, able to leverage simultaneously the temporal knowledge contained in time series data as well as the fine spatial information available in VHSR information. Experiments carried out on the Reunion Island study area asses the quality of our proposal considering both quantitative and qualitative aspects. △ Less

Submitted 5 March, 2018; originally announced March 2018.

arXiv:1708.03694 [pdf, other]

Deep Recurrent Neural Networks for map** winter vegetation quality coverage via multi-temporal SAR Sentinel-1

Authors: Dinh Ho Tong Minh, Dino Ienco, Raffaele Gaetano, Nathalie Lalande, Emile Ndikumana, Faycal Osman, Pierre Maurel

Abstract: Map** winter vegetation quality coverage is a challenge problem of remote sensing. This is due to the cloud coverage in winter period, leading to use radar rather than optical images. The objective of this paper is to provide a better understanding of the capabilities of radar Sentinel-1 and deep learning concerning about map** winter vegetation quality coverage. The analysis presented in this… ▽ More Map** winter vegetation quality coverage is a challenge problem of remote sensing. This is due to the cloud coverage in winter period, leading to use radar rather than optical images. The objective of this paper is to provide a better understanding of the capabilities of radar Sentinel-1 and deep learning concerning about map** winter vegetation quality coverage. The analysis presented in this paper is carried out on multi-temporal Sentinel-1 data over the site of La Rochelle, France, during the campaign in December 2016. This dataset were processed in order to produce an intensity radar data stack from October 2016 to February 2017. Two deep Recurrent Neural Network (RNN) based classifier methods were employed. We found that the results of RNNs clearly outperformed the classical machine learning approaches (Support Vector Machine and Random Forest). This study confirms that the time series radar Sentinel-1 and RNNs could be exploited for winter vegetation quality cover map**. △ Less

Submitted 11 August, 2017; originally announced August 2017.

Comments: In submission to IEEE Geoscience and Remote Sensing Letters

arXiv:1704.04055 [pdf, other]

doi 10.1109/LGRS.2017.2728698

Land Cover Classification via Multi-temporal Spatial Data by Recurrent Neural Networks

Authors: Dino Ienco, Raffaele Gaetano, Claire Dupaquier, Pierre Maurel

Abstract: Nowadays, modern earth observation programs produce huge volumes of satellite images time series (SITS) that can be useful to monitor geographical areas through time. How to efficiently analyze such kind of information is still an open question in the remote sensing field. Recently, deep learning methods proved suitable to deal with remote sensing data mainly for scene classification (i.e. Convolu… ▽ More Nowadays, modern earth observation programs produce huge volumes of satellite images time series (SITS) that can be useful to monitor geographical areas through time. How to efficiently analyze such kind of information is still an open question in the remote sensing field. Recently, deep learning methods proved suitable to deal with remote sensing data mainly for scene classification (i.e. Convolutional Neural Networks - CNNs - on single images) while only very few studies exist involving temporal deep learning approaches (i.e Recurrent Neural Networks - RNNs) to deal with remote sensing time series. In this letter we evaluate the ability of Recurrent Neural Networks, in particular the Long-Short Term Memory (LSTM) model, to perform land cover classification considering multi-temporal spatial data derived from a time series of satellite images. We carried out experiments on two different datasets considering both pixel-based and object-based classification. The obtained results show that Recurrent Neural Networks are competitive compared to state-of-the-art classifiers, and may outperform classical approaches in presence of low represented and/or highly mixed classes. We also show that using the alternative feature representation generated by LSTM can improve the performances of standard classifiers. △ Less

Submitted 13 April, 2017; originally announced April 2017.

arXiv:1704.03441 [pdf, other]

Node-centric community detection in multilayer networks with layer-coverage diversification bias

Authors: Roberto Interdonato, Andrea Tagarelli, Dino Ienco, Arnaud Sallaberry, Pascal Poncelet

Abstract: The problem of node-centric, or local, community detection in information networks refers to the identification of a community for a given input node, having limited information about the network topology. Existing methods for solving this problem, however, are not conceived to work on complex networks. In this paper, we propose a novel framework for local community detection based on the multilay… ▽ More The problem of node-centric, or local, community detection in information networks refers to the identification of a community for a given input node, having limited information about the network topology. Existing methods for solving this problem, however, are not conceived to work on complex networks. In this paper, we propose a novel framework for local community detection based on the multilayer network model. Our approach relies on the maximization of the ratio between the community internal connection density and the external connection density, according to multilayer similarity-based community relations. We also define a biasing scheme that allows the discovery of local communities characterized by different degrees of layer-coverage diversification. Experimental evaluation conducted on real-world multilayer networks has shown the significance of our approach. △ Less

Submitted 11 April, 2017; originally announced April 2017.

Comments: Accepted at 8th International Conference on Complex Networks (CompleNet'17)

arXiv:1511.04855 [pdf, other]

Deep learning is a good steganalysis tool when embedding key is reused for different images, even if there is a cover source-mismatch

Authors: Lionel Pibre, Pasquet Jérôme, Dino Ienco, Marc Chaumont

Abstract: Since the BOSS competition, in 2010, most steganalysis approaches use a learning methodology involving two steps: feature extraction, such as the Rich Models (RM), for the image representation, and use of the Ensemble Classifier (EC) for the learning step. In 2015, Qian et al. have shown that the use of a deep learning approach that jointly learns and computes the features, is very promising for t… ▽ More Since the BOSS competition, in 2010, most steganalysis approaches use a learning methodology involving two steps: feature extraction, such as the Rich Models (RM), for the image representation, and use of the Ensemble Classifier (EC) for the learning step. In 2015, Qian et al. have shown that the use of a deep learning approach that jointly learns and computes the features, is very promising for the steganalysis. In this paper, we follow-up the study of Qian et al., and show that, due to intrinsic joint minimization, the results obtained from a Convolutional Neural Network (CNN) or a Fully Connected Neural Network (FNN), if well parameterized, surpass the conventional use of a RM with an EC. First, numerous experiments were conducted in order to find the best " shape " of the CNN. Second, experiments were carried out in the clairvoyant scenario in order to compare the CNN and FNN to an RM with an EC. The results show more than 16% reduction in the classification error with our CNN or FNN. Third, experiments were also performed in a cover-source mismatch setting. The results show that the CNN and FNN are naturally robust to the mismatch problem. In Addition to the experiments, we provide discussions on the internal mechanisms of a CNN, and weave links with some previously stated ideas, in order to understand the impressive results we obtained. △ Less

Submitted 12 January, 2018; v1 submitted 16 November, 2015; originally announced November 2015.

Comments: IS&T. Media Watermarking, Security, and Forensics, Part of IS&T International Symposium on Electronic Imaging, EI'2016, Feb 2015, San Fransisco, United States

Showing 1–21 of 21 results for author: Ienco, D