-
A scalable framework for annotating photovoltaic cell defects in electroluminescence images
Authors:
Urtzi Otamendi,
Inigo Martinez,
Igor G. Olaizola,
Marco Quartulli
Abstract:
The correct functioning of photovoltaic (PV) cells is critical to ensuring the optimal performance of a solar plant. Anomaly detection techniques for PV cells can result in significant cost savings in operation and maintenance (O&M). Recent research has focused on deep learning techniques for automatically detecting anomalies in Electroluminescence (EL) images. Automated anomaly annotations can improve current O&M methodologies and help develop decision-making systems to extend the life-cycle of the PV cells and predict failures. This paper addresses the lack of anomaly segmentation annotations in the literature by proposing a combination of state-of-the-art data-driven techniques to create a Golden Standard benchmark. The proposed method stands out for (1) its adaptability to new PV cell types, (2) its cost-efficient fine-tuning, and (3) its use of public datasets to generate advanced annotations. The methodology has been validated by annotating a widely used dataset, reducing the annotation cost by 60%.
Submitted 15 December, 2022;
originally announced December 2022.
-
Integrating pre-processing pipelines in ODC based framework
Authors:
U. Otamendi,
I. Azpiroz,
M. Quartulli,
I. Olaizola
Abstract:
Using on-demand processing pipelines to generate virtual geospatial products helps optimize resource management and reduces processing requirements and data storage space. Additionally, pre-processed products improve data quality for data-driven analytical algorithms, such as machine learning or deep learning models. This paper proposes a method to build virtual products by integrating open-source processing pipelines. To evaluate the functioning of this approach, we have integrated it into a geo-imagery management framework based on Open Data Cube (ODC). To validate the methodology, we have performed three experiments developing on-demand processing pipelines using multi-sensor remote sensing data, for instance, Sentinel-1 and Sentinel-2. These pipelines are integrated using open-source processing frameworks.
Submitted 4 October, 2022;
originally announced October 2022.
-
Geo-imagery management and statistical processing in a regional context using Open Data Cube
Authors:
U. Otamendi,
I. Azpiroz,
M. Quartulli,
I. Olaizola,
F. J. Perez,
D. Alda,
X. Garitano
Abstract:
We propose a methodology to manage and process remote sensing and geo-imagery data for non-expert users. The proposed system provides automated data ingestion and manipulation capability for analytical data-driven purposes. In this paper, we describe the technological basis of the proposed method in addition to describing the tool architecture, the inherent data flow, and its operation in a specific use case to provide statistical summaries of Sentinel-2 regions of interest corresponding to the cultivation of polygonal areas located in the Basque Country (ES).
Submitted 4 October, 2022;
originally announced October 2022.
-
Storehouse: a Reinforcement Learning Environment for Optimizing Warehouse Management
Authors:
Julen Cestero,
Marco Quartulli,
Alberto Maria Metelli,
Marcello Restelli
Abstract:
Warehouse Management Systems have been evolving and improving thanks to new Data Intelligence techniques. However, many current optimizations have been applied only to specific cases or require substantial manual interaction. This is where Reinforcement Learning techniques come into play, bringing automation and adaptability to current optimization policies. In this paper, we present Storehouse, a customizable environment that generalizes the definition of warehouse simulations for Reinforcement Learning. We also validate this environment against state-of-the-art reinforcement learning algorithms and compare these results to human and random policies.
Submitted 21 July, 2022; v1 submitted 8 July, 2022;
originally announced July 2022.
-
Segmentation of cell-level anomalies in electroluminescence images of photovoltaic modules
Authors:
Urtzi Otamendi,
Iñigo Martinez,
Marco Quartulli,
Igor G. Olaizola,
Elisabeth Viles,
Werther Cambarau
Abstract:
In the operation & maintenance (O&M) of photovoltaic (PV) plants, the early identification of failures has become crucial to maintain productivity and prolong components' life. Of all defects, cell-level anomalies can lead to serious failures and may affect surrounding PV modules in the long run. These fine defects are usually captured with high spatial resolution electroluminescence (EL) imaging. The difficulty of acquiring such images has limited the availability of data. For this work, multiple data resources and augmentation techniques have been used to overcome this limitation. Current state-of-the-art detection methods extract only low-level information from individual PV cell images, and their performance is conditioned by the available training data. In this article, we propose an end-to-end deep learning pipeline that detects, locates and segments cell-level anomalies from entire photovoltaic modules via EL images. The proposed modular pipeline combines three deep learning techniques: 1. object detection (modified Faster R-CNN), 2. image classification (EfficientNet) and 3. weakly supervised segmentation (autoencoder). The modular nature of the pipeline allows the deep learning models to be upgraded as the state of the art improves and the pipeline to be extended with new functionalities.
Submitted 14 January, 2022; v1 submitted 21 June, 2021;
originally announced June 2021.
-
Determining input variable ranges in Industry 4.0: A heuristic for estimating the domain of a real-valued function or trained regression model given an output range
Authors:
Noelia Oses,
Aritz Legarretaetxebarria,
Marco Quartulli,
Igor García,
Mikel Serrano
Abstract:
Industrial process control systems try to keep an output variable within a given tolerance around a target value. PID control systems have been widely used in industry to control input variables in order to reach this goal. However, this kind of Transfer Function based approach cannot be extended to complex processes where input data might be non-numeric, high dimensional, sparse, etc. In such cases, there is still a need for determining the subspace of input data that produces an output within a given range. This paper presents a non-stochastic heuristic to determine input values for a mathematical function or trained regression model given an output range. The proposed method creates a synthetic training data set of input combinations with a class label that indicates whether the output is within the given target range or not. Then, a decision tree classifier is used to determine the subspace of input data of interest. This method is more general than a traditional controller, as the target range for the output does not have to be centered around a reference value, and it can be applied given a regression model of the output variable, whose inputs may be categorical, high dimensional, sparse, etc. The proposed heuristic is validated with a proof of concept on a real use case where the quality of a lamination factory is established to identify the suitable subspace of production variable values.
Submitted 3 April, 2019;
originally announced April 2019.
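The two steps described in this abstract (label synthetic input combinations by whether the output falls in the target range, then let a decision tree carve out the in-range subspace) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the inputs are sampled on a regular grid, the tree is a small hand-rolled Gini-based splitter rather than a library classifier, and the names `make_dataset`, `best_split` and `positive_boxes` are hypothetical.

```python
from itertools import product

def make_dataset(f, bounds, steps, lo, hi):
    """Label a regular grid of inputs: 1 if lo <= f(x) <= hi, else 0 (grid is an assumption)."""
    axes = [[a + (b - a) * i / (steps - 1) for i in range(steps)] for a, b in bounds]
    X = [list(p) for p in product(*axes)]
    y = [1 if lo <= f(*x) <= hi else 0 for x in X]
    return X, y

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(X, y):
    """Return the (feature, threshold) that lowers weighted Gini the most, or None."""
    best, best_score = None, gini(y) * len(y)
    for j in range(len(X[0])):
        values = sorted({x[j] for x in X})
        for a, b in zip(values, values[1:]):
            t = (a + b) / 2  # midpoint between neighbouring grid values
            left = [yi for x, yi in zip(X, y) if x[j] <= t]
            right = [yi for x, yi in zip(X, y) if x[j] > t]
            score = gini(left) * len(left) + gini(right) * len(right)
            if score < best_score - 1e-12:
                best, best_score = (j, t), score
    return best

def positive_boxes(X, y, box, depth):
    """Split recursively; return the axis-aligned boxes of leaves that are mostly in range."""
    split = best_split(X, y) if depth > 0 and 0 < sum(y) < len(y) else None
    if split is None:
        return [box] if 2 * sum(y) > len(y) else []
    j, t = split
    out = []
    for keep_left in (True, False):
        idx = [i for i, x in enumerate(X) if (x[j] <= t) == keep_left]
        sub = list(box)
        sub[j] = (box[j][0], t) if keep_left else (t, box[j][1])
        out += positive_boxes([X[i] for i in idx], [y[i] for i in idx], sub, depth - 1)
    return out

# Toy model whose output depends only on the first input; target output range [3, 6].
bounds = [(0, 10), (0, 10)]
X, y = make_dataset(lambda x, z: x, bounds, steps=11, lo=3, hi=6)
boxes = positive_boxes(X, y, list(bounds), depth=3)
print(boxes)  # one (low, high) interval per input variable for each in-range region
```

Each returned box gives one interval per input variable, which is the kind of "suitable subspace" the abstract refers to. In practice the toy lambda would be replaced by a trained regression model's prediction function, and a library decision tree could replace the hand-rolled splitter.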
-
Distributed mining of large scale remote sensing image archives on public computing infrastructures
Authors:
Luigi Mascolo,
Marco Quartulli,
Pietro Guccione,
Giovanni Nico,
Igor G. Olaizola
Abstract:
Earth Observation (EO) mining aims at supporting efficient access and exploration of petabyte-scale space- and airborne remote sensing archives that are currently expanding at rates of terabytes per day. A significant challenge is performing the analysis required by envisaged applications, such as process mapping for environmental risk management, in reasonable time. In this work, we address the problem of content-based image retrieval via example-based queries from EO data archives. In particular, we focus on the analysis of polarimetric SAR data, for which target decomposition theorems have proved fundamental in discovering patterns in data and characterizing the ground scattering properties. To this end, we propose an interactive region-oriented content-based image mining system in which 1) unsupervised ingestion processes are distributed onto virtual machines in elastic, on-demand computing infrastructures; 2) archive-scale content hierarchical indexing is implemented in terms of a "big data" analytics cluster-computing framework; and 3) query processing amounts to traversing the generated binary tree index, computing distances that correspond to descriptor-based similarity measures between image groups and a query image tile. We describe in depth both the strategies and the actual implementations for the ingestion and indexing components, and verify the approach by experiments carried out on the NASA/JPL UAVSAR full polarimetric data archive. We report the results of the tests performed on computer clusters by using a public Infrastructure-as-a-Service and evaluating the impact of cluster configuration on system performance. Results are promising for data mapping and information retrieval applications.
Submitted 17 January, 2015;
originally announced January 2015.
-
Trace transform based method for color image domain identification
Authors:
Igor G. Olaizola,
Marco Quartulli,
Julian Florez,
Basilio Sierra
Abstract:
Context categorization is a fundamental pre-requisite for multi-domain multimedia content analysis applications in order to manage contextual information in an efficient manner. In this paper, we introduce a new color image context categorization method (DITEC) based on the trace transform. The problem of dimensionality reduction of the obtained trace transform signal is addressed through statistical descriptors that keep the underlying information. These extracted features offer a highly discriminant behavior for content categorization. The theoretical properties of the method are analyzed and validated experimentally through two different datasets.
Submitted 25 March, 2019; v1 submitted 19 August, 2012;
originally announced August 2012.
-
A review of EO image information mining
Authors:
Marco Quartulli,
Igor G. Olaizola
Abstract:
We analyze the state of the art of content-based retrieval in Earth observation image archives, focusing on complete systems showing promise for operational implementation. The different paradigms at the basis of the main system families are introduced. The approaches taken are analyzed, focusing in particular on the phases after primitive feature extraction. The solutions envisaged for the issues related to feature simplification and synthesis, indexing, and semantic labeling are reviewed. The methodologies for query specification and execution are analyzed.
Submitted 19 June, 2012; v1 submitted 4 March, 2012;
originally announced March 2012.