-
`Just One More Sensor is Enough' -- Iterative Water Leak Localization with Physical Simulation and a Small Number of Pressure Sensors
Authors:
Michał Cholewa,
Michał Romaszewski,
Przemysław Głomb,
Katarzyna Kołodziej,
Michał Gorawski,
Jakub Koral,
Wojciech Koral,
Andrzej Madej,
Kryspin Musioł
Abstract:
In this article, we propose an approach to leak localisation in a complex water delivery grid with the use of data from physical simulation (e.g. EPANET software). This task is usually achieved by a network of multiple water pressure sensors and analysis of the so-called sensitivity matrix of pressure differences between the network's simulated data and actual data of the network affected by the leak. However, most algorithms using this approach require a significant number of pressure sensors -- a condition that is not easy to fulfil in the case of many less well-equipped networks. Therefore, we answer the question of whether leak localisation is possible by utilising very few sensors but having the ability to relocate one of them. Our algorithm is based on physical simulations (EPANET software) and an iterative scheme for mobile sensor relocation. The experiments show that the proposed system can compensate for the low number of sensors by adjusting their positioning, giving a very good approximation of the leak's position both in simulated cases and in a real-life example taken from the BattLeDIM competition L-Town data.
Submitted 28 June, 2024;
originally announced June 2024.
-
Through the Thicket: A Study of Number-Oriented LLMs derived from Random Forest Models
Authors:
Michał Romaszewski,
Przemysław Sekuła,
Przemysław Głomb,
Michał Cholewa,
Katarzyna Kołodziej
Abstract:
Large Language Models (LLMs) have shown exceptional performance in text processing. Notably, LLMs can synthesize information from large datasets and explain their decisions similarly to human reasoning through a chain of thought (CoT). An emerging application of LLMs is the handling and interpreting of numerical data, where fine-tuning enhances their performance over basic inference methods. This paper proposes a novel approach to training LLMs using knowledge transfer from a random forest (RF) ensemble, leveraging its efficiency and accuracy. By converting RF decision paths into natural language statements, we generate outputs for LLM fine-tuning, enhancing the model's ability to classify and explain its decisions. Our method includes verifying these rules through established classification metrics, ensuring their correctness. We also examine the impact of preprocessing techniques on the representation of numerical data and their influence on classification accuracy and rule correctness.
Submitted 7 June, 2024;
originally announced June 2024.
-
Data structure > labels? Unsupervised heuristics for SVM hyperparameter estimation
Authors:
Michał Cholewa,
Michał Romaszewski,
Przemysław Głomb
Abstract:
Classification is one of the main areas of pattern recognition research, and within it, Support Vector Machine (SVM) is one of the most popular methods outside the field of deep learning -- and a de facto reference for many Machine Learning approaches. Its performance is determined by parameter selection, which is usually achieved by a time-consuming grid search cross-validation procedure (GSCV). That method, however, relies on the availability and quality of labelled examples and thus can be hindered when those are limited. To address that problem, there exist several unsupervised heuristics that take advantage of the characteristics of the dataset for selecting parameters instead of using class label information. While an order of magnitude faster, they are scarcely used under the assumption that their results are significantly worse than those of grid search. To challenge that assumption, we propose an improved heuristic for SVM parameter selection and test it against GSCV and state-of-the-art heuristics on over 30 standard classification datasets. The results show not only its advantage over state-of-the-art heuristics but also that it is statistically no worse than GSCV.
Submitted 22 February, 2024; v1 submitted 3 November, 2021;
originally announced November 2021.
-
Improving Autoencoder Training Performance for Hyperspectral Unmixing with Network Reinitialisation
Authors:
Kamil Książek,
Przemysław Głomb,
Michał Romaszewski,
Michał Cholewa,
Bartosz Grabowski,
Krisztián Búza
Abstract:
Neural networks, in particular autoencoders, are among the most promising solutions for unmixing hyperspectral data, i.e. reconstructing the spectra of observed substances (endmembers) and their relative mixing fractions (abundances), which is needed for effective hyperspectral analysis and classification. However, as we show in this paper, the training of autoencoders for unmixing is highly dependent on weight initialisation; some sets of weights lead to degenerate or low-performance solutions, introducing negative bias in the expected performance. In this work, we experimentally investigate autoencoder stability as well as network reinitialisation methods based on coefficients of neurons' dead activations. We demonstrate that the proposed techniques have a positive effect on autoencoder training in terms of reconstruction, abundance, and endmember errors.
Submitted 12 April, 2022; v1 submitted 28 September, 2021;
originally announced September 2021.
-
Hyperspectral classification of blood-like substances using machine learning methods combined with genetic algorithms in transductive and inductive scenarios
Authors:
Filip Pałka,
Wojciech Książek,
Paweł Pławiak,
Michał Romaszewski,
Kamil Książek
Abstract:
This study is focused on applying genetic algorithms (GA) to model and band selection in hyperspectral image classification. We use a forensic-inspired data set of seven hyperspectral images with blood and five visually similar substances to test GA-optimised classifiers in two scenarios: when the training and test data come from the same image and when they come from different images, which is a more challenging task due to significant spectra differences. In our experiments, we compare GA with classic model optimisation through grid search (GS). Our results show that GA-based model optimisation can reduce the number of bands and create an accurate classifier that outperforms the GS-based reference models, provided that during model optimisation it has access to examples similar to test data. We illustrate this with an experiment highlighting the importance of a validation set.
Submitted 4 November, 2020;
originally announced November 2020.
-
Comparison of surface thermal patterns of horses and donkeys in IRT images
Authors:
Małgorzata Domino,
Michał Romaszewski,
Tomasz Jasiński,
Małgorzata Maśko
Abstract:
Infrared thermography (IRT) is a valuable diagnostic tool in equine veterinary medicine; however, little is known about its application in donkeys. The aim was to find patterns in thermal images of donkeys and horses, and determine if these patterns share similarities. The study was carried out on 18 donkeys and 16 horses. All equids underwent thermal imaging with an infrared camera and measurement of skin thickness and hair coat length. On the class maps of each thermal image, 15 regions of interest (ROIs) were annotated and then combined into 10 groups of ROIs (GORs). The existence of statistically significant differences between surface temperatures in GORs was tested both `globally' for all animals of a given species and `locally' for each animal. Two special cases of animals that differ from the rest were also discussed. Our results indicated that the majority of thermal patterns are similar for both species; however, average surface temperatures in horses are higher than in donkeys. This may be related to differences in the skin and hair coat. We concluded that the patterns of both species are associated with GORs, rather than an individual ROI, with higher uniformity of donkeys' patterns.
Submitted 19 October, 2020;
originally announced October 2020.
-
A Dataset for Evaluating Blood Detection in Hyperspectral Images
Authors:
Michał Romaszewski,
Przemysław Głomb,
Arkadiusz Sochan,
Michał Cholewa
Abstract:
The sensitivity of imaging spectroscopy to haemoglobin derivatives makes it a promising tool for detecting blood. However, due to the complexity and high dimensionality of hyperspectral images, the development of hyperspectral blood detection algorithms is challenging. To facilitate their development, we present a new hyperspectral blood detection dataset. This dataset, published in accordance with the open access mandate, consists of multiple detection scenarios with varying levels of complexity. It allows testing the performance of Machine Learning methods in relation to different acquisition environments, types of background, age of blood and presence of other blood-like substances. We explored the dataset with blood detection experiments. We used a hyperspectral target detection algorithm based on the well-known Matched Filter detector. Our results and their discussion highlight the challenges of blood detection in hyperspectral data and form a reference for further work.
Submitted 31 March, 2021; v1 submitted 24 August, 2020;
originally announced August 2020.
-
Band selection with Higher Order Multivariate Cumulants for small target detection in hyperspectral images
Authors:
Przemysław Głomb,
Krzysztof Domino,
Michał Romaszewski,
Michał Cholewa
Abstract:
In the small target detection problem, a pattern to be located is an order of magnitude less numerous than other patterns present in the dataset. This applies both to the case of supervised detection, where the known template is expected to match in just a few areas, and to unsupervised anomaly detection, as anomalies are rare by definition. This problem is frequently related to imaging applications, i.e. detection within the scene acquired by a camera. To maximize available data about the scene, hyperspectral cameras are used; at each pixel, they record spectral data in hundreds of narrow bands.
The typical feature of hyperspectral imaging is that characteristic properties of target materials are visible in a small number of bands, where light of a certain wavelength interacts with characteristic molecules. A target-independent band selection method based on statistical principles is a versatile tool for solving this problem in different practical applications.
The combination of a regular background and a rare anomaly that stands out will produce a distortion in the joint distribution of hyperspectral pixels. Higher Order Cumulant Tensors are a natural `window' into this distribution, allowing us to measure its properties and suggest candidate bands for removal. While there have been attempts at producing band selection algorithms based on the 3rd cumulant tensor, i.e. the joint skewness, the literature lacks a systematic analysis of how the order of the cumulant tensor used affects the effectiveness of band selection in detection applications. In this paper, we present an analysis of a general algorithm for band selection based on higher order cumulants. We discuss its usability related to the observed breaking points in performance, depending both on method order and the desired number of bands. Finally, we perform experiments and evaluate these methods in a hyperspectral detection scenario.
Submitted 10 August, 2018;
originally announced August 2018.
-
Compression of animated 3D models using HO-SVD
Authors:
Michał Romaszewski,
Piotr Gawron,
Sebastian Opozda
Abstract:
This work presents an analysis of Higher Order Singular Value Decomposition (HO-SVD) applied to lossy compression of 3D mesh animations. We describe strategies for choosing the number of preserved spatial and temporal components after tensor decomposition. Compression error is measured using three metrics (MSE, Hausdorff, MSDM). Results are compared with a method based on Principal Component Analysis (PCA) and presented on a set of animations with typical mesh deformations.
Submitted 4 October, 2013;
originally announced October 2013.
-
Natural hand gestures for human identification in a Human-Computer Interface
Authors:
Michał Romaszewski,
Przemysław Głomb,
Piotr Gawron
Abstract:
The goal of this work is the identification of humans based on motion data in the form of natural hand gestures. In this paper, the identification problem is formulated as classification with classes corresponding to persons' identities, based on recorded signals of performed gestures. The identification performance is examined with a database of twenty-two natural hand gestures recorded with two types of hardware and three state-of-the-art classifiers: Linear Discriminant Analysis (LDA), Support Vector Machines (SVM) and k-Nearest Neighbour (k-NN). Results show that natural hand gestures allow for effective human classification.
Submitted 19 March, 2013; v1 submitted 23 September, 2011;
originally announced September 2011.