-
`Just One More Sensor is Enough' -- Iterative Water Leak Localization with Physical Simulation and a Small Number of Pressure Sensors
Authors:
Michał Cholewa,
Michał Romaszewski,
Przemysław Głomb,
Katarzyna Kołodziej,
Michał Gorawski,
Jakub Koral,
Wojciech Koral,
Andrzej Madej,
Kryspin Musioł
Abstract:
In this article, we propose an approach to leak localisation in a complex water delivery grid with the use of data from physical simulation (e.g. EPANET software). This task is usually achieved by a network of multiple water pressure sensors and analysis of the so-called sensitivity matrix of pressure differences between the network's simulated data and actual data of the network affected by the leak. However, most algorithms using this approach require a significant number of pressure sensors -- a condition that is not easy to fulfil in the case of many less equipped networks. Therefore, we answer the question of whether leak localisation is possible by utilising very few sensors but having the ability to relocate one of them. Our algorithm is based on physical simulations (EPANET software) and an iterative scheme for mobile sensor relocation. The experiments show that the proposed system can compensate for the low number of sensors by adjusting their positioning, giving a very good approximation of the leak's position both in simulated cases and in a real-life example taken from the BattLeDIM competition L-Town data.
Submitted 28 June, 2024;
originally announced June 2024.
-
Through the Thicket: A Study of Number-Oriented LLMs derived from Random Forest Models
Authors:
Michał Romaszewski,
Przemysław Sekuła,
Przemysław Głomb,
Michał Cholewa,
Katarzyna Kołodziej
Abstract:
Large Language Models (LLMs) have shown exceptional performance in text processing. Notably, LLMs can synthesize information from large datasets and explain their decisions similarly to human reasoning through a chain of thought (CoT). An emerging application of LLMs is the handling and interpretation of numerical data, where fine-tuning enhances their performance over basic inference methods. This paper proposes a novel approach to training LLMs using knowledge transfer from a random forest (RF) ensemble, leveraging its efficiency and accuracy. By converting RF decision paths into natural language statements, we generate outputs for LLM fine-tuning, enhancing the model's ability to classify and explain its decisions. Our method includes verifying these rules through established classification metrics, ensuring their correctness. We also examine the impact of preprocessing techniques on the representation of numerical data and their influence on classification accuracy and rule correctness.
Submitted 7 June, 2024;
originally announced June 2024.
-
Classification and Self-Supervised Regression of Arrhythmic ECG Signals Using Convolutional Neural Networks
Authors:
Bartosz Grabowski,
Przemysław Głomb,
Wojciech Masarczyk,
Paweł Pławiak,
Özal Yıldırım,
U Rajendra Acharya,
Ru-San Tan
Abstract:
Interpretation of electrocardiography (ECG) signals is required for diagnosing cardiac arrhythmia. Recently, machine learning techniques have been applied for automated computer-aided diagnosis. Machine learning tasks can be divided into regression and classification. Regression can be used to remove noise and artifacts as well as to resolve issues of missing data from low sampling frequency. The classification task concerns the prediction of output diagnostic classes according to expert-labeled input classes. In this work, we propose a deep neural network model capable of solving regression and classification tasks. Moreover, we combined the two approaches, using unlabeled and labeled data, to train the model. We tested the model on the MIT-BIH Arrhythmia database. Our method showed high effectiveness in detecting cardiac arrhythmia based on modified Lead II ECG records, and achieved high quality of ECG signal approximation. For the former, our method attained overall accuracy of 87.33% and balanced accuracy of 80.54%, on par with reference approaches. For the latter, application of self-supervised learning allowed for training without the need for expert labels. The regression model yielded satisfactory performance with fairly accurate prediction of QRS complexes. Transferring knowledge from regression to the classification task, our method attained higher overall accuracy of 87.78%.
Submitted 25 October, 2022;
originally announced October 2022.
-
Data structure > labels? Unsupervised heuristics for SVM hyperparameter estimation
Authors:
Michał Cholewa,
Michał Romaszewski,
Przemysław Głomb
Abstract:
Classification is one of the main areas of pattern recognition research, and within it, Support Vector Machine (SVM) is one of the most popular methods outside the field of deep learning -- and a de facto reference for many Machine Learning approaches. Its performance is determined by parameter selection, which is usually achieved by a time-consuming grid search cross-validation procedure (GSCV). That method, however, relies on the availability and quality of labelled examples and thus can be hindered when those are limited. To address that problem, there exist several unsupervised heuristics that take advantage of the characteristics of the dataset for selecting parameters instead of using class label information. While an order of magnitude faster, they are scarcely used under the assumption that their results are significantly worse than those of grid search. To challenge that assumption, we have proposed improved heuristics for SVM parameter selection and tested them against GSCV and state-of-the-art heuristics on over 30 standard classification datasets. The results show not only their advantage over state-of-the-art heuristics but also that they are statistically no worse than GSCV.
Submitted 22 February, 2024; v1 submitted 3 November, 2021;
originally announced November 2021.
-
Improving Autoencoder Training Performance for Hyperspectral Unmixing with Network Reinitialisation
Authors:
Kamil Książek,
Przemysław Głomb,
Michał Romaszewski,
Michał Cholewa,
Bartosz Grabowski,
Krisztián Búza
Abstract:
Neural networks, in particular autoencoders, are one of the most promising solutions for unmixing hyperspectral data, i.e. reconstructing the spectra of observed substances (endmembers) and their relative mixing fractions (abundances), which is needed for effective hyperspectral analysis and classification. However, as we show in this paper, the training of autoencoders for unmixing is highly dependent on weight initialisation; some sets of weights lead to degenerate or low-performance solutions, introducing negative bias in the expected performance. In this work, we experimentally investigate autoencoder stability as well as network reinitialisation methods based on coefficients of neurons' dead activations. We demonstrate that the proposed techniques have a positive effect on autoencoder training in terms of reconstruction, abundance and endmember errors.
Submitted 12 April, 2022; v1 submitted 28 September, 2021;
originally announced September 2021.
-
A Dataset for Evaluating Blood Detection in Hyperspectral Images
Authors:
Michał Romaszewski,
Przemysław Głomb,
Arkadiusz Sochan,
Michał Cholewa
Abstract:
The sensitivity of imaging spectroscopy to haemoglobin derivatives makes it a promising tool for detecting blood. However, due to the complexity and high dimensionality of hyperspectral images, the development of hyperspectral blood detection algorithms is challenging. To facilitate their development, we present a new hyperspectral blood detection dataset. This dataset, published in accordance with the open access mandate, consists of multiple detection scenarios with varying levels of complexity. It allows testing the performance of Machine Learning methods in relation to different acquisition environments, types of background, age of blood and presence of other blood-like substances. We explored the dataset with blood detection experiments. We used a hyperspectral target detection algorithm based on the well-known Matched Filter detector. Our results and their discussion highlight the challenges of blood detection in hyperspectral data and form a reference for further works.
Submitted 31 March, 2021; v1 submitted 24 August, 2020;
originally announced August 2020.
-
Effective training of deep convolutional neural networks for hyperspectral image classification through artificial labeling
Authors:
Wojciech Masarczyk,
Przemysław Głomb,
Bartosz Grabowski,
Mateusz Ostaszewski
Abstract:
Hyperspectral imaging is a rich source of data, allowing for a multitude of effective applications. However, such imaging remains challenging because of large data dimension and, typically, a small pool of available training examples. While deep learning approaches have been shown to be successful in providing effective classification solutions, especially for high dimensional problems, unfortunately they work best with a lot of labelled examples available. To alleviate the need for many labelled examples for a particular dataset, the transfer learning approach can be used: first the network is pre-trained on some dataset with a large amount of training labels available, then the actual dataset is used to fine-tune the network. This strategy is not straightforward to apply with hyperspectral images, as it is often the case that only one particular image of some type or characteristic is available. In this paper, we propose and investigate a simple and effective strategy of transfer learning that uses an unsupervised pre-training step without label information. This approach can be applied to many of the hyperspectral classification problems. Performed experiments show that it is very effective at improving the classification accuracy without being restricted to a particular image type or neural network architecture. The experiments were carried out on several deep neural network architectures and various sizes of labeled training sets. The greatest improvement in overall accuracy on the Indian Pines and Pavia University datasets is over 21 and 13 percentage points, respectively. An additional advantage of the proposed approach is the unsupervised nature of the pre-training step, which can be done immediately after image acquisition, without the need for potentially costly expert time.
Submitted 22 October, 2020; v1 submitted 12 September, 2019;
originally announced September 2019.
-
Band selection with Higher Order Multivariate Cumulants for small target detection in hyperspectral images
Authors:
Przemysław Głomb,
Krzysztof Domino,
Michał Romaszewski,
Michał Cholewa
Abstract:
In the small target detection problem, a pattern to be located is an order of magnitude less numerous than other patterns present in the dataset. This applies both to the case of supervised detection, where the known template is expected to match in just a few areas, and to unsupervised anomaly detection, as anomalies are rare by definition. This problem is frequently related to imaging applications, i.e. detection within the scene acquired by a camera. To maximize available data about the scene, hyperspectral cameras are used; at each pixel, they record spectral data in hundreds of narrow bands.
The typical feature of hyperspectral imaging is that characteristic properties of target materials are visible in a small number of bands, where light of a certain wavelength interacts with characteristic molecules. A target-independent band selection method based on statistical principles is a versatile tool for solving this problem in different practical applications.
The combination of a regular background and a rare, standing-out anomaly will produce a distortion in the joint distribution of hyperspectral pixels. Higher Order Cumulant Tensors are a natural `window' into this distribution, making it possible to measure its properties and suggest candidate bands for removal. While there have been attempts at producing band selection algorithms based on the 3rd cumulant tensor, i.e. the joint skewness, the literature lacks a systematic analysis of how the order of the cumulant tensor used affects the effectiveness of band selection in detection applications. In this paper we present an analysis of a general algorithm for band selection based on higher order cumulants. We discuss its usability related to the observed breaking points in performance, depending both on method order and the desired number of bands. Finally we perform experiments and evaluate these methods in a hyperspectral detection scenario.
Submitted 10 August, 2018;
originally announced August 2018.
-
Quantum Hidden Markov Models based on Transition Operation Matrices
Authors:
Michał Cholewa,
Piotr Gawron,
Przemysław Głomb,
Dariusz Kurzyk
Abstract:
In this work, we extend the idea of Quantum Markov chains [S. Gudder. Quantum Markov chains. J. Math. Phys., 49(7), 2008] in order to propose Quantum Hidden Markov Models (QHMMs). For that, we use the notions of Transition Operation Matrices (TOM) and Vector States, which are an extension of classical stochastic matrices and probability distributions. Our main result is the Mealy QHMM formulation and proofs of algorithms needed for application of this model: Forward for the general case and Viterbi for a restricted class of QHMMs.
Submitted 10 February, 2017; v1 submitted 30 March, 2015;
originally announced March 2015.
-
Deciding of HMM parameters based on number of critical points for gesture recognition from motion capture data
Authors:
Michał Cholewa,
Przemysław Głomb
Abstract:
This paper presents a method of choosing the number of states of an HMM based on the number of critical points of the motion capture data. The choice of Hidden Markov Model (HMM) parameters is crucial for the recognizer's performance, as it is the first step of the training and cannot be corrected automatically within the HMM. In this article we define a predictor of the number of states based on the number of critical points of the sequence and test its effectiveness against sample data.
Submitted 28 October, 2011;
originally announced October 2011.
-
Natural hand gestures for human identification in a Human-Computer Interface
Authors:
Michał Romaszewski,
Przemysław Głomb,
Piotr Gawron
Abstract:
The goal of this work is the identification of humans based on motion data in the form of natural hand gestures. In this paper, the identification problem is formulated as classification with classes corresponding to persons' identities, based on recorded signals of performed gestures. The identification performance is examined with a database of twenty-two natural hand gestures recorded with two types of hardware and three state-of-the-art classifiers: Linear Discriminant Analysis (LDA), Support Vector Machines (SVM) and k-Nearest Neighbour (k-NN). Results show that natural hand gestures allow for an effective human classification.
Submitted 19 March, 2013; v1 submitted 23 September, 2011;
originally announced September 2011.
-
Eigengestures for natural human computer interface
Authors:
Piotr Gawron,
Przemysław Głomb,
Jarosław Adam Miszczak,
Zbigniew Puchała
Abstract:
We present the application of Principal Component Analysis for data acquired during the design of a natural gesture interface. We investigate the concept of an eigengesture for motion capture hand gesture data and present the visualisation of principal components obtained in the course of conducted experiments. We also show the influence of dimensionality reduction on reconstructed gesture data quality.
Submitted 6 May, 2011;
originally announced May 2011.