Search | arXiv e-print repository

Convolutional and Deep Learning based techniques for Time Series Ordinal Classification

Authors: Rafael Ayllón-Gavilán, David Guijo-Rubio, Pedro Antonio Gutiérrez, Anthony Bagnall, César Hervás-Martínez

Abstract: Time Series Classification (TSC) covers the supervised learning problem where input data is provided in the form of series of values observed through repeated measurements over time, and whose objective is to predict the category to which they belong. When the class values are ordinal, classifiers that take this into account can perform better than nominal classifiers. Time Series Ordinal Classification (TSOC) is the field covering this gap, as yet unexplored in the literature. There is a wide range of time series problems showing an ordered label structure, and TSC techniques that ignore the order relationship discard useful information. Hence, this paper presents a first benchmarking of TSOC methodologies, exploiting the ordering of the target labels to boost the performance of current TSC state-of-the-art. Both convolutional- and deep learning-based methodologies (among the best performing alternatives for nominal TSC) are adapted for TSOC. For the experiments, a selection of 18 ordinal problems from two well-known archives has been made. In this way, this paper contributes to the establishment of the state-of-the-art in TSOC. The results obtained by ordinal versions are found to be significantly better than current nominal TSC techniques in terms of ordinal performance metrics, outlining the importance of considering the ordering of the labels when dealing with this kind of problem.

Submitted 16 June, 2023; originally announced June 2023.

Comments: 13 pages, 9 figures, 3 tables

arXiv:2305.10059 [pdf]

A hybrid feature learning approach based on convolutional kernels for ATM fault prediction using event-log data

Authors: Víctor Manuel Vargas, Riccardo Rosati, César Hervás-Martínez, Adriano Mancini, Luca Romeo, Pedro Antonio Gutiérrez

Abstract: Predictive Maintenance (PdM) methods aim to facilitate the scheduling of maintenance work before equipment failure. In this context, detecting early faults in automated teller machines (ATMs) has become increasingly important since these machines are susceptible to various types of unpredictable failures. ATMs track execution status by generating massive event-log data that collect system messages unrelated to the failure event. Predicting machine failure based on event logs poses additional challenges, mainly in extracting features that might represent sequences of events indicating impending failures. Accordingly, feature learning approaches are currently being used in PdM, where informative features are learned automatically from minimally processed sensor data. However, a gap remains to be seen on how these approaches can be exploited for deriving relevant features from event-log-based data. To fill this gap, we present a predictive model based on a convolutional kernel (MiniROCKET and HYDRA) to extract features from the original event-log data and a linear classifier to classify the sample based on the learned features. The proposed methodology is applied to a significant real-world collected dataset. Experimental results demonstrated how one of the proposed convolutional kernels (i.e. HYDRA) exhibited the best classification performance (accuracy of 0.759 and AUC of 0.693). In addition, statistical analysis revealed that the HYDRA and MiniROCKET models significantly outperform one of the established state-of-the-art approaches in time series classification (InceptionTime), and three non-temporal ML methods from the literature. The predictive model was integrated into a container-based decision support system to support operators in the timely maintenance of ATMs.

Submitted 17 May, 2023; originally announced May 2023.

ACM Class: I.2.1; I.5.4

arXiv:2305.09288 [pdf, other]

A Dictionary-based approach to Time Series Ordinal Classification

Authors: Rafael Ayllón-Gavilán, David Guijo-Rubio, Pedro Antonio Gutiérrez, César Hervás-Martinez

Abstract: Time Series Classification (TSC) is an extensively researched field from which a broad range of real-world problems can be addressed obtaining excellent results. One family of well-performing approaches is the so-called dictionary-based techniques. The Temporal Dictionary Ensemble (TDE) is the current state-of-the-art dictionary-based TSC approach. In many TSC problems we find a natural ordering in the labels associated with the time series. This characteristic is referred to as ordinality, and can be exploited to improve the method's performance. The area dealing with ordinal time series is the Time Series Ordinal Classification (TSOC) field, which is yet unexplored. In this work, we present an ordinal adaptation of the TDE algorithm, known as ordinal TDE (O-TDE). For this, a comprehensive comparison using a set of 18 TSOC problems is performed. Experiments conducted show the improvement achieved by the ordinal dictionary-based approach in comparison to four other existing nominal dictionary-based techniques.

Submitted 16 May, 2023; originally announced May 2023.

arXiv:1903.10012 [pdf, other]

A mixture of experts model for predicting persistent weather patterns

Authors: Maria Perez-Ortiz, Pedro A. Gutierrez, Peter Tino, Carlos Casanova-Mateo, Sancho Salcedo-Sanz

Abstract: Weather and atmospheric patterns are often persistent. The simplest weather forecasting method is the so-called persistence model, which assumes that the future state of a system will be similar (or equal) to the present state. Machine learning (ML) models are widely used in different weather forecasting applications, but they need to be compared to the persistence model to analyse whether they provide a competitive solution to the problem at hand. In this paper, we devise a new model for predicting low-visibility in airports using the concepts of mixture of experts. Visibility level is coded as two different ordered categorical variables: cloud height and runway visual height. The underlying system in this application is stagnant approximately in 90% of the cases, and standard ML models fail to improve on the performance of the persistence model. Because of this, instead of trying to simply beat the persistence model using ML, we use this persistence as a baseline and learn an ordinal neural network model that refines its results by focusing on learning weather fluctuations. The results show that the proposal outperforms persistence and other ordinal autoregressive models, especially for longer time horizon predictions and for the runway visual height variable.

Submitted 24 March, 2019; originally announced March 2019.

Comments: Published in IEEE International Joint Conference on Neural Networks (IJCNN) 2018

arXiv:1812.10660 [pdf, other]

We all do better when we work together?

Authors: Thomas Fossati, Pedro A. Aranda Gutiérrez, Diego López

Abstract: This paper evaluates the impact of an RD signal on traffic crossing the mobile network through a set of experiments in a simulated 5G network scenario built with ns-3. In our experiments, we compare a scenario with no RD treatment (i.e. a single best-effort EPS bearer) with scenarios with RD treatment (i.e. separate EPS bearers to carry RD-partitioned traffic) with honest and cheating users. Our objective is twofold. On the one hand, we want to explore the suitability of RD as a way to harmonise the LTE and Internet QoS models and, on the other hand, we aim at providing data to inform protocol design as well as operations-related discussion on the notion of exposing a 1-bit, clear-text RD signal from endpoints to the network path when the use of end-to-end encrypted protocols would otherwise prevent flow classification based on DPI. Specifically, the question we want to help answer is whether the gain in terms of end users' QoE and radio spectrum efficiency is enough to justify making room for such a signal. All the experiments are fully documented and the tooling used is made available as open-source to ensure their reproducibility.

Submitted 27 December, 2018; originally announced December 2018.

arXiv:1812.04834 [pdf]

Analysis and Consideration on Management of Encrypted Traffic

Authors: Pedro A. Aranda Gutiérrez, Diego López, Thomas Fossati

Abstract: Encrypted traffic poses new and unique challenges for network operators because information that is useful or necessary for management purposes is not accessible anymore. This paper examines proposed approaches to provide end-to-end encryption as well as ways to provide access to the information needed for network management. The two main approaches we consider are (1) extending and adapting a widely deployed protocol such as TLS, so that information necessary for network management can be obtained; and (2) introducing a new protocol layer, such as PLUS, that contains the necessary information outside of the encrypted payload. In this paper we discuss different aspects of these approaches in order to give recommendations for future work, and we suggest raising awareness by establishing venues where discussions between interested parties can take place.

Submitted 16 December, 2018; v1 submitted 12 December, 2018; originally announced December 2018.

Comments: 11 pages

arXiv:1811.07155 [pdf, ps, other]

Monotonic classification: an overview on algorithms, performance measures and data sets

Authors: José-Ramón Cano, Pedro Antonio Gutiérrez, Bartosz Krawczyk, Michał Woźniak, Salvador García

Abstract: Currently, knowledge discovery in databases is an essential step to identify valid, novel and useful patterns for decision making. There are many real-world scenarios, such as bankruptcy prediction, option pricing or medical diagnosis, where the classification models to be learned need to fulfil restrictions of monotonicity (i.e. the target class label should not decrease when input attribute values increase). For instance, it is rational to assume that a higher debt ratio of a company should never result in a lower level of bankruptcy risk. Consequently, there is a growing interest from the data mining research community concerning monotonic predictive models. This paper aims to present an overview of the literature in the field, analyzing existing techniques and proposing a taxonomy of the algorithms based on the type of model generated. For each method, we review the quality metrics considered in the evaluation and the different data sets and monotonic problems used in the analysis. In this way, this paper serves as an overview of the research about monotonic classification in specialized literature and can be used as a functional guide of the field.

Submitted 17 November, 2018; originally announced November 2018.

arXiv:1810.11624 [pdf, other]

Time series clustering based on the characterisation of segment typologies

Authors: David Guijo-Rubio, Antonio Manuel Durán-Rosal, Pedro Antonio Gutiérrez, Alicia Troncoso, César Hervás-Martínez

Abstract: Time series clustering is the process of grouping time series with respect to their similarity or characteristics. Previous approaches usually combine a specific distance measure for time series and a standard clustering method. However, these approaches do not take the similarity of the different subsequences of each time series into account, which can be used to better compare the time series objects of the dataset. In this paper, we propose a novel technique of time series clustering based on two clustering stages. In a first step, a least squares polynomial segmentation procedure is applied to each time series, which is based on a growing window technique that returns different-length segments. Then, all the segments are projected into the same dimensional space, based on the coefficients of the model that approximates the segment and a set of statistical features. After mapping, a first hierarchical clustering phase is applied to all mapped segments, returning groups of segments for each time series. These clusters are used to represent all time series in the same dimensional space, after defining another specific mapping process. In a second and final clustering stage, all the time series objects are grouped. We consider internal clustering quality to automatically adjust the main parameter of the algorithm, which is an error threshold for the segmentation. The results obtained on 84 datasets from the UCR Time Series Classification Archive have been compared against two state-of-the-art methods, showing that the performance of this methodology is very promising.

Submitted 27 October, 2018; originally announced October 2018.

Comments: 13 pages, 7 figures, 4 tables, 57 refs

arXiv:1810.09733 [pdf, ps, other]

OCAPIS: R package for Ordinal Classification And Preprocessing In Scala

Authors: M. Cristina Heredia-Gómez, Salvador García, Pedro Antonio Gutiérrez, Francisco Herrera

Abstract: Ordinal data are those where a natural order exists between the labels. The classification and pre-processing of this type of data is attracting more and more interest in the area of machine learning, due to its presence in many common problems. Traditionally, ordinal classification problems have been approached as nominal problems. However, that implies not taking into account their natural order constraints. In this paper, an innovative R package named ocapis (Ordinal Classification and Preprocessing In Scala) is introduced. Implemented mainly in Scala and available through GitHub, this library includes four learners and two pre-processing algorithms for ordinal and monotonic data. The main features of the package and examples of installation and use are explained throughout this manuscript.

Submitted 17 March, 2019; v1 submitted 23 October, 2018; originally announced October 2018.

Comments: 16 pages

arXiv:1707.09558 [pdf, other]

Lessons learnt from the NetIDE project: Taking SDN programming to the next level

Authors: Pedro A. Aranda Gutierrez, Roberto Doriguzzi-Corin, Elisa Rojas

Abstract: SDN promises to overcome vendor lock-in by enabling a multi-vendor hardware and software ecosystem in operator networks. However, we observe that this is currently not happening. A framework allowing to compose SDN applications combining different frameworks can help revert the trend. In this paper, we analyze the challenges in the current SDN landscape and then present the multi-controller SDN framework developed by the NetIDE project. Our architecture supports different SDN southbound protocols and we have implemented a proof of concept using the OpenFlow protocol, which has given us a greater insight on its shortcomings.

Submitted 29 July, 2017; originally announced July 2017.

Comments: 6 pages, 4 figures

ACM Class: C.2; C.2.1

Showing 1–10 of 10 results for author: Gutierrez, P A