
Task-Synchronized Recurrent Neural Networks

Authors: Mantas Lukoševičius, Arnas Uselis

Abstract: Data are often sampled irregularly in time. Dealing with this using Recurrent Neural Networks (RNNs) traditionally involved ignoring the fact, feeding the time differences as additional inputs, or resampling the data. All these methods have their shortcomings. We propose an elegant straightforward alternative approach where instead the RNN is in effect resampled in time to match the time of the data or the task at hand. We use Echo State Network (ESN) and Gated Recurrent Unit (GRU) as the basis for our solution. Such RNNs can be seen as discretizations of continuous-time dynamical systems, which gives a solid theoretical ground to our approach. Our Task-Synchronized ESN (TSESN) and GRU (TSGRU) models allow for a direct model time setting and require no additional training, parameter tuning, or computation (solving differential equations or interpolating data) compared to their regular counterparts, thus retaining their original efficiency. We confirm empirically that our models can effectively compensate for the time-non-uniformity of the data and demonstrate that they compare favorably to data resampling, classical RNN methods, and alternative RNN models proposed to deal with time irregularities on several real-world nonuniform-time datasets. We open-source the code at https://github.com/oshapio/task-synchronized-RNNs.

Submitted 2 July, 2024; v1 submitted 11 April, 2022; originally announced April 2022.

Comments: The 1st version was written in May 2019 and double-blind reviewed for a prominent conference. A major update. We changed the name of the article and methods to an arguably more precise one, and because a very similar title has been published in the meantime. We've rewritten much of the text, connected to the current literature, redone some experiments, figures, discussion, published source code

MSC Class: 68T07; 68T05; 37M10

ACM Class: I.2.6; G.1.2

arXiv:2006.11282

doi 10.1007/s12559-021-09849-2

Efficient implementations of echo state network cross-validation

Authors: Mantas Lukoševičius, Arnas Uselis

Abstract: Background/introduction: Cross-Validation (CV) is still uncommon in time series modeling. Echo State Networks (ESNs), as a prime example of Reservoir Computing (RC) models, are known for their fast and precise one-shot learning, that often benefit from good hyper-parameter tuning. This makes them ideal to change the status quo. Methods: We discuss CV of time series for predicting a concrete time interval of interest, suggest several schemes for cross-validating ESNs and introduce an efficient algorithm for implementing them. This algorithm is presented as two levels of optimizations of doing $k$-fold CV. Training an RC model typically consists of two stages: (i) running the reservoir with the data and (ii) computing the optimal readouts. The first level of our optimization addresses the most computationally expensive part (i) and makes it remain constant irrespective of $k$. It dramatically reduces reservoir computations in any type of RC system and is enough if $k$ is small. The second level of optimization also makes the (ii) part remain constant irrespective of large $k$, as long as the dimension of the output is low. We discuss when the proposed validation schemes for ESNs could be beneficial, three options for producing the final model and empirically investigate them on six different real-world datasets, as well as do empirical computation time experiments. We provide the code in an online repository. Results: Proposed CV schemes give better and more stable test performance in all the six different real-world datasets, three task types. Empirical run times confirm our complexity analysis. Conclusions: In most situations $k$-fold CV of ESNs and many other RC models can be done for virtually the same time and space complexity as a simple single-split validation. This enables CV to become a standard practice in RC.

Submitted 3 December, 2020; v1 submitted 19 June, 2020; originally announced June 2020.

Comments: arXiv admin note: substantial text overlap with arXiv:1908.08450

MSC Class: 68T05 (Primary) 37M10; 15A06 (Secondary)

ACM Class: I.2.6

Journal ref: Cognitive Computation, 2021

arXiv:2005.05930

doi 10.3390/en13133440

Localized convolutional neural networks for geospatial wind forecasting

Authors: Arnas Uselis, Mantas Lukoševičius, Lukas Stasytis

Abstract: Convolutional Neural Networks (CNN) possess many positive qualities when it comes to spatial raster data. Translation invariance enables CNNs to detect features regardless of their position in the scene. However, in some domains, like geospatial, not all locations are exactly equal. In this work, we propose localized convolutional neural networks that enable convolutional architectures to learn local features in addition to the global ones. We investigate their instantiations in the form of learnable inputs, local weights, and a more general form. They can be added to any convolutional layers, easily end-to-end trained, introduce minimal additional complexity, and let CNNs retain most of their benefits to the extent that they are needed. In this work we address spatio-temporal prediction: test the effectiveness of our methods on a synthetic benchmark dataset and tackle three real-world wind prediction datasets. For one of them, we propose a method to spatially order the unordered data. We compare the recent state-of-the-art spatio-temporal prediction models on the same data. Models that use convolutional layers can be and are extended with our localizations. In all these cases our extensions improve the results, and thus often the state-of-the-art. We share all the code at a public repository.

Submitted 10 July, 2020; v1 submitted 12 May, 2020; originally announced May 2020.

MSC Class: 68T05

ACM Class: I.2.6

Journal ref: Energies, 13 (13), pp. 3440, 2020

arXiv:1908.08450

doi 10.1007/978-3-030-30493-5_12

Efficient Cross-Validation of Echo State Networks

Authors: Mantas Lukoševičius, Arnas Uselis

Abstract: Echo State Networks (ESNs) are known for their fast and precise one-shot learning of time series. But they often need good hyper-parameter tuning for best performance. For this good validation is key, but usually, a single validation split is used. In this rather practical contribution we suggest several schemes for cross-validating ESNs and introduce an efficient algorithm for implementing them. The component that dominates the time complexity of the already quite fast ESN training remains constant (does not scale up with $k$) in our proposed method of doing $k$-fold cross-validation. The component that does scale linearly with $k$ starts dominating only in some not very common situations. Thus in many situations $k$-fold cross-validation of ESNs can be done for virtually the same time complexity as a simple single split validation. Space complexity can also remain the same. We also discuss when the proposed validation schemes for ESNs could be beneficial and empirically investigate them on several different real-world datasets.

Submitted 22 August, 2019; originally announced August 2019.

Comments: Accepted in ICANN'19 Workshop on Reservoir Computing

MSC Class: 68T05 (Primary) 37M10; 15A06 (Secondary)

ACM Class: I.2.6

Journal ref: Artificial Neural Networks and Machine Learning - ICANN 2019: Workshop and Special Sessions. ICANN 2019., pp. 121-133

Showing 1–4 of 4 results for author: Uselis, A