-
Impact of white Gaussian internal noise on analog echo-state neural networks
Authors:
Nadezhda Semenova
Abstract:
In recent years, more and more works have appeared devoted to the analog (hardware) implementation of artificial neural networks, in which neurons and the connections between them are based not on computer calculations, but on physical principles. Such networks offer improved energy efficiency and, in some cases, scalability, but may be susceptible to internal noise. This paper studies the influence of noise on the functioning of recurrent networks using the example of trained echo state networks (ESNs). The most common reservoir connection matrices were chosen as various topologies of ESNs: random uniform and band matrices with different connectivity. White Gaussian noise was chosen as the influence; depending on how it is introduced, it is additive or multiplicative, as well as correlated or uncorrelated. In the paper, we show that the propagation of noise in the reservoir is mainly controlled by the statistical properties of the output connection matrix, namely the mean and the mean square. Depending on these values, more correlated or uncorrelated noise accumulates in the network. We also show that there are conditions under which even noise with an intensity of $10^{-20}$ is already enough to completely lose the useful signal. Finally, we show which types of noise are most critical for networks with different activation functions (hyperbolic tangent, sigmoid and linear) and for self-closed networks.
Submitted 13 May, 2024;
originally announced May 2024.
-
From Variability to Stability: Advancing RecSys Benchmarking Practices
Authors:
Valeriy Shevchenko,
Nikita Belousov,
Alexey Vasilev,
Vladimir Zholobov,
Artyom Sosedka,
Natalia Semenova,
Anna Volodkevich,
Andrey Savchenko,
Alexey Zaytsev
Abstract:
In the rapidly evolving domain of Recommender Systems (RecSys), new algorithms frequently claim state-of-the-art performance based on evaluations over a limited set of arbitrarily selected datasets. However, this approach may fail to holistically reflect their effectiveness due to the significant impact of dataset characteristics on algorithm performance. Addressing this deficiency, this paper introduces a novel benchmarking methodology to facilitate a fair and robust comparison of RecSys algorithms, thereby advancing evaluation practices. By utilizing a diverse set of $30$ open datasets, including two introduced in this work, and evaluating $11$ collaborative filtering algorithms across $9$ metrics, we critically examine the influence of dataset characteristics on algorithm performance. We further investigate the feasibility of aggregating outcomes from multiple datasets into a unified ranking. Through rigorous experimental analysis, we validate the reliability of our methodology under the variability of datasets, offering a benchmarking strategy that balances quality and computational demands. This methodology enables a fair yet effective means of evaluating RecSys algorithms, providing valuable guidance for future research endeavors.
Submitted 15 February, 2024;
originally announced February 2024.
-
GCT-TTE: Graph Convolutional Transformer for Travel Time Estimation
Authors:
Vladimir Mashurov,
Vaagn Chopurian,
Vadim Porvatov,
Arseny Ivanov,
Natalia Semenova
Abstract:
This paper introduces a new transformer-based model for the problem of travel time estimation. The key feature of the proposed GCT-TTE architecture is the utilization of different data modalities capturing different properties of an input path. Along with the extensive study regarding the model configuration, we implemented and evaluated a sufficient number of actual baselines for path-aware and path-blind settings. The conducted computational experiments have confirmed the viability of our pipeline, which outperformed state-of-the-art models on both considered datasets. Additionally, GCT-TTE was deployed as a web service accessible for further experiments with user-defined routes.
Submitted 15 October, 2023; v1 submitted 7 June, 2023;
originally announced June 2023.
-
Revising deep learning methods in parking lot occupancy detection
Authors:
Anastasia Martynova,
Mikhail Kuznetsov,
Vadim Porvatov,
Vladislav Tishin,
Andrey Kuznetsov,
Natalia Semenova,
Ksenia Kuznetsova
Abstract:
Parking guidance systems have recently become a popular trend as a part of the smart cities' paradigm of development. The crucial part of such systems is the algorithm allowing drivers to search for available parking lots across regions of interest. The classic approach to this task is based on the application of neural network classifiers to camera records. However, existing systems demonstrate a lack of generalization ability and appropriate testing regarding specific visual conditions. In this study, we extensively evaluate state-of-the-art parking lot occupancy detection algorithms, compare their prediction quality with the recently emerged vision transformers, and propose a new pipeline based on EfficientNet architecture. Performed computational experiments have demonstrated the performance increase in the case of our model, which was evaluated on 5 different datasets.
Submitted 12 February, 2024; v1 submitted 7 June, 2023;
originally announced June 2023.
-
Noise impact on recurrent neural network with linear activation function
Authors:
V. M. Moskvitin,
N. Semenova
Abstract:
In recent years, more and more researchers in the field of neural networks have become interested in creating hardware implementations where neurons and the connections between them are realized physically. The physical implementation of an ANN fundamentally changes the features of noise influence. In the case of hardware ANNs, there are many internal sources of noise with different properties. The purpose of this paper is to study the peculiarities of internal noise propagation in recurrent ANNs using the example of an echo state network (ESN), to reveal ways to suppress such noise, and to justify the stability of networks to some types of noise.
In this paper we analyse an ESN in the presence of uncorrelated additive and multiplicative white Gaussian noise. Here we consider the case when artificial neurons have a linear activation function with different slope coefficients. Starting from studying only one noisy neuron, we complicate the problem by considering how the input signal and the memory property affect the accumulation of noise in the ESN. In addition, we consider the influence of the main types of coupling matrices on the accumulation of noise. As such matrices, we take a uniform matrix and diagonal-like matrices with different values of a so-called "blurring" coefficient.
We have found that the general form of the variance and signal-to-noise ratio of the ESN output signal is similar to that of a single neuron. Noise accumulates less in an ESN with a diagonal reservoir connection matrix with a large "blurring" coefficient. This is especially true for uncorrelated multiplicative noise.
Submitted 23 March, 2023;
originally announced March 2023.
-
Symbiosis of an artificial neural network and models of biological neurons: training and testing
Authors:
Tatyana Bogatenko,
Konstantin Sergeev,
Andrei Slepnev,
Jürgen Kurths,
Nadezhda Semenova
Abstract:
In this paper we show the possibility of creating and identifying the features of an artificial neural network (ANN) which consists of mathematical models of biological neurons. The FitzHugh-Nagumo (FHN) system is used as an example of a model demonstrating simplified neuron activity. First, in order to reveal how biological neurons can be embedded within an ANN, we train the ANN with nonlinear neurons to solve a basic image recognition problem with the MNIST database; next, we describe how FHN systems can be introduced into this trained ANN. Finally, we show that an ANN with FHN systems inside can be successfully trained and that its accuracy increases. This work opens up great opportunities in the direction of analog neural networks, in which artificial neurons can be replaced by biological ones.
Submitted 3 February, 2023;
originally announced February 2023.
-
Simple method for detecting sleep episodes in rats ECoG using machine learning
Authors:
Konstantin Sergeev,
Anastasiya Runnova,
Maxim Zhuravlev,
Evgenia Sitnikova,
Elizaveta Rutskova,
Kirill Smirnov,
Andrei Slepnev,
Nadezhda Semenova
Abstract:
In this paper we propose a new method for the automatic recognition of the state of behavioral sleep (BS) and waking state (WS) in freely moving rats using their electrocorticographic (ECoG) data. Three-channel ECoG signals were recorded from frontal left, frontal right and occipital right cortical areas. We employed a simple artificial neural network (ANN), in which the mean values and standard deviations of ECoG signals from two or three channels were used as inputs for the ANN. Results of wavelet-based recognition of BS/WS in the same data were used to train the ANN and evaluate the correctness of our classifier. We tested different combinations of ECoG channels for detecting BS/WS.
Our results showed that the accuracy of ANN classification did not depend on the ECoG channel. For any ECoG channel, networks were trained on one rat and applied to another rat with an accuracy of at least 80%. It is important that we used a very simple network topology to achieve a relatively high accuracy of classification. Our classifier was based on a simple linear combination of input signals with some weights, and these weights could be replaced by the averaged weights of all trained ANNs without decreases in classification accuracy. In all, we introduce a new sleep recognition method that does not require additional network training. It is enough to know the coefficients and the equations suggested in this paper. The proposed method showed very fast performance and simple computations; therefore it could be used in real-time experiments. It might be in high demand in preclinical studies in rodents that require vigilance control or monitoring of sleep-wake patterns.
Submitted 2 February, 2023;
originally announced February 2023.
-
5q032e@SMM4H'22: Transformer-based classification of premise in tweets related to COVID-19
Authors:
Vadim Porvatov,
Natalia Semenova
Abstract:
Automation of social network data assessment is one of the classic challenges of natural language processing. During the COVID-19 pandemic, mining people's stances from public messages has become crucial for understanding attitudes towards health orders. In this paper, the authors propose a predictive model based on the transformer architecture to classify the presence of a premise in Twitter texts. This work was completed as part of the Social Media Mining for Health (SMM4H) Workshop 2022. We explored modern transformer-based classifiers in order to construct a pipeline that efficiently captures tweet semantics. Our experiments on a Twitter dataset showed that RoBERTa is superior to the other transformer models on the premise prediction task. The model achieved competitive performance, with a ROC AUC value of 0.807 and an F1 score of 0.7648.
Submitted 15 October, 2023; v1 submitted 8 September, 2022;
originally announced September 2022.
-
Logistics, Graphs, and Transformers: Towards improving Travel Time Estimation
Authors:
Natalia Semenova,
Vadim Porvatov,
Vladislav Tishin,
Artyom Sosedka,
Vladislav Zamkovoy
Abstract:
The problem of travel time estimation is widely considered as the fundamental challenge of modern logistics. The complex nature of interconnections between spatial aspects of roads and temporal dynamics of ground transport still preserves an area to experiment with. However, the total volume of currently accumulated data encourages the construction of the learning models which have the perspective to significantly outperform earlier solutions. In order to address the problems of travel time estimation, we propose a new method based on transformer architecture - TransTTE.
Submitted 12 July, 2022;
originally announced July 2022.
-
Noise mitigation strategies in physical feedforward neural networks
Authors:
Nadezhda Semenova,
Daniel Brunner
Abstract:
Physical neural networks are promising candidates for next generation artificial intelligence hardware. In such architectures, neurons and connections are physically realized and do not leverage digital concepts with their practically infinite signal-to-noise ratio to encode, transduce and transform information. They therefore are prone to noise with a variety of statistical and architectural properties, and effective strategies leveraging network-inherent assets to mitigate noise in a hardware-efficient manner are important in the pursuit of next generation neural network hardware. Based on analytical derivations, we here introduce and analyse a variety of different noise-mitigation approaches. We analytically show that intra-layer connections in which the connection matrix's squared mean exceeds the mean of its square fully suppress uncorrelated noise. We go beyond and develop two synergistic strategies for noise that is uncorrelated and correlated across populations of neurons. First, we introduce the concept of ghost neurons, where each group of neurons perturbed by correlated noise has a negative connection to a single neuron, yet without receiving any input information. Secondly, we show that pooling of neuron populations is an efficient approach to suppress uncorrelated noise. As such, we developed a general noise mitigation strategy leveraging the statistical properties of the different noise terms most relevant in analogue hardware. Finally, we demonstrate the effectiveness of this combined approach for a trained neural network classifying the MNIST handwritten digits, for which we achieve a 4-fold improvement of the output signal-to-noise ratio and increase the classification accuracy almost to the level of the noise-free network.
Submitted 18 May, 2022; v1 submitted 20 April, 2022;
originally announced April 2022.
-
Hybrid Graph Embedding Techniques in Estimated Time of Arrival Task
Authors:
Vadim Porvatov,
Natalia Semenova,
Andrey Chertok
Abstract:
Recently, deep learning has achieved promising results in the calculation of Estimated Time of Arrival (ETA), which is considered as predicting the travel time from the start point to a certain place along a given path. ETA plays an essential role in intelligent taxi services or automotive navigation systems. A common practice is to use embedding vectors to represent the elements of a road network, such as road segments and crossroads. Road elements have their own attributes like length, presence of crosswalks, lanes number, etc. However, many links in the road network are traversed by too few floating cars even in large ride-hailing platforms and affected by the wide range of temporal events. As the primary goal of the research, we explore the generalization ability of different spatial embedding strategies and propose a two-stage approach to deal with such problems.
Submitted 8 October, 2021;
originally announced October 2021.
-
Understanding and mitigating noise in trained deep neural networks
Authors:
Nadezhda Semenova,
Laurent Larger,
Daniel Brunner
Abstract:
Deep neural networks unlocked a vast range of new applications by solving tasks of which many were previously deemed as reserved to higher human intelligence. One of the developments enabling this success was a boost in computing power provided by special purpose hardware, such as graphic or tensor processing units. However, these do not leverage fundamental features of neural networks like parallelism and analog state variables. Instead, they emulate neural networks relying on binary computing, which results in unsustainable energy consumption and comparatively low speed. Fully parallel and analogue hardware promises to overcome these challenges, yet the impact of analogue neuron noise and its propagation, i.e. accumulation, threatens rendering such approaches inept. Here, we determine for the first time the propagation of noise in deep neural networks comprising noisy nonlinear neurons in trained fully connected layers. We study additive and multiplicative as well as correlated and uncorrelated noise, and develop analytical methods that predict the noise level in any layer of symmetric deep neural networks or deep neural networks trained with back propagation. We find that noise accumulation is generally bound, and adding additional network layers does not worsen the signal to noise ratio beyond a limit. Most importantly, noise accumulation can be suppressed entirely when neuron activation functions have a slope smaller than unity. We therefore developed the framework for noise in fully connected deep neural networks implemented in analog systems, and identify criteria allowing engineers to design noise-resilient novel neural network hardware.
Submitted 16 December, 2021; v1 submitted 12 March, 2021;
originally announced March 2021.
-
Fundamental aspects of noise in analog-hardware neural networks
Authors:
Nadezhda Semenova,
Xavier Porte,
Louis Andreoli,
Maxime Jacquot,
Laurent Larger,
Daniel Brunner
Abstract:
We study and analyze the fundamental aspects of noise propagation in recurrent as well as deep, multi-layer networks. The main focus of our study is neural networks in analogue hardware, yet the methodology provides insight for networks in general. The system under study consists of noisy linear nodes, and we investigate the signal-to-noise ratio at the network's outputs, which is the upper limit to such a system's computing accuracy. We consider additive and multiplicative noise which can be purely local as well as correlated across populations of neurons. This covers the chief internal perturbations of hardware networks; noise amplitudes were obtained from a physically implemented recurrent neural network and therefore correspond to a real-world system. Analytic solutions agree exceptionally well with numerical data, enabling clear identification of the most critical components and aspects for noise management. Focusing on linear nodes isolates the impact of network connections and allows us to derive strategies for mitigating noise. Our work is the starting point in addressing this aspect of analogue neural networks, and our results identify notoriously sensitive points while simultaneously highlighting the robustness of such computational systems.
Submitted 21 July, 2019;
originally announced July 2019.