-
On Leaky-Integrate-and Fire as Spike-Train-Quantization Operator on Dirac-Superimposed Continuous-Time Signals
Authors:
Bernhard A. Moser,
Michael Lunglmayr
Abstract:
Leaky-integrate-and-fire (LIF) is studied as a non-linear operator that maps an integrable signal $f$ to a sequence $η_f$ of discrete events, the spikes. In the case without any Dirac pulses in the input, it makes no difference whether the neuron's potential is set to zero or the threshold $\vartheta$ is subtracted immediately after a spike-triggering event. However, in the case of superimposed Dirac pulses the situation is different, which raises the question of a mathematical justification for each of the proposed reset variants. In the limit case of zero refractory time, the standard reset scheme based on threshold subtraction results in a modulo-based reset scheme, which allows LIF to be characterized as a quantization operator based on a weighted Alexiewicz norm $\|.\|_{A, α}$ with leaky parameter $α$. We prove the quantization formula $\|η_f - f\|_{A, α} < \vartheta$ under the general condition of local integrability, almost everywhere boundedness and locally finitely many superimposed weighted Dirac pulses, which provides a much larger signal space and a more flexible sparse signal representation than is manageable by classical signal processing.
Submitted 10 February, 2024;
originally announced February 2024.
-
SNN Architecture for Differential Time Encoding Using Decoupled Processing Time
Authors:
Daniel Windhager,
Bernhard A. Moser,
Michael Lunglmayr
Abstract:
Spiking neural networks (SNNs) have gained attention in recent years due to their ability to handle sparse and event-based data better than regular artificial neural networks (ANNs). Since the structure of SNNs is less suited than that of conventional ANNs to commonly used accelerators such as GPUs, there is a demand for custom hardware accelerators for processing SNNs. In the past, the main focus was on platforms that resemble the structure of multiprocessor systems. In this work, we propose a lightweight neuron layer architecture that allows network structures to be directly mapped onto digital hardware. Our approach is based on differential time coding of spike sequences and the decoupling of processing time and spike timing, which allows the SNN to be processed on different hardware platforms. We present synthesis and performance results showing that this architecture can be implemented for networks of more than 1000 neurons with high clock speeds on a state-of-the-art FPGA. We furthermore show results on the robustness of our approach to quantization. These results demonstrate that high-accuracy inference can be performed with bit widths as low as 4.
Submitted 24 November, 2023;
originally announced November 2023.
-
Quantization in Spiking Neural Networks
Authors:
Bernhard A. Moser,
Michael Lunglmayr
Abstract:
In spiking neural networks (SNN), at each node, an incoming sequence of weighted Dirac pulses is converted into an output sequence of weighted Dirac pulses by a leaky-integrate-and-fire (LIF) neuron model based on spike aggregation and thresholding. We show that this mapping can be understood as a quantization operator and state a corresponding formula for the quantization error by means of the Alexiewicz norm. This analysis has implications for rethinking re-initialization in the LIF model, leading to the proposal of 'reset-to-mod' as a modulo-based reset variant.
Submitted 8 February, 2024; v1 submitted 13 May, 2023;
originally announced May 2023.
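The quantization view in the abstract above can be illustrated with a minimal discrete-event sketch. It assumes that the output spike carries the integer multiple of the threshold contained in the membrane potential, and that the weighted Alexiewicz norm of a spike train is the largest absolute value of its leaky running sum. The function names `lif_reset_to_mod` and `weighted_alexiewicz_norm` and the sample data are hypothetical and are not taken from the paper.

```python
import math


def lif_reset_to_mod(times, weights, theta=1.0, alpha=0.0):
    """Discrete-event LIF with a modulo-based reset (illustrative sketch)."""
    potential = 0.0
    t_prev = None
    spikes = []
    for t, a in zip(times, weights):
        if t_prev is not None:
            # Exponential leak between consecutive input events.
            potential *= math.exp(-alpha * (t - t_prev))
        potential += a
        # Assumed reset-to-mod: emit the whole multiple of theta, keep the rest.
        n = math.trunc(potential / theta)
        spikes.append(n * theta)
        potential -= n * theta
        t_prev = t
    return spikes


def weighted_alexiewicz_norm(times, weights, alpha=0.0):
    """Largest absolute value of the leaky running sum (assumed definition)."""
    acc, worst, t_prev = 0.0, 0.0, None
    for t, a in zip(times, weights):
        if t_prev is not None:
            acc *= math.exp(-alpha * (t - t_prev))
        acc += a
        worst = max(worst, abs(acc))
        t_prev = t
    return worst


# Hypothetical input spike train: event times and weights.
times = [0.0, 0.3, 0.4, 1.0, 1.5, 1.6]
weights = [0.7, 0.9, -2.6, 0.4, 3.2, -0.1]
theta, alpha = 1.0, 0.5

spikes = lif_reset_to_mod(times, weights, theta, alpha)
error = [a - s for a, s in zip(weights, spikes)]
print(spikes)
print(weighted_alexiewicz_norm(times, error, alpha) < theta)
```

In this sketch the leaky running sum of the difference between input and output equals the residual membrane potential after each reset, which is why the error norm stays below the threshold.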
-
Spiking Neural Networks in the Alexiewicz Topology: A New Perspective on Analysis and Error Bounds
Authors:
Bernhard A. Moser,
Michael Lunglmayr
Abstract:
In order to ease the analysis of error propagation in neuromorphic computing and to get a better understanding of spiking neural networks (SNN), we address the problem of mathematical analysis of SNNs as endomorphisms that map spike trains to spike trains. A central question is the adequate structure for a space of spike trains and its implication for the design of error measurements of SNNs including time delay, threshold deviations, and the design of the reinitialization mode of the leaky-integrate-and-fire (LIF) neuron model. First, we identify the underlying topology by analyzing the closure of all sub-threshold signals of a LIF model. For zero leakage this approach yields the Alexiewicz topology, which we adapt to LIF neurons with arbitrary positive leakage. As a result, LIF can be understood as spike train quantization in the corresponding norm. In this way we obtain various error bounds and inequalities, such as a quasi-isometry relation between incoming and outgoing spike trains. Another result is a Lipschitz-style global upper bound for the error propagation and a related resonance-type phenomenon.
Submitted 8 February, 2024; v1 submitted 9 May, 2023;
originally announced May 2023.
-
A Fiber Measurement System with Approximate Deconvolution Based on the Analysis of Fault Clusters in Linearized Bregman Iterations
Authors:
Yuneisy Garcia Guzman,
Felipe Calliari,
Gustavo C. Amaral,
Michael Lunglmayr
Abstract:
Automatic detection of faults in optical fibers is an active area of research that plays a significant role in the design of reliable and stable optical networks. A fiber measurement system that combines automated data acquisition and processing represents a disruptive impact in the management of optical fiber networks with fast and reliable event detection. It has been shown in the literature that the linearized Bregman iterations (LBI) algorithm and variations can be successfully used for processing and accurately identifying faults in a fiber profile. One of the factors that impact the performance of these algorithms is the degradation of spatial resolution, which is mainly caused by the appearance of fault clusters due to a reduced number of iterations. In this paper, a method is proposed based on an approximate deconvolution approach for increasing the spatial resolution, possible after a thorough analysis of fault clusters that appear in the algorithm's output. The effect of such approximate deconvolution is shown to extend beyond the improvement of spatial resolution, allowing for better performances to be reached at shorter processing times. An efficient hardware architecture that implements the approximate deconvolution, compatible with the hardware structure recently presented for the LBI algorithm, is also proposed and discussed.
Submitted 4 November, 2021;
originally announced November 2021.
-
Efficient Majority Voting in Digital Hardware
Authors:
Stefan Baumgartner,
Mario Huemer,
Michael Lunglmayr
Abstract:
In recent years, machine learning methods have become increasingly important for a wide range of applications. However, they often suffer from high computational requirements impairing their efficient use in real-time systems, even when employing dedicated hardware accelerators. Ensemble learning methods are especially suitable for hardware acceleration since they can be constructed from individual learners of low complexity and thus offer large parallelization potential. For classification, the outputs of these learners are typically combined by majority voting, which often represents the bottleneck of a hardware accelerator for ensemble inference. In this work, we present a novel architecture that allows obtaining a majority decision in a number of clock cycles that is logarithmic in the number of inputs. We show that, for the example application of handwritten digit recognition, a random forest processing engine employing this majority decision architecture implemented on an FPGA allows the classification of more than seven million images per second.
Submitted 9 August, 2021;
originally announced August 2021.
-
Efficient Non-sequential Division for FPGAs
Authors:
Michael Lunglmayr
Abstract:
The division operation is important for many areas of data processing. Especially considering today's demand for hardware accelerators for machine learning algorithms, there is a high demand for an efficient calculation of the division function, e.g. for averaging operations or the online calculation of activation functions. For such algorithms, which are often iterative in nature, one would like to have a non-sequential way of calculating the division operation. This work presents such an approach, especially tailored to FPGAs as processing platforms. It is based on an efficient way of calculating the reciprocal operation, using a low-complexity approximation combined with a correction function. The described approach allows approximating the division operation (with errors that can be made arbitrarily low) within one clock cycle using only low hardware requirements. These hardware requirements are scalable depending on the desired precision. We show results obtained by synthesis and hardware simulations demonstrating the low complexity and high clock speeds achievable with the described method compared to other methods described in the literature.
Submitted 9 September, 2022; v1 submitted 12 May, 2021;
originally announced May 2021.
-
Fast approximate reciprocal approximations for iterative algorithms
Authors:
Michael Lunglmayr,
Oliver Ploder
Abstract:
The reciprocal function, 1/x, is important for many real-time algorithms. It is used in a large variety of algorithms from areas ranging from iterative estimation to machine learning. Many of these algorithms are iterative in nature and require the online computation of the reciprocal. Such an iterative structure often prevents effective use of pipelining for implementation of the reciprocal. For this reason, a reciprocal algorithm requiring only a low amount of clock cycles is desired. Many real-time algorithms, often being of approximate nature, can tolerate the use of only an approximate solution of the reciprocal.
For this reason, we present a low complexity non-iterative approximation of the reciprocal function. This approximation can be calculated using only combinatorial logic. We present synthesis results showing that the proposed approach can be implemented with low area requirements at high clock frequencies. We analytically describe the error of the approximation and show that by optimizing a constant value used in the approximation, different variants with different error behaviors can be obtained. We furthermore present performance results of application examples that, when using our proposed method, show only negligible performance degradation compared to when using the exact reciprocal function, demonstrating the versatility of our proposed approach.
Submitted 13 July, 2020;
originally announced July 2020.
-
FPGA-Embedded Linearized Bregman Iterations Algorithm for Trend Break Detection
Authors:
Felipe Calliari,
Gustavo C. Amaral,
Michael Lunglmayr
Abstract:
Detection of level shifts in a noisy signal, or trend break detection, is a problem that appears in several research fields, from biophysics to optics and economics. Although many algorithms have been developed to deal with this problem, accurate and low-complexity trend break detection is still an active topic of research. The Linearized Bregman Iterations have recently been presented as a low-complexity and computationally efficient algorithm to tackle this problem, with a structure that could benefit greatly from hardware implementation. In this work, a hardware architecture of the Linearized Bregman Iterations algorithm is presented and tested on a Field Programmable Gate Array (FPGA). The hardware is synthesized on FPGAs of different sizes, and the percentage of used hardware as well as the maximum frequency enabled by the design indicate that a gain factor of approximately 100 in processing time, with respect to the software implementation, can be achieved. This represents a substantial advantage of using a dedicated unit for trend break detection applications.
Submitted 15 February, 2019;
originally announced February 2019.
-
A stochastic computing architecture for iterative estimation
Authors:
Michael Lunglmayr,
Daniel Wiesinger,
Werner Haselmayr
Abstract:
Stochastic computing (SC) is a promising candidate for fault-tolerant computing in digital circuits. We present a novel stochastic computing estimation architecture that can solve a large group of estimation problems, including least squares estimation as well as sparse estimation. This allows utilizing the high fault tolerance of stochastic computing for implementing estimation algorithms. The presented architecture is based on the recently proposed linearized-Bregman-based Sparse Kaczmarz algorithm. To realize this architecture, we develop a shrink function in stochastic computing and analytically describe its error probability. We compare the stochastic computing architecture to a fixed-point binary implementation and present bit-true simulation results as well as synthesis results demonstrating the feasibility of the proposed architecture for practical implementation.
Submitted 31 October, 2018;
originally announced October 2018.
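The Sparse Kaczmarz algorithm and shrink function named in the abstract above can be sketched in ordinary floating-point arithmetic. This is a minimal sketch that assumes the commonly stated form of the iteration: a Kaczmarz row projection on an auxiliary vector, followed by soft thresholding, with rows visited cyclically. It does not model the stochastic computing architecture itself, and the names `shrink` and `sparse_kaczmarz` and the parameter `lam` are hypothetical.

```python
def shrink(v, lam):
    """Soft-thresholding (shrink) function with threshold lam."""
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0


def sparse_kaczmarz(A, b, lam, sweeps):
    """Cyclic Sparse Kaczmarz sketch: row projection on z, then shrink to get x."""
    n = len(A[0])
    z = [0.0] * n  # auxiliary (un-thresholded) vector
    x = [0.0] * n  # sparse estimate
    for _ in range(sweeps):
        for row, b_i in zip(A, b):
            residual = sum(a * x_j for a, x_j in zip(row, x)) - b_i
            step = residual / sum(a * a for a in row)
            z = [z_j - step * a for z_j, a in zip(z, row)]
            x = [shrink(z_j, lam) for z_j in z]
    return x


# Tiny hypothetical example: identity system with a sparse solution.
A = [[1.0, 0.0], [0.0, 1.0]]
b = [3.0, 0.0]
x_hat = sparse_kaczmarz(A, b, lam=1.0, sweeps=3)
print(x_hat)
```

Setting `lam` to zero reduces the sketch to the ordinary Kaczmarz iteration, which relates the sparse case to the least squares case mentioned in the abstract.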
-
High-Accuracy and Fault Tolerant Stochastic Inner Product Design
Authors:
Werner Haselmayr,
Daniel Wiesinger,
Michael Lunglmayr
Abstract:
In this work, we present a novel inner product design for stochastic computing. Stochastic computing is an emerging computing technique that encodes a number in the probability of observing a one in a random bit stream. This leads to reduced hardware costs and high error tolerance. The proposed inner product design is based on a two-line bipolar encoding format and applies sequential processing of the input in a central accumulation unit. Sequential processing significantly increases the computation accuracy, since it allows for preliminary cancellation of carry bits. Moreover, the central accumulation unit gives much better scalability than conventional adder tree approaches. We show that the proposed inner product design outperforms state-of-the-art designs in terms of hardware costs for high accuracy requirements and fault tolerance.
Submitted 20 November, 2018; v1 submitted 17 August, 2018;
originally announced August 2018.
-
Design and Analysis of Efficient Maximum/Minimum Circuits for Stochastic Computing
Authors:
Michael Lunglmayr,
Daniel Wiesinger,
Werner Haselmayr
Abstract:
In stochastic computing (SC), a real-valued number is represented by a stochastic bit stream, encoding its value in the probability of obtaining a one. This leads to a significantly lower hardware effort for various functions and provides a higher tolerance to errors (e.g., bit flips) compared to binary radix representation. The implementation of a stochastic max/min function is important for many areas where SC has been successfully applied, such as image processing or machine learning (e.g., max pooling in neural networks). In this work, we propose a novel shift-register-based architecture for a stochastic max/min function. We show that the proposed circuit has a significantly higher accuracy than state-of-the-art architectures at comparable hardware cost. Moreover, we analytically prove the correctness of the proposed circuit and provide a new error analysis, based on the individual bits of the stochastic streams. Interestingly, the analysis reveals that for a given practical bit stream length a finite optimal shift register length exists, and it allows this optimal length to be determined.
Submitted 17 July, 2018;
originally announced July 2018.
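The bit-stream representation described in the abstract above can be illustrated with a small software sketch. It assumes a unipolar encoding for values in [0, 1] and uses the well-known fact that a bitwise AND of two independent streams multiplies the encoded values. It does not reproduce the shift-register max/min circuit proposed in the paper, and the helper names `encode`, `decode`, and `multiply` are hypothetical.

```python
import random


def encode(value, length, rng):
    """Unipolar stochastic stream: each bit is one with probability `value`."""
    return [1 if rng.random() < value else 0 for _ in range(length)]


def decode(bits):
    """Estimate the encoded value as the fraction of ones in the stream."""
    return sum(bits) / len(bits)


def multiply(stream_a, stream_b):
    """Bitwise AND of two independent unipolar streams multiplies their values."""
    return [x & y for x, y in zip(stream_a, stream_b)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
a = encode(0.3, 10000, rng)
b = encode(0.5, 10000, rng)
print(decode(a), decode(b), decode(multiply(a, b)))
```

The decoded values are only estimates; their accuracy improves with the stream length, which is the trade-off behind the bit stream length analysis mentioned in the abstract.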
-
A Robust Nonlinear RLS Type Adaptive Filter for Second-Order-Intermodulation Distortion Cancellation in FDD LTE and 5G Direct Conversion Transceivers
Authors:
Andreas Gebhard,
Oliver Lang,
Michael Lunglmayr,
Christian Motz,
Ram Sunil Kanumalli,
Christina Auer,
Thomas Paireder,
Matthias Wagner,
Harald Pretl,
Mario Huemer
Abstract:
Transceivers operating in frequency division duplex experience a transmitter leakage (TxL) signal into the receiver due to the limited duplexer stop-band isolation. This TxL signal in combination with the second-order nonlinearity of the receive mixer may lead to a baseband (BB) second-order intermodulation distortion (IMD2) with twice the transmit signal bandwidth. In direct conversion receivers, this nonlinear IMD2 interference may cause a severe signal-to-interference-plus-noise ratio degradation of the wanted receive signal. This contribution presents a nonlinear Wiener model recursive least-squares (RLS) type adaptive filter for the cancellation of the IMD2 interference in the digital BB. The channel-select and DC-notch filters included at the output of the proposed adaptive filter ensure that the provided IMD2 replica includes the receiver front-end filtering. A second, robust version of the nonlinear RLS algorithm is derived, which provides numerical stability for highly correlated input signals that arise, e.g., in LTE-A intra-band multi-cluster transmission scenarios. The performance of the proposed algorithms is evaluated by numerical simulations and by measurement data.
Submitted 11 July, 2018;
originally announced July 2018.
-
Linearized Bregman Iterations for Automatic Optical Fiber Fault Analysis
Authors:
Michael Lunglmayr,
Gustavo C. Amaral
Abstract:
Supervision of the physical layer of optical networks is an extremely relevant subject. To detect fiber faults, single-ended solutions such as the Optical Time Domain Reflectometer (OTDR) allow for precise measurements of fault profiles. Combining the OTDR with a signal processing approach for high-dimensional sparse parameter estimation allows for automated and reliable results in reduced time. In this work, a measurement system composed of a Photon-Counting OTDR data acquisition unit and a processing unit based on a Linearized Bregman Iterations algorithm for automatic fault finding is proposed. An in-depth comparative study of the proposed algorithm's fault-finding prowess in the presence of noise is presented. Characteristics such as sensitivity, specificity, processing time, and complexity, are analysed in simulated environments. Real-life measurements that are conducted using the Photon-Counting OTDR subsystem for data acquisition and the Linearized Bregman-based processing unit for automated data analysis demonstrated accurate results. It is concluded that the proposed measurement system is particularly well suited to the task of fault finding. The natural characteristic of the algorithm fosters embedding the solution in digital hardware, allowing for reduced costs and processing time.
Submitted 19 December, 2018; v1 submitted 15 May, 2018;
originally announced May 2018.
-
Knowledge-Aided Kaczmarz and LMS Algorithms
Authors:
Michael Lunglmayr,
Oliver Lang,
Mario Huemer
Abstract:
The least mean squares (LMS) filter is often derived via the Wiener filter solution. For a system identification scenario, such a derivation makes it hard to incorporate prior information on the system's impulse response. We present an alternative way based on the maximum a posteriori solution, which allows developing a Knowledge-Aided Kaczmarz algorithm. Based on this Knowledge-Aided Kaczmarz we formulate a Knowledge-Aided LMS filter. Both algorithms allow incorporating the prior mean and covariance matrix of the parameter to be estimated. The algorithms use this prior information in addition to the measurement information in the gradient for the iterative update of their estimates. We analyze the convergence of the algorithms and show simulation results on their performance. As expected, reliable prior information allows improving the performance of the algorithms for low signal-to-noise ratio (SNR) scenarios. The results show that the presented algorithms can nearly achieve the optimal maximum a posteriori (MAP) performance.
Submitted 6 December, 2017;
originally announced December 2017.
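For orientation, the system identification scenario mentioned in the abstract above can be sketched with the plain LMS update. This is the standard filter without any prior information, not the knowledge-aided variant proposed in the paper. The function name `lms_identify`, the step size `mu`, and the two-tap example system are hypothetical.

```python
import random


def lms_identify(inputs, desired, num_taps, mu):
    """Standard LMS system identification: w <- w + mu * e * x."""
    w = [0.0] * num_taps
    for n in range(num_taps - 1, len(inputs)):
        # Regressor holding the most recent num_taps input samples.
        x = [inputs[n - k] for k in range(num_taps)]
        y = sum(w_k * x_k for w_k, x_k in zip(w, x))
        e = desired[n] - y  # a-priori estimation error
        w = [w_k + mu * e * x_k for w_k, x_k in zip(w, x)]
    return w


# Hypothetical noise-free example: identify a two-tap impulse response.
rng = random.Random(1)
true_w = [0.5, -0.2]
u = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
# d[0] is a placeholder; the adaptation loop starts at n = 1.
d = [0.0] + [true_w[0] * u[n] + true_w[1] * u[n - 1] for n in range(1, len(u))]

w_hat = lms_identify(u, d, num_taps=2, mu=0.1)
print(w_hat)
```

The update uses only the measurement error in its gradient term; according to the abstract, the knowledge-aided algorithms additionally bring the prior mean and covariance into that update.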
-
Parameter Estimation Under Model Uncertainties by Iterative Covariance Approximation
Authors:
Oliver Lang,
Michael Lunglmayr,
Mario Huemer
Abstract:
We propose a novel iterative algorithm for estimating a deterministic but unknown parameter vector in the presence of model uncertainties. This iterative algorithm is based on a system model where an overall noise term describes both the measurement noise and the noise resulting from the model uncertainties. This overall noise term is a function of the true parameter vector, allowing for an iterative algorithm. The proposed algorithm can be applied to structured as well as unstructured models, and it outperforms prior art algorithms for a broad range of applications.
Submitted 23 November, 2017; v1 submitted 13 December, 2016;
originally announced December 2016.
-
Approximate Least Squares
Authors:
Michael Lunglmayr,
Christoph Unterrieder,
Mario Huemer
Abstract:
We present a novel iterative algorithm for approximating the linear least squares solution with low complexity. After motivating the algorithm, we discuss its properties, including its complexity, and we present theoretical results as well as simulation-based performance results. We describe the analysis of its convergence behavior and show that in the noise-free case the algorithm converges to the least squares solution.
Submitted 11 December, 2013;
originally announced December 2013.