Search | arXiv e-print repository

Outlier-Insensitive Kalman Filtering Using NUV Priors

Authors: Shunit Truzman, Guy Revach, Nir Shlezinger, Itzik Klein

Abstract: The Kalman filter (KF) is a widely-used algorithm for tracking the latent state of a dynamical system from noisy observations. For systems that are well-described by linear Gaussian state space models, the KF minimizes the mean-squared error (MSE). However, in practice, observations are corrupted by outliers, severely impairing the KFs performance. In this work, an outlier-insensitive KF is propos… ▽ More The Kalman filter (KF) is a widely-used algorithm for tracking the latent state of a dynamical system from noisy observations. For systems that are well-described by linear Gaussian state space models, the KF minimizes the mean-squared error (MSE). However, in practice, observations are corrupted by outliers, severely impairing the KFs performance. In this work, an outlier-insensitive KF is proposed, where robustness is achieved by modeling each potential outlier as a normally distributed random variable with unknown variance (NUV). The NUVs variances are estimated online, using both expectation-maximization (EM) and alternating maximization (AM). The former was previously proposed for the task of smoothing with outliers and was adapted here to filtering, while both EM and AM obtained the same performance and outperformed the other algorithms, the AM approach is less complex and thus requires 40 percentage less run-time. Our empirical study demonstrates that the MSE of our proposed outlier-insensitive KF outperforms previously proposed algorithms, and that for data clean of outliers, it reverts to the classic KF, i.e., MSE optimality is preserved △ Less

Submitted 12 October, 2022; originally announced October 2022.

arXiv:2202.10418 [pdf, ps, other]

Anomaly Search over Composite Hypotheses in Hierarchical Statistical Models

Authors: Benjamin Wolff, Tomer Gafni, Guy Revach, Nir Shlezinger, Kobi Cohen

Abstract: Detection of anomalies among a large number of processes is a fundamental task that has been studied in multiple research areas, with diverse applications spanning from spectrum access to cyber-security. Anomalous events are characterized by deviations in data distributions, and thus can be inferred from noisy observations based on statistical methods. In some scenarios, one can often obtain noisy… ▽ More Detection of anomalies among a large number of processes is a fundamental task that has been studied in multiple research areas, with diverse applications spanning from spectrum access to cyber-security. Anomalous events are characterized by deviations in data distributions, and thus can be inferred from noisy observations based on statistical methods. In some scenarios, one can often obtain noisy observations aggregated from a chosen subset of processes. Such hierarchical search can further minimize the sample complexity while retaining accuracy. An anomaly search strategy should thus be designed based on multiple requirements, such as maximizing the detection accuracy; efficiency, be efficient in terms of sample complexity; and be able to cope with statistical models that are known only up to some missing parameters (i.e., composite hypotheses). In this paper, we consider anomaly detection with observations taken from a chosen subset of processes that conforms to a predetermined tree structure with partially known statistical model. We propose Hierarchical Dynamic Search (HDS), a sequential search strategy that uses two variations of the Generalized Log Likelihood Ratio (GLLR) statistic, and can be used for detection of multiple anomalies. HDS is shown to be order-optimal in terms of the size of the search space, and asymptotically optimal in terms of detection accuracy. An explicit upper bound on the error probability is established for the finite sample regime. In addition to extensive experiments on synthetic datasets, experiments have been conducted on the DARPA intrusion detection dataset, showing that HDS is superior to existing methods. △ Less

Submitted 11 August, 2022; v1 submitted 21 February, 2022; originally announced February 2022.

Comments: A short version of this paper was presented at IEEE International Symposium on Information Theory (ISIT) 2022

arXiv:2107.10043 [pdf, other]

doi 10.1109/TSP.2022.3158588

KalmanNet: Neural Network Aided Kalman Filtering for Partially Known Dynamics

Authors: Guy Revach, Nir Shlezinger, Xiaoyong Ni, Adria Lopez Escoriza, Ruud J. G. van Sloun, Yonina C. Eldar

Abstract: State estimation of dynamical systems in real-time is a fundamental task in signal processing. For systems that are well-represented by a fully known linear Gaussian state space (SS) model, the celebrated Kalman filter (KF) is a low complexity optimal solution. However, both linearity of the underlying SS model and accurate knowledge of it are often not encountered in practice. Here, we present Ka… ▽ More State estimation of dynamical systems in real-time is a fundamental task in signal processing. For systems that are well-represented by a fully known linear Gaussian state space (SS) model, the celebrated Kalman filter (KF) is a low complexity optimal solution. However, both linearity of the underlying SS model and accurate knowledge of it are often not encountered in practice. Here, we present KalmanNet, a real-time state estimator that learns from data to carry out Kalman filtering under non-linear dynamics with partial information. By incorporating the structural SS model with a dedicated recurrent neural network module in the flow of the KF, we retain data efficiency and interpretability of the classic algorithm while implicitly learning complex dynamics from data. We demonstrate numerically that KalmanNet overcomes non-linearities and model mismatch, outperforming classic filtering methods operating with both mismatched and accurate domain knowledge. △ Less

Submitted 10 March, 2022; v1 submitted 21 July, 2021; originally announced July 2021.

Comments: Accepted for publication in IEEE Transactions on Signal Processing - TSP

arXiv:2101.04726 [pdf, other]

Model-Based Machine Learning for Communications

Authors: Nir Shlezinger, Nariman Farsad, Yonina C. Eldar, Andrea J. Goldsmith

Abstract: We present an introduction to model-based machine learning for communication systems. We begin by reviewing existing strategies for combining model-based algorithms and machine learning from a high level perspective, and compare them to the conventional deep learning approach which utilizes established deep neural network (DNN) architectures trained in an end-to-end manner. Then, we focus on symbo… ▽ More We present an introduction to model-based machine learning for communication systems. We begin by reviewing existing strategies for combining model-based algorithms and machine learning from a high level perspective, and compare them to the conventional deep learning approach which utilizes established deep neural network (DNN) architectures trained in an end-to-end manner. Then, we focus on symbol detection, which is one of the fundamental tasks of communication receivers. We show how the different strategies of conventional deep architectures, deep unfolding, and DNN-aided hybrid algorithms, can be applied to this problem. The last two approaches constitute a middle ground between purely model-based and solely DNN-based receivers. By focusing on this specific task, we highlight the advantages and drawbacks of each strategy, and present guidelines to facilitate the design of future model-based deep learning systems for communications. △ Less

Submitted 12 January, 2021; originally announced January 2021.

Comments: arXiv admin note: text overlap with arXiv:2002.07806

arXiv:2011.07271 [pdf, other]

FedRec: Federated Learning of Universal Receivers over Fading Channels

Authors: Mahdi Boloursaz Mashhadi, Nir Shlezinger, Yonina C. Eldar, Deniz Gunduz

Abstract: Wireless communications is often subject to channel fading. Various statistical models have been proposed to capture the inherent randomness in fading, and conventional model-based receiver designs rely on accurate knowledge of this underlying distribution, which, in practice, may be complex and intractable. In this work, we propose a neural network-based symbol detection technique for downlink fa… ▽ More Wireless communications is often subject to channel fading. Various statistical models have been proposed to capture the inherent randomness in fading, and conventional model-based receiver designs rely on accurate knowledge of this underlying distribution, which, in practice, may be complex and intractable. In this work, we propose a neural network-based symbol detection technique for downlink fading channels, which is based on the maximum a-posteriori probability (MAP) detector. To enable training on a diverse ensemble of fading realizations, we propose a federated training scheme, in which multiple users collaborate to jointly learn a universal data-driven detector, hence the name FedRec. The performance of the resulting receiver is shown to approach the MAP performance in diverse channel conditions without requiring knowledge of the fading statistics, while inducing a substantially reduced communication overhead in its training procedure compared to centralized training. △ Less

Submitted 26 March, 2021; v1 submitted 14 November, 2020; originally announced November 2020.

Comments: Submitted for publication

arXiv:2010.06072 [pdf, other]

Multi-Level Group Testing with Application to One-Shot Pooled COVID-19 Tests

Authors: Amit Solomon, Alejandro Cohen, Nir Shlezinger, Yonina C. Eldar, Muriel Médard

Abstract: A key requirement in containing contagious diseases, such as the Coronavirus disease 2019 (COVID-19) pandemic, is the ability to efficiently carry out mass diagnosis over large populations. Some of the leading testing procedures, such as those utilizing qualitative polymerase chain reaction, involve using dedicated machinery which can simultaneously process a limited amount of samples. A candidate… ▽ More A key requirement in containing contagious diseases, such as the Coronavirus disease 2019 (COVID-19) pandemic, is the ability to efficiently carry out mass diagnosis over large populations. Some of the leading testing procedures, such as those utilizing qualitative polymerase chain reaction, involve using dedicated machinery which can simultaneously process a limited amount of samples. A candidate method to increase the test throughput is to examine pooled samples comprised of a mixture of samples from different patients. In this work we study pooling based tests which operate in a one-shot fashion, while providing an indication not solely on the presence of infection, but also on its level, without additional pool tests, as often required in COVID-19 testing. As these requirements limit the application of traditional group-testing (GT) methods, we propose a multi-level GT scheme, which builds upon GT principles to enable accurate recovery using much fewer tests than patients, while operating in a one-shot manner and providing multi-level indications. We provide a theoretical analysis of the proposed scheme and characterize conditions under which the algorithm operates reliably and at affordable computational complexity. Our numerical results demonstrate that multi level GT accurately and efficiently detects infection levels, while achieving improved performance over previously proposed one-shot COVID-19 pooled-testing methods. △ Less

Submitted 30 August, 2022; v1 submitted 12 October, 2020; originally announced October 2020.

arXiv:2009.12787 [pdf, other]

doi 10.1109/TSP.2021.3090323

Over-the-Air Federated Learning from Heterogeneous Data

Authors: Tomer Sery, Nir Shlezinger, Kobi Cohen, Yonina C. Eldar

Abstract: Federated learning (FL) is a framework for distributed learning of centralized models. In FL, a set of edge devices train a model using their local data, while repeatedly exchanging their trained updates with a central server. This procedure allows tuning a centralized model in a distributed fashion without having the users share their possibly private data. In this paper, we focus on over-the-air… ▽ More Federated learning (FL) is a framework for distributed learning of centralized models. In FL, a set of edge devices train a model using their local data, while repeatedly exchanging their trained updates with a central server. This procedure allows tuning a centralized model in a distributed fashion without having the users share their possibly private data. In this paper, we focus on over-the-air (OTA) FL, which has been suggested recently to reduce the communication overhead of FL due to the repeated transmissions of the model updates by a large number of users over the wireless channel. In OTA FL, all users simultaneously transmit their updates as analog signals over a multiple access channel, and the server receives a superposition of the analog transmitted signals. However, this approach results in the channel noise directly affecting the optimization procedure, which may degrade the accuracy of the trained model. We develop a Convergent OTA FL (COTAF) algorithm which enhances the common local stochastic gradient descent (SGD) FL algorithm, introducing precoding at the users and scaling at the server, which gradually mitigates the effect of the noise. We analyze the convergence of COTAF to the loss minimizing model and quantify the effect of a statistically heterogeneous setup, i.e. when the training data of each user obeys a different distribution. Our analysis reveals the ability of COTAF to achieve a convergence rate similar to that achievable over error-free channels. Our simulations demonstrate the improved convergence of COTAF over vanilla OTA local SGD for training using non-synthetic datasets. Furthermore, we numerically show that the precoding induced by COTAF notably improves the convergence rate and the accuracy of models trained via OTA FL. △ Less

Submitted 2 October, 2020; v1 submitted 27 September, 2020; originally announced September 2020.

Comments: 15 pages, 8 figures. Part of the results were accepted to be presented in the 2020 IEEE Global Communications Conference (GLOBECOM)

arXiv:2006.03262 [pdf, other]

doi 10.1109/TSP.2020.3046971

UVeQFed: Universal Vector Quantization for Federated Learning

Authors: Nir Shlezinger, Mingzhe Chen, Yonina C. Eldar, H. Vincent Poor, Shuguang Cui

Abstract: Traditional deep learning models are trained at a centralized server using labeled data samples collected from end devices or users. Such data samples often include private information, which the users may not be willing to share. Federated learning (FL) is an emerging approach to train such learning models without requiring the users to share their possibly private labeled data. In FL, each user… ▽ More Traditional deep learning models are trained at a centralized server using labeled data samples collected from end devices or users. Such data samples often include private information, which the users may not be willing to share. Federated learning (FL) is an emerging approach to train such learning models without requiring the users to share their possibly private labeled data. In FL, each user trains its copy of the learning model locally. The server then collects the individual updates and aggregates them into a global model. A major challenge that arises in this method is the need of each user to efficiently transmit its learned model over the throughput limited uplink channel. In this work, we tackle this challenge using tools from quantization theory. In particular, we identify the unique characteristics associated with conveying trained models over rate-constrained channels, and propose a suitable quantization scheme for such settings, referred to as universal vector quantization for FL (UVeQFed). We show that combining universal vector quantization methods with FL yields a decentralized training system in which the compression of the trained models induces only a minimum distortion. We then theoretically analyze the distortion, showing that it vanishes as the number of users grows. We also characterize the convergence of models trained with the traditional federated averaging method combined with UVeQFed to the model which minimizes the loss function. Our numerical results demonstrate the gains of UVeQFed over previously proposed methods in terms of both distortion induced in quantization and accuracy of the resulting aggregated model. △ Less

Submitted 14 December, 2020; v1 submitted 5 June, 2020; originally announced June 2020.

arXiv:2006.03258 [pdf, other]

Learned Factor Graphs for Inference from Stationary Time Sequences

Authors: Nir Shlezinger, Nariman Farsad, Yonina C. Eldar, Andrea J. Goldsmith

Abstract: The design of methods for inference from time sequences has traditionally relied on statistical models that describe the relation between a latent desired sequence and the observed one. A broad family of model-based algorithms have been derived to carry out inference at controllable complexity using recursive computations over the factor graph representing the underlying distribution. An alternati… ▽ More The design of methods for inference from time sequences has traditionally relied on statistical models that describe the relation between a latent desired sequence and the observed one. A broad family of model-based algorithms have been derived to carry out inference at controllable complexity using recursive computations over the factor graph representing the underlying distribution. An alternative model-agnostic approach utilizes machine learning (ML) methods. Here we propose a framework that combines model-based algorithms and data-driven ML tools for stationary time sequences. In the proposed approach, neural networks are developed to separately learn specific components of a factor graph describing the distribution of the time sequence, rather than the complete inference task. By exploiting stationary properties of this distribution, the resulting approach can be applied to sequences of varying temporal duration. Learned factor graph can be realized using compact neural networks that are trainable using small training sets, or alternatively, be used to improve upon existing deep inference systems. We present an inference algorithm based on learned stationary factor graphs, which learns to implement the sum-product scheme from labeled data, and can be applied to sequences of different lengths. Our experimental results demonstrate the ability of the proposed learned factor graphs to learn to carry out accurate inference from small training sets for sleep stage detection using the Sleep-EDF dataset, as well as for symbol detection in digital communications with unknown channels. △ Less

Submitted 24 December, 2021; v1 submitted 5 June, 2020; originally announced June 2020.

arXiv:2002.07806 [pdf, other]

Data-Driven Symbol Detection via Model-Based Machine Learning

Authors: Nariman Farsad, Nir Shlezinger, Andrea J. Goldsmith, Yonina C. Eldar

Abstract: The design of symbol detectors in digital communication systems has traditionally relied on statistical channel models that describe the relation between the transmitted symbols and the observed signal at the receiver. Here we review a data-driven framework to symbol detection design which combines machine learning (ML) and model-based algorithms. In this hybrid approach, well-known channel-model-… ▽ More The design of symbol detectors in digital communication systems has traditionally relied on statistical channel models that describe the relation between the transmitted symbols and the observed signal at the receiver. Here we review a data-driven framework to symbol detection design which combines machine learning (ML) and model-based algorithms. In this hybrid approach, well-known channel-model-based algorithms such as the Viterbi method, BCJR detection, and multiple-input multiple-output (MIMO) soft interference cancellation (SIC) are augmented with ML-based algorithms to remove their channel-model-dependence, allowing the receiver to learn to implement these algorithms solely from data. The resulting data-driven receivers are most suitable for systems where the underlying channel models are poorly understood, highly complex, or do not well-capture the underlying physics. Our approach is unique in that it only replaces the channel-model-based computations with dedicated neural networks that can be trained from a small amount of data, while kee** the general algorithm intact. Our results demonstrate that these techniques can yield near-optimal performance of model-based algorithms without knowing the exact channel input-output statistical relationship and in the presence of channel state information uncertainty. △ Less

Submitted 14 February, 2020; originally announced February 2020.

Comments: arXiv admin note: text overlap with arXiv:1905.10750

arXiv:2002.03214 [pdf, other]

DeepSIC: Deep Soft Interference Cancellation for Multiuser MIMO Detection

Authors: Nir Shlezinger, Rong Fu, Yonina C. Eldar

Abstract: Digital receivers are required to recover the transmitted symbols from their observed channel output. In multiuser multiple-input multiple-output (MIMO) setups, where multiple symbols are simultaneously transmitted, accurate symbol detection is challenging. A family of algorithms capable of reliably recovering multiple symbols is based on interference cancellation. However, these methods assume th… ▽ More Digital receivers are required to recover the transmitted symbols from their observed channel output. In multiuser multiple-input multiple-output (MIMO) setups, where multiple symbols are simultaneously transmitted, accurate symbol detection is challenging. A family of algorithms capable of reliably recovering multiple symbols is based on interference cancellation. However, these methods assume that the channel is linear, a model which does not reflect many relevant channels, as well as require accurate channel state information (CSI), which may not be available. In this work we propose a multiuser MIMO receiver which learns to jointly detect in a data-driven fashion, without assuming a specific channel model or requiring CSI. In particular, we propose a data-driven implementation of the iterative soft interference cancellation (SIC) algorithm which we refer to as DeepSIC. The resulting symbol detector is based on integrating dedicated machine-learning (ML) methods into the iterative SIC algorithm. DeepSIC learns to carry out joint detection from a limited set of training samples without requiring the channel to be linear and its parameters to be known. Our numerical evaluations demonstrate that for linear channels with full CSI, DeepSIC approaches the performance of iterative SIC, which is comparable to the optimal performance, and outperforms previously proposed ML-based MIMO receivers. Furthermore, in the presence of CSI uncertainty, DeepSIC significantly outperforms model-based approaches. Finally, we show that DeepSIC accurately detects symbols in non-linear channels, where conventional iterative SIC fails even when accurate CSI is available. △ Less

Submitted 14 June, 2020; v1 submitted 8 February, 2020; originally announced February 2020.

Comments: arXiv admin note: text overlap with arXiv:2002.07806

arXiv:2002.00758 [pdf, other]

Data-Driven Factor Graphs for Deep Symbol Detection

Authors: Nir Shlezinger, Nariman Farsad, Yonina C. Eldar, Andrea J. Goldsmith

Abstract: Many important schemes in signal processing and communications, ranging from the BCJR algorithm to the Kalman filter, are instances of factor graph methods. This family of algorithms is based on recursive message passing-based computations carried out over graphical models, representing a factorization of the underlying statistics. Consequently, in order to implement these algorithms, one must hav… ▽ More Many important schemes in signal processing and communications, ranging from the BCJR algorithm to the Kalman filter, are instances of factor graph methods. This family of algorithms is based on recursive message passing-based computations carried out over graphical models, representing a factorization of the underlying statistics. Consequently, in order to implement these algorithms, one must have accurate knowledge of the statistical model of the considered signals. In this work we propose to implement factor graph methods in a data-driven manner. In particular, we propose to use machine learning (ML) tools to learn the factor graph, instead of the overall system task, which in turn is used for inference by message passing over the learned graph. We apply the proposed approach to learn the factor graph representing a finite-memory channel, demonstrating the resulting ability to implement BCJR detection in a data-driven fashion. We demonstrate that the proposed system, referred to as BCJRNet, learns to implement the BCJR algorithm from a small training set, and that the resulting receiver exhibits improved robustness to inaccurate training compared to the conventional channel-model-based receiver operating under the same level of uncertainty. Our results indicate that by utilizing ML tools to learn factor graphs from labeled data, one can implement a broad range of model-based algorithms, which traditionally require full knowledge of the underlying statistics, in a data-driven fashion. △ Less

Submitted 31 January, 2020; originally announced February 2020.

arXiv:1908.06845 [pdf, other]

Deep Task-Based Quantization

Authors: Nir Shlezinger, Yonina C. Eldar

Abstract: Quantizers play a critical role in digital signal processing systems. Recent works have shown that the performance of quantization systems acquiring multiple analog signals using scalar analog-to-digital converters (ADCs) can be significantly improved by properly processing the analog signals prior to quantization. However, the design of such hybrid quantizers is quite complex, and their implement… ▽ More Quantizers play a critical role in digital signal processing systems. Recent works have shown that the performance of quantization systems acquiring multiple analog signals using scalar analog-to-digital converters (ADCs) can be significantly improved by properly processing the analog signals prior to quantization. However, the design of such hybrid quantizers is quite complex, and their implementation requires complete knowledge of the statistical model of the analog signal, which may not be available in practice. In this work we design data-driven task-oriented quantization systems with scalar ADCs, which determine how to map an analog signal into its digital representation using deep learning tools. These representations are designed to facilitate the task of recovering underlying information from the quantized signals, which can be a set of parameters to estimate, or alternatively, a classification task. By utilizing deep learning, we circumvent the need to explicitly recover the system model and to find the proper quantization rule for it. Our main target application is multiple-input multiple-output (MIMO) communication receivers, which simultaneously acquire a set of analog signals, and are commonly subject to constraints on the number of bits. Our results indicate that, in a MIMO channel estimation setup, the proposed deep task-bask quantizer is capable of approaching the optimal performance limits dictated by indirect rate-distortion theory, achievable using vector quantizers and requiring complete knowledge of the underlying statistical model. Furthermore, for a symbol detection scenario, it is demonstrated that the proposed approach can realize reliable bit-efficient hybrid MIMO receivers capable of setting their quantization rule in light of the task, e.g., to minimize the bit error rate. △ Less

Submitted 1 August, 2019; originally announced August 2019.

arXiv:1905.10750 [pdf, other]

ViterbiNet: A Deep Learning Based Viterbi Algorithm for Symbol Detection

Authors: Nir Shlezinger, Nariman Farsad, Yonina C. Eldar, Andrea J. Goldsmith

Abstract: Symbol detection plays an important role in the implementation of digital receivers. In this work, we propose ViterbiNet, which is a data-driven symbol detector that does not require channel state information (CSI). ViterbiNet is obtained by integrating deep neural networks (DNNs) into the Viterbi algorithm. We identify the specific parts of the Viterbi algorithm that are channel-model-based, and… ▽ More Symbol detection plays an important role in the implementation of digital receivers. In this work, we propose ViterbiNet, which is a data-driven symbol detector that does not require channel state information (CSI). ViterbiNet is obtained by integrating deep neural networks (DNNs) into the Viterbi algorithm. We identify the specific parts of the Viterbi algorithm that are channel-model-based, and design a DNN to implement only those computations, leaving the rest of the algorithm structure intact. We then propose a meta-learning based approach to train ViterbiNet online based on recent decisions, allowing the receiver to track dynamic channel conditions without requiring new training samples for every coherence block. Our numerical evaluations demonstrate that the performance of ViterbiNet, which is ignorant of the CSI, approaches that of the CSI-based Viterbi algorithm, and is capable of tracking time-varying channels without needing instantaneous CSI or additional training data. Moreover, unlike conventional Viterbi detection, ViterbiNet is robust to CSI uncertainty, and it can be reliably implemented in complex channel models with constrained computational burden. More broadly, our results demonstrate the conceptual benefit of designing communication systems to that integrate DNNs into established algorithms. △ Less

Submitted 29 September, 2020; v1 submitted 26 May, 2019; originally announced May 2019.

Comments: arXiv admin note: text overlap with arXiv:2002.07806

Showing 1–14 of 14 results for author: Shlezinger, N