-
Outlier-Insensitive Kalman Filtering Using NUV Priors
Authors:
Shunit Truzman,
Guy Revach,
Nir Shlezinger,
Itzik Klein
Abstract:
The Kalman filter (KF) is a widely used algorithm for tracking the latent state of a dynamical system from noisy observations. For systems that are well-described by linear Gaussian state space models, the KF minimizes the mean-squared error (MSE). However, in practice, observations are corrupted by outliers, severely impairing the KF's performance. In this work, an outlier-insensitive KF is proposed, where robustness is achieved by modeling each potential outlier as a normally distributed random variable with unknown variance (NUV). The NUV variances are estimated online, using both expectation-maximization (EM) and alternating maximization (AM). The former was previously proposed for the task of smoothing with outliers and is adapted here to filtering. While both EM and AM obtain the same performance and outperform the other algorithms, the AM approach is less complex and thus requires 40 percent less run-time. Our empirical study demonstrates that our proposed outlier-insensitive KF achieves a lower MSE than previously proposed algorithms, and that for data clean of outliers, it reverts to the classic KF, i.e., MSE optimality is preserved.
Submitted 12 October, 2022;
originally announced October 2022.
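As background for the abstract above, the following is a minimal sketch of the classic scalar KF recursion that the proposed method is said to revert to on outlier-free data. It is not the NUV-based algorithm from the paper; the function name `kalman_filter_1d` and the parameters `f`, `h`, `q`, `r`, `x0`, `p0` (state transition, observation gain, process noise variance, observation noise variance, initial mean and variance) are illustrative assumptions.

```python
def kalman_filter_1d(observations, f=1.0, h=1.0, q=0.01, r=1.0, x0=0.0, p0=1.0):
    """Scalar linear Gaussian Kalman filter (illustrative sketch, hypothetical names).

    Assumed model: x_t = f * x_{t-1} + w_t,  w_t ~ N(0, q)
                   y_t = h * x_t + v_t,      v_t ~ N(0, r)
    Returns the list of posterior state means, one per observation.
    """
    x, p = x0, p0
    estimates = []
    for y in observations:
        # Predict: propagate the mean and variance through the dynamics.
        x_pred = f * x
        p_pred = f * p * f + q
        # Update: weigh the innovation by the Kalman gain.
        s = h * p_pred * h + r          # innovation variance
        k = p_pred * h / s              # Kalman gain
        x = x_pred + k * (y - h * x_pred)
        p = (1.0 - k * h) * p_pred
        estimates.append(x)
    return estimates
```

Because the gain `k` depends on the fixed observation variance `r`, a single large outlier in `observations` pulls the estimate strongly; this is the sensitivity that treating the outlier variance as unknown is meant to address.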
-
Anomaly Search over Composite Hypotheses in Hierarchical Statistical Models
Authors:
Benjamin Wolff,
Tomer Gafni,
Guy Revach,
Nir Shlezinger,
Kobi Cohen
Abstract:
Detection of anomalies among a large number of processes is a fundamental task that has been studied in multiple research areas, with diverse applications spanning from spectrum access to cyber-security. Anomalous events are characterized by deviations in data distributions, and thus can be inferred from noisy observations based on statistical methods. In some scenarios, one can obtain noisy observations aggregated from a chosen subset of processes. Such hierarchical search can further minimize the sample complexity while retaining accuracy. An anomaly search strategy should thus be designed to meet multiple requirements: maximize the detection accuracy, be efficient in terms of sample complexity, and cope with statistical models that are known only up to some missing parameters (i.e., composite hypotheses). In this paper, we consider anomaly detection with observations taken from a chosen subset of processes that conforms to a predetermined tree structure with a partially known statistical model. We propose Hierarchical Dynamic Search (HDS), a sequential search strategy that uses two variations of the Generalized Log Likelihood Ratio (GLLR) statistic, and can be used for detection of multiple anomalies. HDS is shown to be order-optimal in terms of the size of the search space, and asymptotically optimal in terms of detection accuracy. An explicit upper bound on the error probability is established for the finite sample regime. In addition to extensive experiments on synthetic datasets, experiments have been conducted on the DARPA intrusion detection dataset, showing that HDS is superior to existing methods.
Submitted 11 August, 2022; v1 submitted 21 February, 2022;
originally announced February 2022.
-
KalmanNet: Neural Network Aided Kalman Filtering for Partially Known Dynamics
Authors:
Guy Revach,
Nir Shlezinger,
Xiaoyong Ni,
Adria Lopez Escoriza,
Ruud J. G. van Sloun,
Yonina C. Eldar
Abstract:
State estimation of dynamical systems in real-time is a fundamental task in signal processing. For systems that are well-represented by a fully known linear Gaussian state space (SS) model, the celebrated Kalman filter (KF) is a low complexity optimal solution. However, both linearity of the underlying SS model and accurate knowledge of it are often not encountered in practice. Here, we present KalmanNet, a real-time state estimator that learns from data to carry out Kalman filtering under non-linear dynamics with partial information. By incorporating the structural SS model with a dedicated recurrent neural network module in the flow of the KF, we retain data efficiency and interpretability of the classic algorithm while implicitly learning complex dynamics from data. We demonstrate numerically that KalmanNet overcomes non-linearities and model mismatch, outperforming classic filtering methods operating with both mismatched and accurate domain knowledge.
Submitted 10 March, 2022; v1 submitted 21 July, 2021;
originally announced July 2021.
-
Model-Based Machine Learning for Communications
Authors:
Nir Shlezinger,
Nariman Farsad,
Yonina C. Eldar,
Andrea J. Goldsmith
Abstract:
We present an introduction to model-based machine learning for communication systems. We begin by reviewing existing strategies for combining model-based algorithms and machine learning from a high level perspective, and compare them to the conventional deep learning approach which utilizes established deep neural network (DNN) architectures trained in an end-to-end manner. Then, we focus on symbol detection, which is one of the fundamental tasks of communication receivers. We show how the different strategies of conventional deep architectures, deep unfolding, and DNN-aided hybrid algorithms, can be applied to this problem. The last two approaches constitute a middle ground between purely model-based and solely DNN-based receivers. By focusing on this specific task, we highlight the advantages and drawbacks of each strategy, and present guidelines to facilitate the design of future model-based deep learning systems for communications.
Submitted 12 January, 2021;
originally announced January 2021.
-
FedRec: Federated Learning of Universal Receivers over Fading Channels
Authors:
Mahdi Boloursaz Mashhadi,
Nir Shlezinger,
Yonina C. Eldar,
Deniz Gunduz
Abstract:
Wireless communications is often subject to channel fading. Various statistical models have been proposed to capture the inherent randomness in fading, and conventional model-based receiver designs rely on accurate knowledge of this underlying distribution, which, in practice, may be complex and intractable. In this work, we propose a neural network-based symbol detection technique for downlink fading channels, which is based on the maximum a-posteriori probability (MAP) detector. To enable training on a diverse ensemble of fading realizations, we propose a federated training scheme, in which multiple users collaborate to jointly learn a universal data-driven detector, hence the name FedRec. The performance of the resulting receiver is shown to approach the MAP performance in diverse channel conditions without requiring knowledge of the fading statistics, while inducing a substantially reduced communication overhead in its training procedure compared to centralized training.
Submitted 26 March, 2021; v1 submitted 14 November, 2020;
originally announced November 2020.
-
Multi-Level Group Testing with Application to One-Shot Pooled COVID-19 Tests
Authors:
Amit Solomon,
Alejandro Cohen,
Nir Shlezinger,
Yonina C. Eldar,
Muriel Médard
Abstract:
A key requirement in containing contagious diseases, such as the Coronavirus disease 2019 (COVID-19) pandemic, is the ability to efficiently carry out mass diagnosis over large populations. Some of the leading testing procedures, such as those utilizing qualitative polymerase chain reaction, involve using dedicated machinery which can simultaneously process a limited number of samples. A candidate method to increase the test throughput is to examine pooled samples comprised of a mixture of samples from different patients. In this work we study pooling-based tests which operate in a one-shot fashion, while providing an indication not solely on the presence of infection, but also on its level, without additional pool tests, as often required in COVID-19 testing. As these requirements limit the application of traditional group-testing (GT) methods, we propose a multi-level GT scheme, which builds upon GT principles to enable accurate recovery using far fewer tests than patients, while operating in a one-shot manner and providing multi-level indications. We provide a theoretical analysis of the proposed scheme and characterize conditions under which the algorithm operates reliably and at affordable computational complexity. Our numerical results demonstrate that multi-level GT accurately and efficiently detects infection levels, while achieving improved performance over previously proposed one-shot COVID-19 pooled-testing methods.
Submitted 30 August, 2022; v1 submitted 12 October, 2020;
originally announced October 2020.
-
Over-the-Air Federated Learning from Heterogeneous Data
Authors:
Tomer Sery,
Nir Shlezinger,
Kobi Cohen,
Yonina C. Eldar
Abstract:
Federated learning (FL) is a framework for distributed learning of centralized models. In FL, a set of edge devices train a model using their local data, while repeatedly exchanging their trained updates with a central server. This procedure allows tuning a centralized model in a distributed fashion without having the users share their possibly private data. In this paper, we focus on over-the-air (OTA) FL, which has been suggested recently to reduce the communication overhead of FL due to the repeated transmissions of the model updates by a large number of users over the wireless channel. In OTA FL, all users simultaneously transmit their updates as analog signals over a multiple access channel, and the server receives a superposition of the analog transmitted signals. However, this approach results in the channel noise directly affecting the optimization procedure, which may degrade the accuracy of the trained model. We develop a Convergent OTA FL (COTAF) algorithm which enhances the common local stochastic gradient descent (SGD) FL algorithm, introducing precoding at the users and scaling at the server, which gradually mitigates the effect of the noise. We analyze the convergence of COTAF to the loss minimizing model and quantify the effect of a statistically heterogeneous setup, i.e. when the training data of each user obeys a different distribution. Our analysis reveals the ability of COTAF to achieve a convergence rate similar to that achievable over error-free channels. Our simulations demonstrate the improved convergence of COTAF over vanilla OTA local SGD for training using non-synthetic datasets. Furthermore, we numerically show that the precoding induced by COTAF notably improves the convergence rate and the accuracy of models trained via OTA FL.
Submitted 2 October, 2020; v1 submitted 27 September, 2020;
originally announced September 2020.
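The over-the-air aggregation described in the abstract above (simultaneous analog transmission, a noisy superposition at the server, then scaling) can be illustrated with a small sketch. This is a simplified toy model, not the COTAF algorithm itself: the function `ota_aggregate`, the constant `precoding_gain`, and the additive Gaussian noise model are assumptions made for illustration, and the paper's actual precoding rule is not reproduced here.

```python
import random


def ota_aggregate(updates, precoding_gain=1.0, noise_std=0.1, rng=None):
    """Toy over-the-air aggregation (hypothetical helper, not COTAF itself).

    Each user scales its update by sqrt(precoding_gain) before transmission.
    The multiple access channel adds the transmitted signals and Gaussian
    noise; the server then rescales to obtain a noisy average of the updates.
    """
    rng = rng or random.Random(0)
    n_users = len(updates)
    dim = len(updates[0])
    scale = precoding_gain ** 0.5
    aggregated = []
    for j in range(dim):
        # The channel output is the superposition of all precoded signals plus noise.
        received = sum(scale * u[j] for u in updates) + rng.gauss(0.0, noise_std)
        # Server-side scaling undoes the precoding and averages over users.
        aggregated.append(received / (scale * n_users))
    return aggregated
```

In this toy model the effective noise standard deviation after scaling is `noise_std / (sqrt(precoding_gain) * n_users)`, so a larger precoding gain or more users reduces the noise seen by the optimizer.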
-
UVeQFed: Universal Vector Quantization for Federated Learning
Authors:
Nir Shlezinger,
Mingzhe Chen,
Yonina C. Eldar,
H. Vincent Poor,
Shuguang Cui
Abstract:
Traditional deep learning models are trained at a centralized server using labeled data samples collected from end devices or users. Such data samples often include private information, which the users may not be willing to share. Federated learning (FL) is an emerging approach to train such learning models without requiring the users to share their possibly private labeled data. In FL, each user trains its copy of the learning model locally. The server then collects the individual updates and aggregates them into a global model. A major challenge that arises in this method is the need of each user to efficiently transmit its learned model over the throughput limited uplink channel. In this work, we tackle this challenge using tools from quantization theory. In particular, we identify the unique characteristics associated with conveying trained models over rate-constrained channels, and propose a suitable quantization scheme for such settings, referred to as universal vector quantization for FL (UVeQFed). We show that combining universal vector quantization methods with FL yields a decentralized training system in which the compression of the trained models induces only a minimum distortion. We then theoretically analyze the distortion, showing that it vanishes as the number of users grows. We also characterize the convergence of models trained with the traditional federated averaging method combined with UVeQFed to the model which minimizes the loss function. Our numerical results demonstrate the gains of UVeQFed over previously proposed methods in terms of both distortion induced in quantization and accuracy of the resulting aggregated model.
Submitted 14 December, 2020; v1 submitted 5 June, 2020;
originally announced June 2020.
-
Learned Factor Graphs for Inference from Stationary Time Sequences
Authors:
Nir Shlezinger,
Nariman Farsad,
Yonina C. Eldar,
Andrea J. Goldsmith
Abstract:
The design of methods for inference from time sequences has traditionally relied on statistical models that describe the relation between a latent desired sequence and the observed one. A broad family of model-based algorithms has been derived to carry out inference at controllable complexity using recursive computations over the factor graph representing the underlying distribution. An alternative model-agnostic approach utilizes machine learning (ML) methods. Here we propose a framework that combines model-based algorithms and data-driven ML tools for stationary time sequences. In the proposed approach, neural networks are developed to separately learn specific components of a factor graph describing the distribution of the time sequence, rather than the complete inference task. By exploiting stationary properties of this distribution, the resulting approach can be applied to sequences of varying temporal duration. Learned factor graphs can be realized using compact neural networks that are trainable using small training sets, or alternatively, be used to improve upon existing deep inference systems. We present an inference algorithm based on learned stationary factor graphs, which learns to implement the sum-product scheme from labeled data, and can be applied to sequences of different lengths. Our experimental results demonstrate the ability of the proposed learned factor graphs to learn to carry out accurate inference from small training sets for sleep stage detection using the Sleep-EDF dataset, as well as for symbol detection in digital communications with unknown channels.
Submitted 24 December, 2021; v1 submitted 5 June, 2020;
originally announced June 2020.
-
Data-Driven Symbol Detection via Model-Based Machine Learning
Authors:
Nariman Farsad,
Nir Shlezinger,
Andrea J. Goldsmith,
Yonina C. Eldar
Abstract:
The design of symbol detectors in digital communication systems has traditionally relied on statistical channel models that describe the relation between the transmitted symbols and the observed signal at the receiver. Here we review a data-driven framework for symbol detection design which combines machine learning (ML) and model-based algorithms. In this hybrid approach, well-known channel-model-based algorithms such as the Viterbi method, BCJR detection, and multiple-input multiple-output (MIMO) soft interference cancellation (SIC) are augmented with ML-based algorithms to remove their channel-model-dependence, allowing the receiver to learn to implement these algorithms solely from data. The resulting data-driven receivers are most suitable for systems where the underlying channel models are poorly understood, highly complex, or do not capture the underlying physics well. Our approach is unique in that it only replaces the channel-model-based computations with dedicated neural networks that can be trained from a small amount of data, while keeping the general algorithm intact. Our results demonstrate that these techniques can yield near-optimal performance of model-based algorithms without knowing the exact channel input-output statistical relationship and in the presence of channel state information uncertainty.
Submitted 14 February, 2020;
originally announced February 2020.
-
DeepSIC: Deep Soft Interference Cancellation for Multiuser MIMO Detection
Authors:
Nir Shlezinger,
Rong Fu,
Yonina C. Eldar
Abstract:
Digital receivers are required to recover the transmitted symbols from their observed channel output. In multiuser multiple-input multiple-output (MIMO) setups, where multiple symbols are simultaneously transmitted, accurate symbol detection is challenging. A family of algorithms capable of reliably recovering multiple symbols is based on interference cancellation. However, these methods assume that the channel is linear, a model which does not reflect many relevant channels, as well as require accurate channel state information (CSI), which may not be available. In this work we propose a multiuser MIMO receiver which learns to jointly detect in a data-driven fashion, without assuming a specific channel model or requiring CSI. In particular, we propose a data-driven implementation of the iterative soft interference cancellation (SIC) algorithm which we refer to as DeepSIC. The resulting symbol detector is based on integrating dedicated machine-learning (ML) methods into the iterative SIC algorithm. DeepSIC learns to carry out joint detection from a limited set of training samples without requiring the channel to be linear and its parameters to be known. Our numerical evaluations demonstrate that for linear channels with full CSI, DeepSIC approaches the performance of iterative SIC, which is comparable to the optimal performance, and outperforms previously proposed ML-based MIMO receivers. Furthermore, in the presence of CSI uncertainty, DeepSIC significantly outperforms model-based approaches. Finally, we show that DeepSIC accurately detects symbols in non-linear channels, where conventional iterative SIC fails even when accurate CSI is available.
Submitted 14 June, 2020; v1 submitted 8 February, 2020;
originally announced February 2020.
-
Data-Driven Factor Graphs for Deep Symbol Detection
Authors:
Nir Shlezinger,
Nariman Farsad,
Yonina C. Eldar,
Andrea J. Goldsmith
Abstract:
Many important schemes in signal processing and communications, ranging from the BCJR algorithm to the Kalman filter, are instances of factor graph methods. This family of algorithms is based on recursive message passing-based computations carried out over graphical models, representing a factorization of the underlying statistics. Consequently, in order to implement these algorithms, one must have accurate knowledge of the statistical model of the considered signals. In this work we propose to implement factor graph methods in a data-driven manner. In particular, we propose to use machine learning (ML) tools to learn the factor graph, instead of the overall system task, which in turn is used for inference by message passing over the learned graph. We apply the proposed approach to learn the factor graph representing a finite-memory channel, demonstrating the resulting ability to implement BCJR detection in a data-driven fashion. We demonstrate that the proposed system, referred to as BCJRNet, learns to implement the BCJR algorithm from a small training set, and that the resulting receiver exhibits improved robustness to inaccurate training compared to the conventional channel-model-based receiver operating under the same level of uncertainty. Our results indicate that by utilizing ML tools to learn factor graphs from labeled data, one can implement a broad range of model-based algorithms, which traditionally require full knowledge of the underlying statistics, in a data-driven fashion.
Submitted 31 January, 2020;
originally announced February 2020.
-
Deep Task-Based Quantization
Authors:
Nir Shlezinger,
Yonina C. Eldar
Abstract:
Quantizers play a critical role in digital signal processing systems. Recent works have shown that the performance of quantization systems acquiring multiple analog signals using scalar analog-to-digital converters (ADCs) can be significantly improved by properly processing the analog signals prior to quantization. However, the design of such hybrid quantizers is quite complex, and their implementation requires complete knowledge of the statistical model of the analog signal, which may not be available in practice. In this work we design data-driven task-oriented quantization systems with scalar ADCs, which determine how to map an analog signal into its digital representation using deep learning tools. These representations are designed to facilitate the task of recovering underlying information from the quantized signals, which can be a set of parameters to estimate, or alternatively, a classification task. By utilizing deep learning, we circumvent the need to explicitly recover the system model and to find the proper quantization rule for it. Our main target application is multiple-input multiple-output (MIMO) communication receivers, which simultaneously acquire a set of analog signals, and are commonly subject to constraints on the number of bits. Our results indicate that, in a MIMO channel estimation setup, the proposed deep task-based quantizer is capable of approaching the optimal performance limits dictated by indirect rate-distortion theory, achievable using vector quantizers and requiring complete knowledge of the underlying statistical model. Furthermore, for a symbol detection scenario, it is demonstrated that the proposed approach can realize reliable bit-efficient hybrid MIMO receivers capable of setting their quantization rule in light of the task, e.g., to minimize the bit error rate.
Submitted 1 August, 2019;
originally announced August 2019.
-
ViterbiNet: A Deep Learning Based Viterbi Algorithm for Symbol Detection
Authors:
Nir Shlezinger,
Nariman Farsad,
Yonina C. Eldar,
Andrea J. Goldsmith
Abstract:
Symbol detection plays an important role in the implementation of digital receivers. In this work, we propose ViterbiNet, which is a data-driven symbol detector that does not require channel state information (CSI). ViterbiNet is obtained by integrating deep neural networks (DNNs) into the Viterbi algorithm. We identify the specific parts of the Viterbi algorithm that are channel-model-based, and design a DNN to implement only those computations, leaving the rest of the algorithm structure intact. We then propose a meta-learning based approach to train ViterbiNet online based on recent decisions, allowing the receiver to track dynamic channel conditions without requiring new training samples for every coherence block. Our numerical evaluations demonstrate that the performance of ViterbiNet, which is ignorant of the CSI, approaches that of the CSI-based Viterbi algorithm, and is capable of tracking time-varying channels without needing instantaneous CSI or additional training data. Moreover, unlike conventional Viterbi detection, ViterbiNet is robust to CSI uncertainty, and it can be reliably implemented in complex channel models with constrained computational burden. More broadly, our results demonstrate the conceptual benefit of designing communication systems that integrate DNNs into established algorithms.
Submitted 29 September, 2020; v1 submitted 26 May, 2019;
originally announced May 2019.
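To make concrete the idea in the abstract above of isolating the channel-model-based part of the Viterbi algorithm, the following is a minimal sketch of a generic Viterbi recursion in which the likelihood computation is a pluggable function. It is not the ViterbiNet implementation: the function `viterbi` and its arguments are hypothetical, and in a data-driven receiver the `log_likelihood` callable would be supplied by a trained model rather than the closed-form Gaussian used in the usage note.

```python
def viterbi(observations, states, log_prior, log_trans, log_likelihood):
    """Generic Viterbi decoder with a pluggable likelihood (illustrative sketch).

    observations   : non-empty sequence of received values
    states         : list of hashable states
    log_prior      : dict state -> log-probability of the initial state
    log_trans      : dict prev_state -> dict state -> transition log-probability
    log_likelihood : callable (observation, state) -> log p(observation | state);
                     this is the only channel-dependent component
    Returns the most likely state sequence as a list.
    """
    # score[s] holds the best log-probability of any path ending in state s.
    score = {s: log_prior[s] + log_likelihood(observations[0], s) for s in states}
    backpointers = []
    for y in observations[1:]:
        new_score, pointers = {}, {}
        for s in states:
            best_prev = max(states, key=lambda p: score[p] + log_trans[p][s])
            new_score[s] = score[best_prev] + log_trans[best_prev][s] + log_likelihood(y, s)
            pointers[s] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

As a usage example under assumed settings, with states `[-1, 1]`, uniform priors and transitions, and `log_likelihood = lambda y, s: -0.5 * (y - s) ** 2`, the observations `[0.9, -1.1, 0.8]` decode to `[1, -1, 1]`.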