-
Rate-Distortion-Perception Tradeoff for Gaussian Vector Sources
Authors:
Jingjing Qian,
Sadaf Salehkalaibar,
Jun Chen,
Ashish Khisti,
Wei Yu,
Wuxian Shi,
Yiqun Ge,
Wen Tong
Abstract:
This paper studies the rate-distortion-perception (RDP) tradeoff for a Gaussian vector source coding problem where the goal is to compress the multi-component source subject to distortion and perception constraints. The purpose of imposing a perception constraint is to ensure visually pleasing reconstructions. This paper studies this RDP setting with either the Kullback-Leibler (KL) divergence or Wasserstein-2 metric as the perception loss function, and shows that for Gaussian vector sources, jointly Gaussian reconstructions are optimal. We further demonstrate that the optimal tradeoff can be expressed as an optimization problem, which can be explicitly solved. An interesting property of the optimal solution is as follows. Without the perception constraint, the traditional reverse water-filling solution for characterizing the rate-distortion (RD) tradeoff of a Gaussian vector source states that the optimal rate allocated to each component depends on a constant, called the water-level. If the variance of a specific component is below the water-level, it is assigned a zero compression rate. However, with active distortion and perception constraints, we show that the optimal rates allocated to the different components are always positive. Moreover, the water-levels that determine the optimal rate allocation for different components are unequal. We further treat the special case of perceptually perfect reconstruction and study its RDP function in the high-distortion and low-distortion regimes to obtain insight into the structure of the optimal solution.
Submitted 25 June, 2024;
originally announced June 2024.
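For orientation, the classical reverse water-filling baseline that the abstract above contrasts against can be sketched as follows. This is only the textbook rate-distortion solution without any perception constraint, not the RDP solution derived in the paper; the function name, the bisection search, and the example numbers are illustrative assumptions.

```python
import math


def reverse_water_filling(variances, total_distortion):
    """Classical reverse water-filling for independent Gaussian components.

    Finds the water-level theta with sum_i min(theta, var_i) = total_distortion
    and returns (theta, per-component distortions, per-component rates in bits).
    Illustrative sketch only: no perception constraint is modelled here.
    """
    if not 0 < total_distortion <= sum(variances):
        raise ValueError("total_distortion must lie in (0, sum of variances]")

    # The allocated distortion is nondecreasing in theta, so bisection works.
    lo, hi = 0.0, max(variances)
    for _ in range(200):
        theta = (lo + hi) / 2
        if sum(min(theta, v) for v in variances) < total_distortion:
            lo = theta
        else:
            hi = theta
    theta = (lo + hi) / 2

    distortions = [min(theta, v) for v in variances]
    # A component whose variance is below the water-level gets rate zero.
    rates = [0.5 * math.log2(v / d) for v, d in zip(variances, distortions)]
    return theta, distortions, rates


theta, distortions, rates = reverse_water_filling([4.0, 1.0, 0.25], 1.5)
print(theta, rates)
```

In this example the third component has variance 0.25, which is below the water-level, so it receives zero rate. The abstract states that this zero-rate behaviour disappears once the distortion and perception constraints are both active.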
-
Information Compression in the AI Era: Recent Advances and Future Challenges
Authors:
Jun Chen,
Yong Fang,
Ashish Khisti,
Ayfer Ozgur,
Nir Shlezinger,
Chao Tian
Abstract:
This survey article focuses on emerging connections between the fields of machine learning and data compression. While fundamental limits of classical (lossy) data compression are established using rate-distortion theory, the connections to machine learning have resulted in new theoretical analysis and application areas. We survey recent works on task-based and goal-oriented compression, the rate-distortion-perception theory and compression for estimation and inference. Deep learning based approaches also provide natural data-driven algorithmic approaches to compression. We survey recent works on applying deep learning techniques to task-based or goal-oriented compression, as well as image and video compression. We also discuss the potential use of large language models for text compression. We finally provide some directions for future research in this promising field.
Submitted 14 June, 2024;
originally announced June 2024.
-
Secure Inference for Vertically Partitioned Data Using Multiparty Homomorphic Encryption
Authors:
Shuangyi Chen,
Yue Ju,
Zhongwen Zhu,
Ashish Khisti
Abstract:
We propose a secure inference protocol for a distributed setting involving a single server node and multiple client nodes. We assume that the observed data vector is partitioned across multiple client nodes while the deep learning model is located at the server node. Each client node is required to encrypt its portion of the data vector and transmit the resulting ciphertext to the server node. The server node is required to collect the ciphertexts and perform inference in the encrypted domain. We demonstrate an application of multi-party homomorphic encryption (MPHE) to satisfy these requirements. We propose a packing scheme that enables the server to form the ciphertext of the complete data by aggregating the ciphertexts of data subsets encrypted using MPHE. While our proposed protocol builds upon the prior horizontal federated training protocol POSEIDON (Sav et al., 2020), we focus on the inference for vertically partitioned data and avoid the transmission of (encrypted) model weights from the server node to the client nodes.
Submitted 6 May, 2024;
originally announced May 2024.
-
SECO: Secure Inference With Model Splitting Across Multi-Server Hierarchy
Authors:
Shuangyi Chen,
Ashish Khisti
Abstract:
In the context of prediction-as-a-service, concerns about the privacy of the data and the model have been brought up and tackled via secure inference protocols. These protocols are built up by using single or multiple cryptographic tools designed under a variety of different security assumptions.
In this paper, we introduce SECO, a secure inference protocol that enables a user holding an input data vector and multiple server nodes deployed with a split neural network model to collaboratively compute the prediction, without compromising either party's data privacy. We extend prior work on secure inference that requires the entire neural network model to be located on a single server node, to a multi-server hierarchy, where the user communicates to a gateway server node, which in turn communicates to remote server nodes. The inference task is split across the server nodes and must be performed over an encrypted copy of the data vector.
We adopt multiparty homomorphic encryption and multiparty garbled circuit schemes, making the system secure against a dishonest majority of semi-honest servers as well as protecting the partial model structure from the user. We evaluate SECO on multiple models, achieving a reduction in computation and communication cost for the user, which makes the protocol applicable to user devices with limited resources.
Submitted 24 April, 2024;
originally announced April 2024.
-
Subset Adaptive Relaying for Streaming Erasure Codes
Authors:
Muhammad Ahmad Kaleem,
Gustavo Kasper Facenda,
Ashish Khisti
Abstract:
This paper investigates adaptive streaming codes over a three-node relayed network. In this setting, a source transmits a sequence of message packets through a relay under a delay constraint of $T$ time slots per packet. The source-to-relay and relay-to-destination links are unreliable and introduce a maximum of $N_1$ and $N_2$ packet erasures respectively. Recent work has proposed adaptive (time variant) and nonadaptive (time invariant) code constructions for this setting and has shown that adaptive codes can achieve higher rates. However, the adaptive construction deals with many possibilities, leading to an impractical code with very large block lengths. In this work, we propose a simplified adaptive code construction which greatly improves the practicality of the code, with only a small cost to the achievable rates. We analyze the construction in terms of the achievable rates and field size requirements, and perform numerical simulations over statistical channels to estimate packet loss probabilities.
Submitted 26 January, 2024;
originally announced January 2024.
-
Rate-Distortion-Perception Tradeoff Based on the Conditional-Distribution Perception Measure
Authors:
Sadaf Salehkalaibar,
Jun Chen,
Ashish Khisti,
Wei Yu
Abstract:
We study the rate-distortion-perception (RDP) tradeoff for a memoryless source model in the asymptotic limit of large block-lengths. Our perception measure is based on a divergence between the distributions of the source and reconstruction sequences conditioned on the encoder output, which was first proposed in [1], [2]. We consider the case when there is no shared randomness between the encoder and the decoder. For the case of discrete memoryless sources we derive a single-letter characterization of the RDP function, thus settling a problem that remains open for the marginal metric introduced in Blau and Michaeli [3] (with no shared randomness). Our achievability scheme is based on lossy source coding with a posterior reference map proposed in [4]. For the case of continuous valued sources under squared error distortion measure and squared quadratic Wasserstein perception measure we also derive a single-letter characterization and show that a noise-adding mechanism at the decoder suffices to achieve the optimal representation. For the case of zero perception loss, we show that our characterization interestingly coincides with the results for the marginal metric derived in [5], [6] and again demonstrate that zero perception loss can be achieved with a $3$-dB penalty in the minimum distortion. Finally we specialize our results to the case of Gaussian sources. We derive the RDP function for vector Gaussian sources and propose a waterfilling type solution. We also partially characterize the RDP function for a mixture of vector Gaussians.
Submitted 22 January, 2024;
originally announced January 2024.
-
Importance Matching Lemma for Lossy Compression with Side Information
Authors:
Buu Phan,
Ashish Khisti,
Christos Louizos
Abstract:
We propose two extensions to existing importance sampling based methods for lossy compression. First, we introduce an importance sampling based compression scheme that is a variant of ordered random coding (Theis and Ahmed, 2022) and is amenable to direct evaluation of the achievable compression rate for a finite number of samples. Our second and major contribution is the importance matching lemma, which is a finite proposal counterpart of the recently introduced Poisson matching lemma (Li and Anantharam, 2021). By integrating with deep learning, we provide a new coding scheme for distributed lossy compression with side information at the decoder. We demonstrate the effectiveness of the proposed scheme through experiments involving synthetic Gaussian sources, distributed image compression with MNIST and vertical federated learning with CIFAR-10.
Submitted 8 March, 2024; v1 submitted 4 January, 2024;
originally announced January 2024.
-
On the Choice of Perception Loss Function for Learned Video Compression
Authors:
Sadaf Salehkalaibar,
Buu Phan,
Jun Chen,
Wei Yu,
Ashish Khisti
Abstract:
We study causal, low-latency, sequential video compression when the output is subjected to both a mean squared-error (MSE) distortion loss as well as a perception loss to target realism. Motivated by prior approaches, we consider two different perception loss functions (PLFs). The first, PLF-JD, considers the joint distribution (JD) of all the video frames up to the current one, while the second metric, PLF-FMD, considers the framewise marginal distributions (FMD) between the source and reconstruction. Using information theoretic analysis and deep-learning based experiments, we demonstrate that the choice of PLF can have a significant effect on the reconstruction, especially at low bit rates. In particular, while the reconstruction based on PLF-JD can better preserve the temporal correlation across frames, it also imposes a significant penalty in distortion compared to PLF-FMD and further makes it more difficult to recover from errors made in the earlier output frames. Although the choice of PLF decisively affects reconstruction quality, we also demonstrate that it may not be essential to commit to a particular PLF during encoding and the choice of PLF can be delegated to the decoder. In particular, encoded representations generated by training a system to minimize the MSE (without requiring either PLF) can be near universal and can generate close to optimal reconstructions for either choice of PLF at the decoder. We validate our results using (one-shot) information-theoretic analysis, detailed study of the rate-distortion-perception tradeoff of the Gauss-Markov source model as well as deep-learning based experiments on moving MNIST and KTH datasets.
Submitted 22 August, 2023; v1 submitted 30 May, 2023;
originally announced May 2023.
-
Random Edge Coding: One-Shot Bits-Back Coding of Large Labeled Graphs
Authors:
Daniel Severo,
James Townsend,
Ashish Khisti,
Alireza Makhzani
Abstract:
We present a one-shot method for compressing large labeled graphs called Random Edge Coding. When paired with a parameter-free model based on Pólya's Urn, the worst-case computational and memory complexities scale quasi-linearly and linearly, respectively, with the number of observed edges, making the method efficient on sparse graphs, and it requires only integer arithmetic. Key to our method is bits-back coding, which is used to sample edges and vertices without replacement from the edge-list in a way that preserves the structure of the graph. Optimality is proven under a class of random graph models that are invariant to permutations of the edges and of vertices within an edge. Experiments indicate Random Edge Coding can achieve competitive compression performance on real-world network datasets and scales to graphs with millions of nodes and edges.
Submitted 16 May, 2023;
originally announced May 2023.
-
Quadratic Functional Encryption for Secure Training in Vertical Federated Learning
Authors:
Shuangyi Chen,
Anuja Modi,
Shweta Agrawal,
Ashish Khisti
Abstract:
Vertical federated learning (VFL) enables the collaborative training of machine learning (ML) models in settings where the data is distributed amongst multiple parties who wish to protect the privacy of their individual data. Notably, in VFL, the labels are available to a single party and the complete feature set is formed only when data from all parties is combined. Recently, Xu et al. proposed a new framework called FedV for secure gradient computation for VFL using multi-input functional encryption. In this work, we explain how some of the information leakage in Xu et al. can be avoided by using quadratic functional encryption when training generalized linear models for vertical federated learning.
Submitted 19 June, 2023; v1 submitted 15 May, 2023;
originally announced May 2023.
-
Sequential Gradient Coding For Straggler Mitigation
Authors:
M. Nikhil Krishnan,
MohammadReza Ebrahimi,
Ashish Khisti
Abstract:
In distributed computing, slower nodes (stragglers) usually become a bottleneck. Gradient Coding (GC), introduced by Tandon et al., is an efficient technique that uses principles of error-correcting codes to distribute gradient computation in the presence of stragglers. In this paper, we consider the distributed computation of a sequence of gradients $\{g(1),g(2),\ldots,g(J)\}$, where processing of each gradient $g(t)$ starts in round-$t$ and finishes by round-$(t+T)$. Here $T\geq 0$ denotes a delay parameter. For the GC scheme, coding is only across computing nodes and this results in a solution where $T=0$. On the other hand, having $T>0$ allows for designing schemes which exploit the temporal dimension as well. In this work, we propose two schemes that demonstrate improved performance compared to GC. Our first scheme combines GC with selective repetition of previously unfinished tasks and achieves improved straggler mitigation. In our second scheme, which constitutes our main contribution, we apply GC to a subset of the tasks and repetition for the remainder of the tasks. We then multiplex these two classes of tasks across workers and rounds in an adaptive manner, based on past straggler patterns. Using theoretical analysis, we demonstrate that our second scheme achieves significant reduction in the computational load. In our experiments, we study a practical setting of concurrently training multiple neural networks over an AWS Lambda cluster involving 256 worker nodes, where our framework naturally applies. We demonstrate that the latter scheme can yield a 16% improvement in runtime over the baseline GC scheme, in the presence of naturally occurring, non-simulated stragglers.
Submitted 28 June, 2023; v1 submitted 24 November, 2022;
originally announced November 2022.
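As a point of reference for the baseline mentioned in the abstract above, the following is a minimal sketch of single-round gradient coding ($T=0$) with three workers tolerating one straggler, using the well-known three-worker construction. It does not implement the sequential schemes proposed in the paper; the coefficient tables and function names are illustrative assumptions.

```python
# Each worker holds two of the three data partitions and sends one coded vector.
# Row w of ENCODE gives the coefficients worker w applies to (g1, g2, g3).
ENCODE = [
    [0.5, 1.0, 0.0],   # worker 0 sends g1/2 + g2
    [0.0, 1.0, -1.0],  # worker 1 sends g2 - g3
    [0.5, 0.0, 1.0],   # worker 2 sends g1/2 + g3
]

# For every pair of non-straggling workers, weights that recover g1 + g2 + g3.
DECODE = {
    (0, 1): (2.0, -1.0),
    (0, 2): (1.0, 1.0),
    (1, 2): (1.0, 2.0),
}


def worker_message(worker, partial_grads):
    """Coded message of one worker: a linear combination of partial gradients."""
    dim = len(partial_grads[0])
    return [
        sum(c * g[k] for c, g in zip(ENCODE[worker], partial_grads))
        for k in range(dim)
    ]


def decode(messages):
    """Recover the full gradient sum from the messages of any two workers.

    `messages` maps worker index -> coded vector, for exactly two workers.
    """
    a, b = sorted(messages)
    wa, wb = DECODE[(a, b)]
    return [wa * x + wb * y for x, y in zip(messages[a], messages[b])]


grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
received = {w: worker_message(w, grads) for w in (0, 2)}  # worker 1 straggles
print(decode(received))
```

Any two of the three coded messages suffice to recover the sum of the partial gradients, at the cost of each worker processing two partitions instead of one.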
-
Adaptive relaying for streaming erasure codes in a three node relay network
Authors:
Gustavo Kasper Facenda,
M. Nikhil Krishnan,
Elad Domanovitz,
Silas L. Fong,
Ashish Khisti,
Wai-Tian Tan,
John Apostolopoulos
Abstract:
This paper investigates adaptive streaming codes over a three-node relayed network. In this setting, a source node transmits a sequence of message packets to a destination through a relay. The source-to-relay and relay-to-destination links are unreliable and introduce at most $N_1$ and $N_2$ packet erasures, respectively. The destination node must recover each message packet within a strict delay constraint $T$. The paper presents achievable streaming codes for all feasible parameters $\{N_1, N_2, T\}$ that exploit the fact that the relay naturally observes the erasure pattern occurring in the link from source to relay, thus it can adapt its relaying strategy based on these observations. In a recent work, Fong et al. provide streaming codes featuring channel-state-independent relaying strategies. The codes proposed in this paper achieve rates higher than the ones proposed by Fong et al. whenever $N_2 > N_1$, and achieve the same rate when $N_2 = N_1$. The paper also presents an upper bound on the achievable rate that takes into account erasures in both links in order to bound the rate in the second link. The upper bound is shown to be tighter than a trivial bound that considers only the erasures in the second link.
Submitted 9 March, 2022;
originally announced March 2022.
-
Variational Model Inversion Attacks
Authors:
Kuan-Chieh Wang,
Yan Fu,
Ke Li,
Ashish Khisti,
Richard Zemel,
Alireza Makhzani
Abstract:
Given the ubiquity of deep neural networks, it is important that these models do not reveal information about sensitive data that they have been trained on. In model inversion attacks, a malicious user attempts to recover the private dataset used to train a supervised neural network. A successful model inversion attack should generate realistic and diverse samples that accurately describe each of the classes in the private dataset. In this work, we provide a probabilistic interpretation of model inversion attacks, and formulate a variational objective that accounts for both diversity and accuracy. In order to optimize this variational objective, we choose a variational family defined in the code space of a deep generative model, trained on a public auxiliary dataset that shares some structural similarity with the target dataset. Empirically, our method substantially improves performance in terms of target attack accuracy, sample realism, and diversity on datasets of faces and chest X-ray images.
Submitted 26 January, 2022;
originally announced January 2022.
-
Error-correcting codes for low latency streaming over multiple link relay networks
Authors:
Gustavo Kasper Facenda,
Elad Domanovitz,
Ashish Khisti,
Wai-Tian Tan,
John Apostolopoulos
Abstract:
This paper investigates the performance of streaming codes in low-latency applications over a multi-link three-node relayed network. The source wishes to transmit a sequence of messages to the destination through a relay. Each message must be reconstructed after a fixed decoding delay. The special case with one link connecting each node has been studied by Fong et al. [1], and a multi-hop multi-link setting has been studied by Domanovitz et al. [2]. The topology with three nodes and multiple links is studied in this paper. Each link is subject to a different number of erasures due to different channel conditions. An information-theoretic upper bound is derived, and an achievable scheme is presented. The proposed scheme judiciously allocates rates for each link based on the concept of delay spectrum. The achievable scheme is compared to two baseline schemes and the scheme proposed in [2]. Experimental results show that this scheme achieves higher rates than the other schemes, and can achieve the upper bound even in non-trivial scenarios. The scheme is further extended to handle different propagation delays in each link, something not previously considered in the literature. Simulations over statistical channels show that the proposed scheme can outperform the simpler baseline under practical models.
Submitted 17 January, 2022;
originally announced January 2022.
-
Regularized Classification-Aware Quantization
Authors:
Daniel Severo,
Elad Domanovitz,
Ashish Khisti
Abstract:
Traditionally, quantization is designed to minimize the reconstruction error of a data source. When considering downstream classification tasks, other measures of distortion can be of interest, such as the 0-1 classification loss. Furthermore, it is desirable that the performance of these quantizers not deteriorate once they are deployed into production, as relearning the scheme online is not always possible. In this work, we present a class of algorithms that learn distributed quantization schemes for binary classification tasks. Our method performs well on unseen data, and is faster than previous methods proportional to a quadratic term of the dataset size. It works by regularizing the 0-1 loss with the reconstruction error. We present experiments on synthetic mixture and bivariate Gaussian data and compare training, testing, and generalization errors with a family of benchmark quantization schemes from the literature. Our method is called Regularized Classification-Aware Quantization.
Submitted 12 July, 2021;
originally announced July 2021.
-
Compressing Multisets with Large Alphabets using Bits-Back Coding
Authors:
Daniel Severo,
James Townsend,
Ashish Khisti,
Alireza Makhzani,
Karen Ullrich
Abstract:
Current methods which compress multisets at an optimal rate have computational complexity that scales linearly with alphabet size, making them too slow to be practical in many real-world settings. We show how to convert a compression algorithm for sequences into one for multisets, in exchange for an additional complexity term that is quasi-linear in sequence length. This allows us to compress multisets of exchangeable symbols at an optimal rate, with computational complexity decoupled from the alphabet size. The key insight is to avoid encoding the multiset directly, and instead compress a proxy sequence, using a technique called 'bits-back coding'. We demonstrate the method experimentally on tasks which are intractable with previous optimal-rate methods: compression of multisets of images and JavaScript Object Notation (JSON) files. Code for our experiments is available at https://github.com/facebookresearch/multiset-compression.
Submitted 27 February, 2023; v1 submitted 15 July, 2021;
originally announced July 2021.
-
Universal Rate-Distortion-Perception Representations for Lossy Compression
Authors:
George Zhang,
Jingjing Qian,
Jun Chen,
Ashish Khisti
Abstract:
In the context of lossy compression, Blau & Michaeli (2019) adopt a mathematical notion of perceptual quality and define the information rate-distortion-perception function, generalizing the classical rate-distortion tradeoff. We consider the notion of universal representations in which one may fix an encoder and vary the decoder to achieve any point within a collection of distortion and perception constraints. We prove that the corresponding information-theoretic universal rate-distortion-perception function is operationally achievable in an approximate sense. Under MSE distortion, we show that the entire distortion-perception tradeoff of a Gaussian source can be achieved by a single encoder of the same rate asymptotically. We then characterize the achievable distortion-perception region for a fixed representation in the case of arbitrary distributions, identify conditions under which the aforementioned results continue to hold approximately, and study the case when the rate is not fixed in advance. This motivates the study of practical constructions that are approximately universal across the RDP tradeoff, thereby alleviating the need to design a new encoder for each objective. We provide experimental results on MNIST and SVHN suggesting that on image compression tasks, the operational tradeoffs achieved by machine learning models with a fixed encoder suffer only a small penalty when compared to their variable encoder counterparts.
Submitted 21 December, 2021; v1 submitted 18 June, 2021;
originally announced June 2021.
-
Two-terminal Erasure Source-Broadcast with Feedback
Authors:
Louis Tan,
Kaveh Mahdaviani,
Ashish Khisti
Abstract:
We study the effects of introducing a feedback channel in the two-receiver erasure source-broadcast problem in which a binary equiprobable source is to be sent over an erasure broadcast channel to two receivers subject to erasure distortion constraints. The receivers each require a certain fraction of a source sequence, and we are interested in the minimum latency, or transmission time, required to serve them all. We first show that for a two-user broadcast channel, a point-to-point outer bound can always be achieved. We further show that the point-to-point outer bound can also be achieved if only one of the users, the stronger user, has a feedback channel. Our coding scheme relies on a hybrid approach that combines the transmission of random linear combinations of source symbols with a retransmission strategy.
Submitted 1 May, 2021;
originally announced May 2021.
-
Markov Rewards Processes with Impulse Rewards and Absorbing States
Authors:
Louis Tan,
Kaveh Mahdaviani,
Ashish Khisti
Abstract:
We study the expected accumulated reward for a discrete-time Markov reward model with absorbing states. The rewards are impulse rewards, where a reward $ρ_{ij}$ is accumulated when transitioning from state $i$ to state $j$. We derive an explicit, single-letter expression for the expected accumulated reward as a function of the number of time steps $n$ and include in our analysis the limit in which $n \to \infty$.
Submitted 1 May, 2021;
originally announced May 2021.
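As a rough illustration of the quantity studied in the abstract above, the sketch below computes the expected accumulated impulse reward over $n$ steps by direct iteration. It is not the paper's closed-form expression; the function name `expected_accumulated_reward`, the list-of-lists representation of the transition matrix `P` and reward matrix `rho`, and the example chain are all assumptions made for illustration.

```python
def expected_accumulated_reward(P, rho, start, n):
    """Expected total impulse reward over n transitions, starting in `start`.

    P[i][j]   : probability of moving from state i to state j (rows sum to 1)
    rho[i][j] : impulse reward collected on the transition i -> j
    An absorbing state i has P[i][i] == 1.
    """
    num_states = len(P)
    # Expected reward of a single transition out of each state.
    step_reward = [
        sum(P[i][j] * rho[i][j] for j in range(num_states))
        for i in range(num_states)
    ]
    # State distribution at the current time step.
    dist = [0.0] * num_states
    dist[start] = 1.0
    total = 0.0
    for _ in range(n):
        total += sum(dist[i] * step_reward[i] for i in range(num_states))
        # Propagate the distribution one step forward: dist <- dist * P.
        dist = [
            sum(dist[i] * P[i][j] for i in range(num_states))
            for j in range(num_states)
        ]
    return total


# Hypothetical two-state chain: state 0 is transient, state 1 is absorbing.
P = [[0.5, 0.5], [0.0, 1.0]]
rho = [[1.0, 0.0], [0.0, 0.0]]  # reward 1 each time the chain stays in state 0
```

For this toy chain the expected reward after $n$ steps is $1 - 0.5^n$, so it approaches 1 as $n \to \infty$, which is the kind of limit the abstract refers to.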
-
On the Generalization of Stochastic Gradient Descent with Momentum
Authors:
Ali Ramezani-Kebrya,
Ashish Khisti,
Ben Liang
Abstract:
While momentum-based methods, in conjunction with stochastic gradient descent (SGD), are widely used when training machine learning models, there is little theoretical understanding on the generalization error of such methods. In this work, we first show that there exists a convex loss function for which algorithmic stability fails to establish generalization guarantees when SGD with standard heavy-ball momentum (SGDM) is run for multiple epochs. Then, for smooth Lipschitz loss functions, we analyze a modified momentum-based update rule, i.e., SGD with early momentum (SGDEM), and show that it admits an upper-bound on the generalization error. Thus, our results show that machine learning models can be trained for multiple epochs of SGDEM with a guarantee for generalization. Finally, for the special case of strongly convex loss functions, we find a range of momentum such that multiple epochs of standard SGDM, as a special form of SGDEM, also generalizes. Extending our results on generalization, we also develop an upper-bound on the expected true risk, in terms of the number of training steps, the size of the training set, and the momentum parameter. Experimental evaluations verify the consistency between the numerical results and our theoretical bounds and the effectiveness of SGDEM for smooth Lipschitz loss functions.
Submitted 23 September, 2021; v1 submitted 26 February, 2021;
originally announced February 2021.
-
Improving Lossless Compression Rates via Monte Carlo Bits-Back Coding
Authors:
Yangjun Ruan,
Karen Ullrich,
Daniel Severo,
James Townsend,
Ashish Khisti,
Arnaud Doucet,
Alireza Makhzani,
Chris J. Maddison
Abstract:
Latent variable models have been successfully applied in lossless compression with the bits-back coding algorithm. However, bits-back suffers from an increase in the bitrate equal to the KL divergence between the approximate posterior and the true posterior. In this paper, we show how to remove this gap asymptotically by deriving bits-back coding algorithms from tighter variational bounds. The key idea is to exploit extended space representations of Monte Carlo estimators of the marginal likelihood. Naively applied, our schemes would require more initial bits than the standard bits-back coder, but we show how to drastically reduce this additional cost with couplings in the latent space. When parallel architectures can be exploited, our coders can achieve better rates than bits-back with little additional cost. We demonstrate improved lossless compression rates in a variety of settings, especially in out-of-distribution or sequential data compression.
Submitted 14 June, 2021; v1 submitted 22 February, 2021;
originally announced February 2021.
-
Streaming Erasure Codes over Multi-Access Relayed Networks
Authors:
Gustavo Kasper Facenda,
Elad Domanovitz,
Ashish Khisti,
Wai-Tian Tan,
John Apostolopoulos
Abstract:
Many emerging multimedia streaming applications involve multiple users communicating under strict latency constraints. In this paper we study streaming codes for a network involving two source nodes, one relay node and a destination node. In our setting, each source node transmits a stream of messages, through the relay, to a destination, who is required to decode the messages under a strict delay constraint. For the case of a single source node, a class of streaming codes has been proposed by Fong et al. [2], using the concept of {\em delay-spectrum}. In the present work we present an in-depth analysis of the properties of delay-spectrum and apply them to develop streaming codes for our proposed setting through a novel framework. Our first scheme involves greedily selecting the rate on the link from relay to destination and using properties of the delay-spectrum to find feasible streaming codes that satisfy the required delay constraints. We provide a closed form expression for the achievable rate region and identify conditions when the proposed scheme is optimal by establishing a natural outer bound. Our second scheme builds upon this approach, but uses a numerical optimization-based approach to improve the achievable rate region over the first scheme. We demonstrate that our proposed schemes achieve significant improvements over baseline schemes based on single-user codes.
Submitted 12 October, 2021; v1 submitted 26 January, 2021;
originally announced January 2021.
-
Streaming Erasure Codes over Multi-hop Relay Network
Authors:
Elad Domanovitz,
Ashish Khisti,
Wai-Tian Tan,
Xiaoqing Zhu,
John Apostolopoulos
Abstract:
This paper studies low-latency streaming codes for the multi-hop network. The source is transmitting a sequence of messages (streaming messages) to a destination through a chain of relays where each hop is subject to packet erasures. Every source message has to be recovered perfectly at the destination within a delay constraint of $T$ time slots. In any sliding window of $T+1$ time slots, we assume no more than $N_j$ erasures are introduced by the $j$-th hop channel. The capacity in the case of a single relay (a three-node network) was derived by Fong et al. [1]. While the converse derived for the three-node case can be extended to any number of nodes using a similar technique (analyzing the case where erasures on other links are consecutive), we demonstrate next that the achievable scheme, which suggested a clever symbol-wise decode-and-forward strategy, cannot be straightforwardly extended without a loss in performance. The coding scheme for the three-node network, which was shown to achieve the upper bound, was ``state-independent'' (i.e., it does not depend on the specific erasure pattern). While this is a very desirable property, in this paper, we suggest a ``state-dependent'' scheme (i.e., one which depends on the specific erasure pattern) and show that it achieves the upper bound up to the size of an additional header. Since, as we show, the size of the header does not depend on the field size, the gap between the achievable rate and the upper bound decreases as the field size increases.
Submitted 10 June, 2020;
originally announced June 2020.
-
Time-Resolved fMRI Shared Response Model using Gaussian Process Factor Analysis
Authors:
MohammadReza Ebrahimi,
Navona Calarco,
Kieran Campbell,
Colin Hawco,
Aristotle Voineskos,
Ashish Khisti
Abstract:
Multi-subject fMRI studies are challenging due to the high variability of both brain anatomy and functional brain topographies across participants. An effective way of aggregating multi-subject fMRI data is to extract a shared representation that filters out unwanted variability among subjects. Some recent work has implemented probabilistic models to extract a shared representation in task fMRI. In the present work, we improve upon these models by incorporating temporal information in the common latent structures. We introduce a new model, Shared Gaussian Process Factor Analysis (S-GPFA), that discovers shared latent trajectories and subject-specific functional topographies, while modelling temporal correlation in fMRI data. We demonstrate the efficacy of our model in revealing ground truth latent structures using simulated data, and replicate experimental performance of time-segment matching and inter-subject similarity on the publicly available Raider and Sherlock datasets. We further test the utility of our model by analyzing its learned model parameters in the large multi-site SPINS dataset, on a social cognition task from participants with and without schizophrenia.
Submitted 4 September, 2020; v1 submitted 9 June, 2020;
originally announced June 2020.
-
Sharpened Generalization Bounds based on Conditional Mutual Information and an Application to Noisy, Iterative Algorithms
Authors:
Mahdi Haghifam,
Jeffrey Negrea,
Ashish Khisti,
Daniel M. Roy,
Gintare Karolina Dziugaite
Abstract:
The information-theoretic framework of Russo and J. Zou (2016) and Xu and Raginsky (2017) provides bounds on the generalization error of a learning algorithm in terms of the mutual information between the algorithm's output and the training sample. In this work, we study the proposal, by Steinke and Zakynthinou (2020), to reason about the generalization error of a learning algorithm by introducing a super sample that contains the training sample as a random subset and computing mutual information conditional on the super sample. We first show that these new bounds based on the conditional mutual information are tighter than those based on the unconditional mutual information. We then introduce yet tighter bounds, building on the "individual sample" idea of Bu, S. Zou, and Veeravalli (2019) and the "data dependent" ideas of Negrea et al. (2019), using disintegrated mutual information. Finally, we apply these bounds to the study of the Langevin dynamics algorithm, showing that conditioning on the super sample allows us to exploit information in the optimization trajectory to obtain tighter bounds based on hypothesis tests.
Submitted 23 October, 2020; v1 submitted 27 April, 2020;
originally announced April 2020.
-
Sequential Classification with Empirically Observed Statistics
Authors:
Mahdi Haghifam,
Vincent Y. F. Tan,
Ashish Khisti
Abstract:
Motivated by real-world machine learning applications, we consider a statistical classification task in a sequential setting where test samples arrive sequentially. In addition, the generating distributions are unknown and only a set of empirically sampled sequences is available to a decision maker. The decision maker is tasked to classify a test sequence which is known to be generated according to one of the distributions. In particular, for the binary case, the decision maker wishes to perform the classification task with the minimum number of test samples, so, at each step, she declares that either hypothesis 1 is true, hypothesis 2 is true, or she requests an additional test sample. We propose a classifier and analyze the type-I and type-II error probabilities. We demonstrate the significant advantage of our sequential scheme compared to an existing non-sequential classifier proposed by Gutman. Finally, we extend our setup and results to the multi-class classification scenario and again demonstrate that the variable-length nature of the problem affords significant advantages as one can achieve the same set of exponents as Gutman's fixed-length setting but without having the rejection option.
Submitted 9 February, 2021; v1 submitted 2 December, 2019;
originally announced December 2019.
-
Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates
Authors:
Jeffrey Negrea,
Mahdi Haghifam,
Gintare Karolina Dziugaite,
Ashish Khisti,
Daniel M. Roy
Abstract:
In this work, we improve upon the stepwise analysis of noisy iterative learning algorithms initiated by Pensia, Jog, and Loh (2018) and recently extended by Bu, Zou, and Veeravalli (2019). Our main contributions are significantly improved mutual information bounds for Stochastic Gradient Langevin Dynamics via data-dependent estimates. Our approach is based on the variational characterization of mutual information and the use of data-dependent priors that forecast the mini-batch gradient based on a subset of the training samples. Our approach is broadly applicable within the information-theoretic framework of Russo and Zou (2015) and Xu and Raginsky (2017). Our bound can be tied to a measure of flatness of the empirical risk surface. As compared with other bounds that depend on the squared norms of gradients, empirical investigations show that the terms in our bounds are orders of magnitude smaller.
Submitted 25 January, 2020; v1 submitted 5 November, 2019;
originally announced November 2019.
-
Low-Latency Network-Adaptive Error Control for Interactive Streaming
Authors:
Salma Emara,
Silas L. Fong,
Baochun Li,
Ashish Khisti,
Wai-Tian Tan,
Xiaoqing Zhu,
John Apostolopoulos
Abstract:
We introduce a novel network-adaptive algorithm that is suitable for alleviating network packet losses for low-latency interactive communications between a source and a destination. Our network-adaptive algorithm estimates in real-time the best parameters of a recently proposed streaming code that uses forward error correction (FEC) to correct both arbitrary and burst losses, which cause a crackling noise and undesirable jitters, respectively in audio. In particular, the destination estimates appropriate coding parameters based on its observed packet loss pattern and sends them back to the source for updating the underlying code. Besides, a new explicit construction of practical low-latency streaming codes that achieve the optimal tradeoff between the capability of correcting arbitrary losses and the capability of correcting burst losses is provided. Simulation evaluations based on statistical losses and real-world packet loss traces reveal the following: (i) Our proposed network-adaptive algorithm combined with our optimal streaming codes can achieve significantly higher performance compared to uncoded and non-adaptive FEC schemes over UDP (User Datagram Protocol); (ii) Our explicit streaming codes can significantly outperform traditional MDS (maximum-distance separable) streaming schemes when they are used along with our network-adaptive algorithm.
Submitted 29 June, 2020; v1 submitted 14 September, 2019;
originally announced September 2019.
-
An Explicit Rate-Optimal Streaming Code for Channels with Burst and Arbitrary Erasures
Authors:
Elad Domanovitz,
Silas L. Fong,
Ashish Khisti
Abstract:
This paper considers the transmission of an infinite sequence of messages (a streaming source) over a packet erasure channel, where every source message must be recovered perfectly at the destination subject to a fixed decoding delay. While the capacity of a channel that introduces only bursts of erasures is well known, only recently, the capacity of a channel with either one burst of erasures or multiple arbitrary erasures in any fixed-sized sliding window has been established. However, the codes shown to achieve this capacity are either non-explicit constructions (proven to exist) or explicit constructions that require large field size that scales exponentially with the delay. This work describes an explicit rate-optimal construction for admissible channel and delay parameters over a field size that scales only quadratically with the delay.
Submitted 13 May, 2020; v1 submitted 11 April, 2019;
originally announced April 2019.
-
Maximal Information Leakage based Privacy Preserving Data Disclosure Mechanisms
Authors:
Tianrui Xiao,
Ashish Khisti
Abstract:
It is often necessary to disclose training data to the public domain, while protecting privacy of certain sensitive labels. We use information theoretic measures to develop such privacy preserving data disclosure mechanisms. Our mechanism involves perturbing the data vectors in a manner that strikes a balance in the privacy-utility trade-off. We use maximal information leakage between the output data vector and the confidential label as our privacy metric. We first study the theoretical Bernoulli-Gaussian model and study the privacy-utility trade-off when only the mean of the Gaussian distributions can be perturbed. We show that the optimal solution is the same as the case when the utility is measured using probability of error at the adversary. We then consider an application of this framework to a data driven setting and provide an empirical approximation to the Sibson mutual information. By performing experiments on the MNIST and FERG data-sets, we show that our proposed framework achieves equivalent or better privacy than previous methods based on mutual information.
Submitted 4 April, 2019; v1 submitted 1 April, 2019;
originally announced April 2019.
-
An Explicit Construction of Optimal Streaming Codes for Channels with Burst and Arbitrary Erasures
Authors:
Damian Dudzicz,
Silas L. Fong,
Ashish Khisti
Abstract:
This paper presents a new construction of error correcting codes which achieves optimal recovery of a streaming source over a packet erasure channel. The channel model considered is the sliding window erasure model, with burst and arbitrary losses, introduced by Badr et al. Recently, two independent works by Fong et al. and Krishnan and Kumar have identified optimal streaming codes within this framework. In this paper, we introduce a streaming code for the case when the rate of the code is at least 1/2. Our proposed construction is explicit and systematic, uses off-the-shelf maximum distance separable (MDS) codes and maximum rank distance (MRD) Gabidulin block codes as constituent codes and achieves the optimal error correction. It presents a natural generalization of the construction of Martinian and Sundberg to tolerate an arbitrary number of sparse erasures. The field size requirement, which depends on the constituent MDS and MRD codes, is also analyzed.
Submitted 22 April, 2019; v1 submitted 18 March, 2019;
originally announced March 2019.
-
Optimal Multiplexed Erasure Codes for Streaming Messages with Different Decoding Delays
Authors:
Silas L. Fong,
Ashish Khisti,
Baochun Li,
Wai-Tian Tan,
Xiaoqing Zhu,
John Apostolopoulos
Abstract:
This paper considers multiplexing two sequences of messages with two different decoding delays over a packet erasure channel. In each time slot, the source constructs a packet based on the current and previous messages and transmits the packet, which may be erased when the packet travels from the source to the destination. The destination must perfectly recover every source message in the first sequence subject to a decoding delay $T_\mathrm{v}$ and every source message in the second sequence subject to a shorter decoding delay $T_\mathrm{u}\le T_\mathrm{v}$. We assume that the channel loss model introduces a burst erasure of a fixed length $B$ on the discrete timeline. Under this channel loss assumption, the capacity region for the case where $T_\mathrm{v}\le T_\mathrm{u}+B$ was previously solved. In this paper, we fully characterize the capacity region for the remaining case $T_\mathrm{v}> T_\mathrm{u}+B$. The key step in the achievability proof is achieving the non-trivial corner point of the capacity region through using a multiplexed streaming code constructed by superimposing two single-stream codes. The main idea in the converse proof is obtaining a genie-aided bound when the channel is subject to a periodic erasure pattern where each period consists of a length-$B$ burst erasure followed by a length-$T_\mathrm{u}$ noiseless duration.
Submitted 8 November, 2019; v1 submitted 11 January, 2019;
originally announced January 2019.
-
On the Generalization of Stochastic Gradient Descent with Momentum
Authors:
Ali Ramezani-Kebrya,
Kimon Antonakopoulos,
Volkan Cevher,
Ashish Khisti,
Ben Liang
Abstract:
While momentum-based accelerated variants of stochastic gradient descent (SGD) are widely used when training machine learning models, there is little theoretical understanding on the generalization error of such methods. In this work, we first show that there exists a convex loss function for which the stability gap for multiple epochs of SGD with standard heavy-ball momentum (SGDM) becomes unbounded. Then, for smooth Lipschitz loss functions, we analyze a modified momentum-based update rule, i.e., SGD with early momentum (SGDEM) under a broad range of step-sizes, and show that it can train machine learning models for multiple epochs with a guarantee for generalization. Finally, for the special case of strongly convex loss functions, we find a range of momentum such that multiple epochs of standard SGDM, as a special form of SGDEM, also generalizes. Extending our results on generalization, we also develop an upper bound on the expected true risk, in terms of the number of training steps, sample size, and momentum. Our experimental evaluations verify the consistency between the numerical results and our theoretical bounds. SGDEM improves the generalization error of SGDM when training ResNet-18 on ImageNet in practical distributed settings.
Submitted 15 January, 2024; v1 submitted 12 September, 2018;
originally announced September 2018.
-
Optimal Streaming Erasure Codes over the Three-Node Relay Network
Authors:
Silas L. Fong,
Ashish Khisti,
Baochun Li,
Wai-Tian Tan,
Xiaoqing Zhu,
John Apostolopoulos
Abstract:
This paper investigates low-latency streaming codes for a three-node relay network. The source transmits a sequence of messages (streaming messages) to the destination through the relay between them, where the first-hop channel from the source to the relay and the second-hop channel from the relay to the destination are subject to packet erasures. Every source message must be recovered perfectly at the destination subject to a fixed decoding delay of $T$ time slots. In any sliding window of $T+1$ time slots, we assume no more than $N_1$ and $N_2$ erasures are introduced by the first-hop channel and second-hop channel respectively. Under this channel loss assumption, we fully characterize the maximum achievable rate in terms of $T$, $N_1$ and $N_2$. The achievability is proved by using a symbol-wise decode-forward strategy where the source symbols within the same message are decoded by the relay with different delays. The converse is proved by analyzing the maximum achievable rate for each channel when the erasures in the other channel are consecutive (bursty). In addition, we show that traditional message-wise decode-forward strategies, which require the source symbols within the same message to be decoded by the relay with the same delay, are sub-optimal in general.
Submitted 10 March, 2022; v1 submitted 25 June, 2018;
originally announced June 2018.
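The channel loss assumption in the abstract above (no more than $N$ erasures in any sliding window of $T+1$ time slots) can be checked for a given erasure pattern with a short routine. The sketch below is only an illustration of that admissibility condition for a single hop, not part of the paper's code construction; the function name `is_admissible` and the 0/1 list encoding of erasures are assumptions.

```python
def is_admissible(erasures, T, N):
    """Return True if every window of T+1 consecutive slots has at most N erasures.

    erasures : list of 0/1 flags, one per time slot (1 means the packet is erased)
    T        : decoding delay in time slots, so the window length is T + 1
    N        : maximum number of erasures allowed per window
    """
    window = T + 1
    # Number of erasures in the first window (or the whole pattern if shorter).
    count = sum(erasures[:window])
    if count > N:
        return False
    # Slide the window one slot at a time, updating the count incrementally.
    for t in range(window, len(erasures)):
        count += erasures[t] - erasures[t - window]
        if count > N:
            return False
    return True
```

For the three-node setting one would apply the check to each hop separately, with $N_1$ for the first-hop pattern and $N_2$ for the second-hop pattern.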
-
A Survey of Physical Layer Security Techniques for 5G Wireless Networks and Challenges Ahead
Authors:
Yongpeng Wu,
Ashish Khisti,
Chengshan Xiao,
Giuseppe Caire,
Kai-Kit Wong,
Xiqi Gao
Abstract:
Physical layer security, which safeguards data confidentiality based on information-theoretic approaches, has received significant research interest recently. The key idea behind physical layer security is to utilize the intrinsic randomness of the transmission channel to guarantee security in the physical layer. The evolution towards 5G wireless communications poses new challenges for physical layer security research. This paper provides an up-to-date survey of physical layer security research on various promising 5G technologies, including physical layer security coding, massive multiple-input multiple-output, millimeter wave communications, heterogeneous networks, non-orthogonal multiple access, full duplex technology, etc. Technical challenges which remain unresolved at the time of writing are summarized and the future trends of physical layer security in 5G and beyond are discussed.
Submitted 16 January, 2018;
originally announced January 2018.
-
Optimal Streaming Codes for Channels with Burst and Arbitrary Erasures
Authors:
Silas L. Fong,
Ashish Khisti,
Baochun Li,
Wai-Tian Tan,
Xiaoqing Zhu,
John Apostolopoulos
Abstract:
This paper considers transmitting a sequence of messages (a streaming source) over a packet erasure channel. In each time slot, the source constructs a packet based on the current and the previous messages and transmits the packet, which may be erased when the packet travels from the source to the destination. Every source message must be recovered perfectly at the destination subject to a fixed decoding delay. We assume that the channel loss model introduces either one burst erasure or multiple arbitrary erasures in any fixed-sized sliding window. Under this channel loss assumption, we fully characterize the maximum achievable rate by constructing streaming codes that achieve the optimal rate. In addition, our construction of optimal streaming codes implies the full characterization of the maximum achievable rate for convolutional codes with any given column distance, column span and decoding delay. Numerical results demonstrate that the optimal streaming codes outperform existing streaming codes of comparable complexity over some instances of the Gilbert-Elliott channel and the Fritchman channel.
Submitted 5 December, 2018; v1 submitted 12 January, 2018;
originally announced January 2018.
-
Bandwidth Adaptive & Error Resilient MBR Exact Repair Regenerating Codes
Authors:
Kaveh Mahdaviani,
Ashish Khisti,
Soheil Mohajer
Abstract:
Regenerating codes are efficient methods for distributed storage in storage networks, where node failures are common. They guarantee low cost data reconstruction and repair through accessing only a predefined number of arbitrarily chosen storage nodes in the network. In this work we consider two simultaneous extensions to the original regenerating codes framework introduced in [1]: i) both data reconstruction and repair are resilient to the presence of a certain number of erroneous nodes in the network and ii) the number of helper nodes in every repair is not fixed, but is a flexible parameter that can be selected during the runtime. We study the fundamental limits of required total repair bandwidth and provide an upper bound for the storage capacity of these codes under these assumptions. We then focus on the minimum repair bandwidth (MBR) case and derive the exact storage capacity by presenting explicit coding schemes with exact repair, which achieve the upper bound of the storage capacity in the considered setup. To this end, we first provide a more natural extension of the well-known Product Matrix (PM) MBR codes [2], modified to provide flexibility in the choice of the number of helpers in each repair, and simultaneously be robust to erroneous nodes in the network. This is achieved by proving the non-singularity of a family of matrices in large enough finite fields. We next provide another extension of the PM codes, based on novel repair schemes which enable flexibility in the number of helpers and robustness against erroneous nodes without any extra cost in field size compared to the original PM codes.
Submitted 7 November, 2017;
originally announced November 2017.
-
Product Matrix Minimum Storage Regenerating Codes with Flexible Number of Helpers
Authors:
Kaveh Mahdaviani,
Soheil Mohajer,
Ashish Khisti
Abstract:
In coding for distributed storage systems, efficient data reconstruction and repair through accessing a predefined number of arbitrarily chosen storage nodes is guaranteed by regenerating codes. Traditionally, code parameters, especially the number of helper nodes participating in a repair process, are predetermined. However, depending on the state of the system and network traffic, it is desirable to adapt such parameters accordingly in order to minimize the cost of repair. In this work a class of regenerating codes with minimum storage is introduced that can simultaneously operate at the optimal repair bandwidth, for a wide range of exact repair mechanisms, based on different numbers of helper nodes.
Submitted 28 December, 2017; v1 submitted 20 August, 2017;
originally announced August 2017.
-
Product Matrix MSR Codes with Bandwidth Adaptive Exact Repair
Authors:
Kaveh Mahdaviani,
Soheil Mohajer,
Ashish Khisti
Abstract:
In a distributed storage system (DSS) with $k$ systematic nodes, robustness against node failure is commonly provided by storing redundancy in a number of other nodes and performing a repair mechanism to reproduce the content of the failed nodes. Efficiency is then achieved by minimizing the storage overhead and the amount of data transmission required for data reconstruction and repair, provided by coding solutions such as regenerating codes [1]. Common explicit regenerating code constructions enable efficient repair through accessing a predefined number, $d$, of arbitrarily chosen available nodes, namely helpers. In practice, however, the state of the system dynamically changes based on the request load, the link traffic, etc., and the parameters which optimize the system's performance vary accordingly. It is then desirable to have coding schemes which are able to operate optimally under a range of different parameters simultaneously. Specifically, adaptivity in the number of helper nodes for repair is of interest. While robustness requires the capability of performing repair with a small number of helpers, it is desirable to use as many helpers as available to reduce the transmission delay and total repair traffic.
In this work we focus on the minimum storage regenerating (MSR) codes, where each node is supposed to store $α$ information units, and the source data of size $kα$ could be recovered from any arbitrary set of $k$ nodes. We introduce a class of MSR codes that realize optimal repair bandwidth simultaneously with a set of different choices for the number of helpers, namely $D=\{d_{1}, \cdots, d_δ\}$. Our coding scheme follows the Product Matrix (PM) framework introduced in [2], and could be considered as a generalization of the PM MSR code presented in [2], such that any $d_{i} = (i+1)(k-1)$ helpers can perform an optimal repair. ...
Submitted 28 December, 2017; v1 submitted 10 August, 2017;
originally announced August 2017.
-
Covert Communication with Channel-State Information at the Transmitter
Authors:
Si-Hyeon Lee,
Ligong Wang,
Ashish Khisti,
Gregory W. Wornell
Abstract:
We consider the problem of covert communication over a state-dependent channel, where the transmitter has causal or noncausal knowledge of the channel states. Here, "covert" means that a warden on the channel should observe similar statistics when the transmitter is sending a message and when it is not. When a sufficiently long secret key is shared between the transmitter and the receiver, we derive closed-form formulas for the maximum achievable covert communication rate ("covert capacity") for discrete memoryless channels and, when the transmitter's channel-state information (CSI) is noncausal, for additive white Gaussian noise (AWGN) channels. For certain channel models, including the AWGN channel, we show that the covert capacity is positive with CSI at the transmitter, but is zero without CSI. We also derive lower bounds on the rate of the secret key that is needed for the transmitter and the receiver to achieve the covert capacity.
Submitted 8 August, 2017;
originally announced August 2017.
-
Secure Broadcasting Using Independent Secret Keys
Authors:
Rafael F. Schaefer,
Ashish Khisti,
H. Vincent Poor
Abstract:
The problem of secure broadcasting with independent secret keys is studied. The particular scenario is analyzed in which a common message has to be broadcast to two legitimate receivers, while keeping an external eavesdropper ignorant of it. The transmitter shares independent secret keys of sufficiently high rates with both legitimate receivers, which can be used in different ways: they can be used as one-time pads to encrypt the common message, as fictitious messages for wiretap coding, or as a hybrid of these. In this paper, capacity results are established when the broadcast channels involving the three receivers are degraded. If both legitimate channels are degraded versions of the eavesdropper's channel, it is shown that the one-time pad approach is optimal for several cases, yielding corresponding capacity expressions. Alternatively, the wiretap coding approach is shown to be optimal if the eavesdropper's channel is degraded with respect to both legitimate channels, establishing capacity in this case as well. If the eavesdropper's channel is neither the strongest nor the weakest, an intricate scheme that carefully combines both concepts of one-time pad and wiretap coding with fictitious messages turns out to be capacity-achieving. Finally, we also obtain some results for the general non-degraded broadcast channel.
Submitted 19 February, 2018; v1 submitted 19 June, 2017;
originally announced June 2017.
-
Tracking and Control of Gauss-Markov Processes over Packet-Drop Channels with Acknowledgments
Authors:
Anatoly Khina,
Victoria Kostina,
Ashish Khisti,
Babak Hassibi
Abstract:
We consider the problem of tracking the state of Gauss-Markov processes over rate-limited erasure-prone links. We concentrate first on the scenario in which several independent processes are seen by a single observer. The observer maps the processes into finite-rate packets that are sent over the erasure-prone links to a state estimator, and are acknowledged upon packet arrivals. The aim of the state estimator is to track the processes with zero delay and with minimum mean square error (MMSE). We show that, in the limit of many processes, greedy quantization with respect to the squared error distortion is optimal. That is, there is no tension between optimizing the MMSE of the process in the current time instant and that of future times. For the case of packet erasures with delayed acknowledgments, we connect the problem to that of compression with side information that is known at the observer and may be known at the state estimator - where the most recent packets serve as side information that may have been erased, and demonstrate that the loss due to a delay by one time unit is rather small. For the scenario where only one process is tracked by the observer-state estimator system, we further show that variable-length coding techniques are within a small gap of the many-process outer bound. We demonstrate the usefulness of the proposed approach for the simple setting of discrete-time scalar linear quadratic Gaussian control with a limited data-rate feedback that is susceptible to packet erasures.
Submitted 23 May, 2018; v1 submitted 6 February, 2017;
originally announced February 2017.
-
The Wiretapped Diamond-Relay Channel
Authors:
Si-Hyeon Lee,
Ashish Khisti
Abstract:
In this paper, we study a diamond-relay channel where the source is connected to $M$ relays through orthogonal links and the relays transmit to the destination over a wireless multiple-access channel in the presence of an eavesdropper. The eavesdropper not only observes the relay transmissions through another multiple-access channel, but also observes a certain number of source-relay links. The legitimate terminals know neither the eavesdropper's channel state information nor the location of source-relay links revealed to the eavesdropper except the total number of such links.
For this wiretapped diamond-relay channel, we establish the optimal secure degrees of freedom. In the achievability part, our proposed scheme uses the source-relay links to transmit a judiciously constructed combination of message symbols, artificial noise symbols as well as fictitious message symbols associated with secure network coding. The relays use a combination of beamforming and interference alignment in their transmission scheme. For the converse part, we take a genie-aided approach assuming that the location of wiretapped links is known.
Submitted 19 June, 2016;
originally announced June 2016.
-
Exact Moderate Deviation Asymptotics in Streaming Data Transmission
Authors:
Si-Hyeon Lee,
Vincent Y. F. Tan,
Ashish Khisti
Abstract:
In this paper, a streaming transmission setup is considered where an encoder observes a new message in the beginning of each block and a decoder sequentially decodes each message after a delay of $T$ blocks. In this streaming setup, the fundamental interplay between the coding rate, the error probability, and the blocklength in the moderate deviations regime is studied. For output symmetric channels, the moderate deviations constant is shown to improve over the block coding or non-streaming setup by exactly a factor of $T$ for a certain range of moderate deviations scalings. For the converse proof, a more powerful decoder to which some extra information is fed forward is assumed. The error probability is bounded first for an auxiliary channel and this result is translated back to the original channel by using a newly developed change-of-measure lemma, where the speed of decay of the remainder term in the exponent is carefully characterized. For the achievability proof, a known coding technique that involves a joint encoding and decoding of fresh and past messages is applied with some manipulations in the error analysis.
Submitted 22 April, 2016;
originally announced April 2016.
-
Streaming Data Transmission in the Moderate Deviations and Central Limit Regimes
Authors:
Si-Hyeon Lee,
Vincent Y. F. Tan,
Ashish Khisti
Abstract:
We consider streaming data transmission over a discrete memoryless channel. A new message is given to the encoder at the beginning of each block and the decoder decodes each message sequentially, after a delay of $T$ blocks. In this streaming setup, we study the fundamental interplay between the rate and error probability in the central limit and moderate deviations regimes and show that i) in the moderate deviations regime, the moderate deviations constant improves over the block coding or non-streaming setup by a factor of $T$ and ii) in the central limit regime, the second-order coding rate improves by a factor of approximately $\sqrt{T}$ for a wide range of channel parameters. For both regimes, we propose coding techniques that incorporate a joint encoding of fresh and previous messages. In particular, for the central limit regime, we propose a coding technique with truncated memory to ensure that a summation of constants, which arises as a result of applications of the central limit theorem, does not diverge in the error analysis.
Furthermore, we explore interesting variants of the basic streaming setup in the moderate deviations regime. We first consider a scenario with an erasure option at the decoder and show that both the exponents of the total error and the undetected error probabilities improve by factors of $T$. Next, by utilizing the erasure option, we show that the exponent of the total error probability can be improved to that of the undetected error probability (in the order sense) at the expense of a variable decoding delay. Finally, we also extend our results to the case where the message rate is not fixed but alternates between two values.
Submitted 19 December, 2015;
originally announced December 2015.
-
Secure Degrees of Freedom of the Gaussian Diamond-Wiretap Channel
Authors:
Si-Hyeon Lee,
Wanyao Zhao,
Ashish Khisti
Abstract:
In this paper, we consider the Gaussian diamond-wiretap channel that consists of an orthogonal broadcast channel from a source to two relays and a Gaussian fast-fading multiple access-wiretap channel from the two relays to a legitimate destination and an eavesdropper. For the multiple access part, we consider both the case with full channel state information (CSI) and the case with no eavesdropper's CSI, at the relays and the legitimate destination. For both cases, we establish the exact secure degrees of freedom and generalize the results for multiple relays.
For the converse part, we introduce a new technique of capturing the trade-off between the message rate and the amount of individual randomness injected at each relay. In the achievability part, we show (i) how to strike a balance between sending message symbols and common noise symbols from the source to the relays in the broadcast component and (ii) how to combine artificial noise-beamforming and noise-alignment techniques at the relays in the multiple access component. In the case with full CSI, we propose a scheme where the relays simultaneously beamform common noise signals in the null space of the legitimate destination's channel, and align them with the message signals at the eavesdropper. In the case with no eavesdropper's CSI, we present a scheme that efficiently utilizes the broadcast links by incorporating computation between the message and common noise symbols at the source. Finally, most of our achievability and converse techniques can also be adapted to the Gaussian (non-fading) channel model.
Submitted 19 December, 2015;
originally announced December 2015.
-
Information-Theoretic Privacy for Smart Metering Systems with a Rechargeable Battery
Authors:
Simon Li,
Ashish Khisti,
Aditya Mahajan
Abstract:
Smart-metering systems report electricity usage of a user to the utility provider on an almost real-time basis. This could leak private information about the user to the utility provider. In this work we investigate the use of a rechargeable battery in order to provide privacy to the user. We assume that the user load sequence is a first-order Markov process, the battery satisfies ideal charge conservation, and that privacy is measured using normalized mutual information (leakage rate) between the user load and the battery output. We consider battery charging policies in this setup that satisfy the feasibility constraints. We propose a series of reductions on the original problem and ultimately recast it as a Markov Decision Process (MDP) that can be solved using a dynamic program. In the special case of i.i.d. demand, we explicitly characterize the optimal policy and show that the associated leakage rate can be expressed as a single-letter mutual information expression. In this case we show that the optimal charging policy admits an intuitive interpretation of preserving a certain invariance property of the state. Interestingly, an alternative proof of optimality can be provided that does not rely on the MDP approach, but is based on purely information theoretic reductions.
Submitted 15 September, 2017; v1 submitted 24 October, 2015;
originally announced October 2015.
-
The MIMO Wiretap Channel Decomposed
Authors:
Anatoly Khina,
Yuval Kochman,
Ashish Khisti
Abstract:
The problem of sending a secret message over the Gaussian multiple-input multiple-output (MIMO) wiretap channel is studied. While the capacity of this channel is known, it is not clear how to construct optimal coding schemes that achieve this capacity. In this work, we use linear operations along with successive interference cancellation to attain effective parallel single-antenna wiretap channels. By using independent scalar Gaussian wiretap codebooks over the resulting parallel channels, the capacity of the MIMO wiretap channel is achieved. The derivation of the schemes is based upon joint triangularization of the channel matrices. We find that the same technique can be used to re-derive capacity expressions for the MIMO wiretap channel in a way that is simple and closely connected to a transmission scheme. This technique allows the previously proven strong security for scalar Gaussian channels to be extended to the MIMO case. We further consider the problem of transmitting confidential messages over a two-user broadcast MIMO channel. For that problem, we find that derivation of both the capacity and a transmission scheme is a direct corollary of the proposed analysis for the MIMO wiretap channel.
Submitted 27 October, 2016; v1 submitted 6 September, 2015;
originally announced September 2015.
-
Convolutional Codes with Maximum Column Sum Rank for Network Streaming
Authors:
Rafid Mahmood,
Ahmed Badr,
Ashish Khisti
Abstract:
The column Hamming distance of a convolutional code determines the error correction capability when streaming over a class of packet erasure channels. We introduce a metric known as the column sum rank, that parallels column Hamming distance when streaming over a network with link failures. We prove rank analogues of several known column Hamming distance properties and introduce a new family of convolutional codes that maximize the column sum rank up to the code memory. Our construction involves finding a class of super-regular matrices that preserve this property after multiplication with non-singular block diagonal matrices in the ground field.
Submitted 16 April, 2016; v1 submitted 11 June, 2015;
originally announced June 2015.
-
The Degraded Gaussian Diamond-Wiretap Channel
Authors:
Si-Hyeon Lee,
Ashish Khisti
Abstract:
In this paper, we present nontrivial upper and lower bounds on the secrecy capacity of the degraded Gaussian diamond-wiretap channel and identify several ranges of channel parameters where these bounds coincide with useful intuitions. Furthermore, we investigate the effect of the presence of an eavesdropper on the capacity. We consider the following two scenarios regarding the availability of randomness: 1) a common randomness is available at the source and the two relays and 2) a randomness is available only at the source and there is no available randomness at the relays. We obtain the upper bound by taking into account the correlation between the two relay signals and the availability of randomness at each encoder. For the lower bound, we propose two types of coding schemes: 1) a decode-and-forward scheme where the relays cooperatively transmit the message and the fictitious message and 2) a partial DF scheme incorporated with multicoding in which each relay sends an independent partial message and the whole or partial fictitious message using dependent codewords.
Submitted 22 April, 2015;
originally announced April 2015.