
Sparsity-Constrained Community-Based Group Testing

Authors: Sarthak Jain, Martina Cardone, Soheil Mohajer

Abstract: In this work, we consider the sparsity-constrained community-based group testing problem, where the population follows a community structure. In particular, the community consists of $F$ families, each with $M$ members. A number $k_f$ out of the $F$ families are infected, and a family is said to be infected if $k_m$ out of its $M$ members are infected. Furthermore, the sparsity constraint allows at most $\rho_T$ individuals to be grouped in each test. For this sparsity-constrained community model, we propose a probabilistic group testing algorithm that can identify the infected population with a vanishing probability of error, and we provide an upper bound on the number of tests. When $k_m = \Theta(M)$ and $M \gg \log(FM)$, our bound outperforms the existing sparsity-constrained group testing results trivially applied to the community model. If the sparsity constraint is relaxed, our achievable bound reduces to existing bounds for community-based group testing. Moreover, our scheme can also be applied to the classical dilution model, where it outperforms existing noise-level-independent schemes in the literature.

Submitted 19 March, 2024; originally announced March 2024.

arXiv:2402.00465 [pdf, ps, other]

Coded Multi-User Information Retrieval with a Multi-Antenna Helper Node

Authors: Milad Abolpour, MohammadJavad Salehi, Soheil Mohajer, Seyed Pooya Shariatpanahi, Antti Tölli

Abstract: A novel coding design is proposed to enhance information retrieval in a wireless network of users with partial access to the data, in the sense of observation, measurement, computation, or storage. Information exchange in the network is assisted by a multi-antenna base station (BS), with no direct access to the data. Accordingly, the missing parts of data are exchanged among users through an uplink (UL) step followed by a downlink (DL) step. In this paper, new coding strategies, inspired by coded caching (CC) techniques, are devised to enhance both UL and DL steps. In the UL step, users transmit encoded and properly combined parts of their accessible data to the BS. Then, during the DL step, the BS carries out the required processing on its received signals and forwards a proper combination of the resulting signal terms back to the users, enabling each user to retrieve the desired information. Using the devised coded data retrieval strategy, the data exchange in both UL and DL steps requires the same communication delay, measured by normalized delivery time (NDT). Furthermore, the NDT of the UL/DL step is shown to coincide with the optimal NDT of the original DL multi-input single-output CC scheme, in which the BS is connected to a centralized data library.

Submitted 1 February, 2024; originally announced February 2024.

Comments: 10 pages

arXiv:2401.17419 [pdf, other]

Few-Shot Channel-Agnostic Analog Coding: A Near-Optimal Scheme

Authors: Mohammad Ali Maddah-Ali, Soheil Mohajer

Abstract: In this paper, we investigate the problem of transmitting an analog source to a destination over $N$ uses of an additive-white-Gaussian-noise (AWGN) channel, where $N$ is very small (on the order of 10 or even less). The proposed coding scheme is based on representing the source symbol using a novel progressive expansion technique, partitioning the digits of expansion into $N$ ordered sets, and finally mapping the symbols in each set to a real number by applying the reverse progressive expansion. In the last step, we introduce some gaps between the signal levels to prevent the carry-over of the additive noise from propagating to other levels. This shields the most significant levels of the signal from an additive noise hitting the signal at a less significant level. The parameters of the progressive expansion and the shielding procedure are opportunistically independent of the $\mathrm{SNR}$ so that the proposed scheme achieves a distortion $D$, where $-\log(D)$ is within $O(\log\log(\mathrm{SNR}))$ of the optimal performance for all values of $\mathrm{SNR}$, leading to a channel-agnostic scheme.

Submitted 30 January, 2024; originally announced January 2024.

arXiv:2305.06526 [pdf, ps, other]

Probabilistic Group Testing in Distributed Computing with Attacked Workers

Authors: Sarthak Jain, Martina Cardone, Soheil Mohajer

Abstract: The problem of distributed matrix-vector product is considered, where the server distributes the task of the computation among $n$ worker nodes, out of which $L$ are compromised (but non-colluding) and may return incorrect results. Specifically, it is assumed that the compromised workers are unreliable, that is, at any given time, each compromised worker may return an incorrect or a correct result with probabilities $\alpha$ and $1-\alpha$, respectively. Thus, the tests are noisy. This work proposes a new probabilistic group testing approach to identify the unreliable/compromised workers with $O\left(\frac{L\log(n)}{\alpha}\right)$ tests. Moreover, using the proposed group testing method, sparse parity-check codes are constructed and used in the considered distributed computing framework for encoding, decoding and identifying the unreliable workers. This methodology has two distinct features: (i) the cost of identifying the set of $L$ unreliable workers at the server can be shown to be considerably lower than existing distributed computing methods, and (ii) the encoding and decoding functions are easily implementable and computationally efficient.

Submitted 10 May, 2023; originally announced May 2023.

arXiv:2301.04753 [pdf, other]

Cache-Aided $K$-User Broadcast Channels with State Information at Receivers

Authors: Hadi Reisizadeh, Mohammad Ali Maddah-Ali, Soheil Mohajer

Abstract: We study a $K$-user coded-caching broadcast problem in a joint source-channel coding framework. The transmitter observes a database of files that are being generated at a certain rate per channel use, and each user has a cache, which can store a fixed fraction of the generated symbols. In the delivery phase, the transmitter broadcasts a message so that the users can decode their desired files using the received signal and their cache content. The communication between the transmitter and the receivers happens over a (deterministic) time-varying erasure broadcast channel, and the channel state information is only available to the users. We characterize the maximum achievable source rate for the $2$-user and the degraded $K$-user problems. We provide an upper bound for any caching strategy's achievable source rates. Finally, we present a linear programming formulation to show that the upper bound is not a sharp characterization. Closing the gap between the achievable rate and the optimum rate remains open.

Submitted 11 October, 2023; v1 submitted 11 January, 2023; originally announced January 2023.

arXiv:2201.01728 [pdf, other]

Matrix Completion with Hierarchical Graph Side Information

Authors: Adel Elmahdy, Junhyung Ahn, Changho Suh, Soheil Mohajer

Abstract: We consider a matrix completion problem that exploits social or item similarity graphs as side information. We develop a universal, parameter-free, and computationally efficient algorithm that starts with hierarchical graph clustering and then iteratively refines estimates both on graph clustering and matrix ratings. Under a hierarchical stochastic block model that well respects practically-relevant social graphs and a low-rank rating matrix model (to be detailed), we demonstrate that our algorithm achieves the information-theoretic limit on the number of observed matrix entries (i.e., optimal sample complexity) that is derived by maximum likelihood estimation together with a lower-bound impossibility result. One consequence of this result is that exploiting the hierarchical structure of social graphs yields a substantial gain in sample complexity relative to the one that simply identifies different groups without resorting to the relational structure across them. We conduct extensive experiments both on synthetic and real-world datasets to corroborate our theoretical results as well as to demonstrate significant performance improvements over other matrix completion algorithms that leverage graph side information.

Submitted 1 January, 2022; originally announced January 2022.

Comments: 53 pages, 3 figures, 1 table. Published in NeurIPS 2020. The first two authors contributed equally to this work. In this revision, achievability proof technique is updated and typos are corrected. arXiv admin note: substantial text overlap with arXiv:2109.05408

Journal ref: Advances in Neural Information Processing Systems 33 (NeurIPS 2020)

arXiv:2201.00313 [pdf, other]

Secure Determinant Codes for Distributed Storage Systems

Authors: Adel Elmahdy, Michelle Kleckler, Soheil Mohajer

Abstract: The information-theoretic secure exact-repair regenerating codes for distributed storage systems (DSSs) with parameters $(n,k=d,d,\ell)$ are studied in this paper. We consider distributed storage systems with $n$ nodes, in which the original data can be recovered from any subset of $k=d$ nodes, and the content of any node can be retrieved from those of any $d$ helper nodes. Moreover, we consider two secrecy constraints, namely, Type-I, where the message remains secure against an eavesdropper with access to the content of any subset of up to $\ell$ nodes, and Type-II, in which the message remains secure against an eavesdropper who can observe the incoming repair data from all possible nodes to a fixed but unknown subset of up to $\ell$ compromised nodes. Two classes of secure determinant codes are proposed for Type-I and Type-II secrecy constraints. Each proposed code can be designed for a range of per-node storage capacity and repair bandwidth for any system parameters. They lead to two achievable secrecy trade-offs, for Type-I and Type-II security.

Submitted 29 December, 2022; v1 submitted 2 January, 2022; originally announced January 2022.

Comments: 22 pages, 8 figures. The first two authors contributed equally to this work. Accepted for publication at IEEE Transactions on Information Theory

arXiv:2109.05408 [pdf, other]

On the Fundamental Limits of Matrix Completion: Leveraging Hierarchical Similarity Graphs

Authors: Junhyung Ahn, Adel Elmahdy, Soheil Mohajer, Changho Suh

Abstract: We study the matrix completion problem that leverages hierarchical similarity graphs as side information in the context of recommender systems. Under a hierarchical stochastic block model that well respects practically-relevant social graphs and a low-rank rating matrix model, we characterize the exact information-theoretic limit on the number of observed matrix entries (i.e., optimal sample complexity) by proving sharp upper and lower bounds on the sample complexity. In the achievability proof, we demonstrate that the probability of error of the maximum likelihood estimator vanishes for a sufficiently large number of users and items, if all sufficient conditions are satisfied. On the other hand, the converse (impossibility) proof is based on the genie-aided maximum likelihood estimator. Under each necessary condition, we present examples of a genie-aided estimator to prove that the probability of error does not vanish for a sufficiently large number of users and items. One important consequence of this result is that exploiting the hierarchical structure of social graphs yields a substantial gain in sample complexity relative to the one that simply identifies different groups without resorting to the relational structure across them. More specifically, we analyze the optimal sample complexity and identify different regimes whose characteristics rely on quality metrics of side information of the hierarchical similarity graph. Finally, we present simulation results to corroborate our theoretical findings and show that the characterized information-theoretic limit can be asymptotically achieved.

Submitted 11 September, 2021; originally announced September 2021.

Comments: The first two authors contributed equally to this work. A preliminary version of this work was presented at the 2020 Advances in Neural Information Processing Systems Conference (NeurIPS 2020)

arXiv:2105.03064 [pdf, other]

When an Energy-Efficient Scheduling is Optimal for Half-Duplex Relay Networks?

Authors: Sarthak Jain, Martina Cardone, Soheil Mohajer

Abstract: This paper considers a diamond network with $n$ interconnected relays, namely a network where a source communicates with a destination by hopping information through $n$ communicating/interconnected relays. Specifically, the main focus of the paper is on characterizing sufficient conditions under which the $n+1$ states (out of the $2^{n}$ possible ones) in which at most one relay is transmitting suffice to characterize the approximate capacity, that is, the Shannon capacity up to an additive gap that only depends on $n$. Furthermore, under these sufficient conditions, closed-form expressions for the approximate capacity and scheduling (that is, the fraction of time each relay should receive and transmit) are provided. A similar result is presented for the dual case, where in each state at most one relay is in receive mode.

Submitted 12 August, 2021; v1 submitted 7 May, 2021; originally announced May 2021.

arXiv:2001.02851 [pdf, other]

Best Relay Selection in Gaussian Half-Duplex Diamond Networks

Authors: Sarthak Jain, Soheil Mohajer, Martina Cardone

Abstract: This paper considers Gaussian half-duplex diamond $n$-relay networks, where a source communicates with a destination by hopping information through one layer of $n$ non-communicating relays that operate in half-duplex. The main focus consists of investigating the following question: What is the contribution of a single relay on the approximate capacity of the entire network? In particular, approximate capacity refers to a quantity that approximates the Shannon capacity within an additive gap which only depends on $n$, and is independent of the channel parameters. This paper answers the above question by providing a fundamental bound on the ratio between the approximate capacity of the highest-performing single relay and the approximate capacity of the entire network, for any number $n$. Surprisingly, it is shown that such a ratio guarantee is $f = 1/(2+2\cos(2\pi/(n+2)))$, that is, a sinusoidal function of $n$, which decreases as $n$ increases. It is also shown that the aforementioned ratio guarantee is tight, i.e., there exist Gaussian half-duplex diamond $n$-relay networks, where the highest-performing relay has an approximate capacity equal to an $f$ fraction of the approximate capacity of the entire network.

Submitted 9 January, 2020; originally announced January 2020.

Comments: 30 pages, 5 figures, journal
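
Illustration (not part of the arXiv listing): the ratio guarantee $f = 1/(2+2\cos(2\pi/(n+2)))$ quoted in the abstract above can be evaluated numerically. The following is a minimal sketch; the function name `best_relay_ratio` is a hypothetical helper chosen here, and the code only evaluates the stated formula without reproducing any of the paper's analysis.

```python
import math


def best_relay_ratio(n: int) -> float:
    """Evaluate f = 1 / (2 + 2*cos(2*pi/(n+2))) for n relays.

    Hypothetical helper: it only plugs n into the formula quoted
    in the abstract.
    """
    if n < 1:
        raise ValueError("n must be a positive number of relays")
    return 1.0 / (2.0 + 2.0 * math.cos(2.0 * math.pi / (n + 2)))


if __name__ == "__main__":
    # Print the guarantee for a few network sizes.
    for n in (1, 2, 4, 10, 100):
        print(n, round(best_relay_ratio(n), 4))
```

Under this formula a single relay ($n=1$) gives $f=1$, two relays give $f=1/2$, and the value decreases toward $1/4$ as $n$ grows, consistent with the abstract's statement that the ratio decreases as $n$ increases.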

arXiv:1901.00911 [pdf, other]

doi 10.1109/TIT.2020.3033338

Cascade Codes For Distributed Storage Systems

Authors: Mehran Elyasi, Soheil Mohajer

Abstract: A novel coding scheme for exact-repair regenerating codes is presented in this paper. The codes proposed in this work can trade between the repair bandwidth of nodes (number of downloaded symbols from each surviving node in a repair process) and the required storage overhead of the system. These codes work for general system parameters $(n,k,d)$, which are the total number of nodes, the number of nodes sufficient for data recovery, and the number of helper nodes in a repair process, respectively. The proposed construction offers a unified scheme to develop exact-repair regenerating codes for the entire trade-off, including the MBR and MSR points. We conjecture that the new storage-vs.-bandwidth trade-off achieved by the proposed codes is optimum. Some other key features of this code include: the construction is linear; the required field size is only $\Theta(n)$; and the code parameters, and in particular the sub-packetization level, are at most $(d-k+1)^k$, which is independent of the number of parity nodes. Moreover, the proposed repair mechanism is helper-independent, that is, the data sent from each helper depends only on the identity of the helper and failed nodes, and is independent of the identity of the other helper nodes participating in the repair process.

Submitted 19 October, 2020; v1 submitted 3 January, 2019; originally announced January 2019.

arXiv:1812.01142 [pdf, other]

doi 10.1109/TIT.2019.2904294

Determinant Codes with Helper-Independent Repair for Single and Multiple Failures

Authors: Mehran Elyasi, Soheil Mohajer

Abstract: Determinant codes are a class of exact-repair regenerating codes for distributed storage systems with parameters (n, k = d, d). These codes cover the entire trade-off between per-node storage and repair-bandwidth. In an earlier work of the authors, the repair data of the determinant code sent by a helper node to repair a failed node depends on the identity of the other helper nodes participating in the process, which is practically undesired. In this work, a new repair mechanism is proposed for determinant codes, which relaxes this dependency, while preserving all other properties of the code. Moreover, it is shown that the determinant codes are capable of repairing multiple failures, with a per-node repair-bandwidth which scales sub-linearly with the number of failures.

Submitted 7 March, 2019; v1 submitted 3 December, 2018; originally announced December 2018.

arXiv:1808.02780 [pdf, other]

Cache Aided Communications with Multiple Antennas at Finite SNR

Authors: Itsik Bergel, Soheil Mohajer

Abstract: We study the problem of cache-aided communication for cellular networks with multi-user and multiple antennas at finite signal-to-noise ratio. Users are assumed to have non-symmetric links, modeled by wideband fading channels. We show that the problem can be formulated as a linear program, whose solution provides a joint cache allocation along with pre-fetching and fetching schemes that minimize the duration of the communication in the delivery phase. The suggested scheme uses zero-forcing and cached interference subtraction and hence allows each user to be served at the rate of its own channel. Thus, this scheme is better than the previously published schemes that are compromised by the poorest user in the communication group. We also consider a special case of the parameters for which we can derive a closed-form solution and formulate the optimal power, rate and cache optimization. This special case shows that the gain of MIMO coded caching goes beyond the throughput. In particular, it is shown that in this case, the cache is used to balance the users such that fairness and throughput are no longer contradicting. More specifically, in this case, strict fairness is achieved jointly with maximizing the network throughput.

Submitted 8 August, 2018; originally announced August 2018.

Comments: 19 pages, 4 figures, submitted to IEEE Journal on Selected Areas in Communications

arXiv:1807.04255 [pdf, other]

doi 10.1109/TIT.2020.2964547

On the Fundamental Limits of Coded Data Shuffling for Distributed Machine Learning

Authors: Adel Elmahdy, Soheil Mohajer

Abstract: We consider the data shuffling problem in a distributed learning system, in which a master node is connected to a set of worker nodes, via a shared link, in order to communicate a set of files to the worker nodes. The master node has access to a database of files. In every shuffling iteration, each worker node processes a new subset of files, and has excess storage to partially cache the remaining files, assuming the cached files are uncoded. The caches of the worker nodes are updated every iteration, and they should be designed to satisfy any possible unknown permutation of the files in subsequent iterations. For this problem, we characterize the exact load-memory trade-off for worst-case shuffling by deriving the minimum communication load for a given storage capacity per worker node. As a byproduct, the exact load-memory trade-off for any shuffling is characterized when the number of files is equal to the number of worker nodes. We propose a novel deterministic coded shuffling scheme, which improves the state of the art, by exploiting the cache memories to create coded functions that can be decoded by several worker nodes. Then, we prove the optimality of our proposed scheme by deriving a matching lower bound and showing that the placement phase of the proposed coded shuffling scheme is optimal over all shuffles.

Submitted 20 June, 2020; v1 submitted 11 July, 2018; originally announced July 2018.

Comments: This work has been published in IEEE Transactions on Information Theory. A preliminary version of this work was presented at IEEE International Symposium on Information Theory (ISIT), Jun. 2018

Journal ref: IEEE Transactions on Information Theory, vol. 66, no. 5, pp. 3098-3131, May 2020

arXiv:1711.02770 [pdf, ps, other]

Bandwidth Adaptive & Error Resilient MBR Exact Repair Regenerating Codes

Authors: Kaveh Mahdaviani, Ashish Khisti, Soheil Mohajer

Abstract: Regenerating codes are efficient methods for distributed storage in storage networks, where node failures are common. They guarantee low-cost data reconstruction and repair through accessing only a predefined number of arbitrarily chosen storage nodes in the network. In this work we consider two simultaneous extensions to the original regenerating codes framework introduced in [1]: i) both data reconstruction and repair are resilient to the presence of a certain number of erroneous nodes in the network, and ii) the number of helper nodes in every repair is not fixed, but is a flexible parameter that can be selected during the runtime. We study the fundamental limits of required total repair bandwidth and provide an upper bound for the storage capacity of these codes under these assumptions. We then focus on the minimum repair bandwidth (MBR) case and derive the exact storage capacity by presenting explicit coding schemes with exact repair, which achieve the upper bound of the storage capacity in the considered setup. To this end, we first provide a more natural extension of the well-known Product Matrix (PM) MBR codes [2], modified to provide flexibility in the choice of the number of helpers in each repair, and simultaneously be robust to erroneous nodes in the network. This is achieved by proving the non-singularity of a family of matrices in large enough finite fields. We next provide another extension of the PM codes, based on novel repair schemes which enable flexibility in the number of helpers and robustness against erroneous nodes without any extra cost in field size compared to the original PM codes.

Submitted 7 November, 2017; originally announced November 2017.

Comments: This manuscript is submitted to the IEEE Transactions on Information Theory

arXiv:1708.06012 [pdf, ps, other]

Product Matrix Minimum Storage Regenerating Codes with Flexible Number of Helpers

Authors: Kaveh Mahdaviani, Soheil Mohajer, Ashish Khisti

Abstract: In coding for distributed storage systems, efficient data reconstruction and repair through accessing a predefined number of arbitrarily chosen storage nodes is guaranteed by regenerating codes. Traditionally, code parameters, especially the number of helper nodes participating in a repair process, are predetermined. However, depending on the state of the system and network traffic, it is desirable to adapt such parameters accordingly in order to minimize the cost of repair. In this work a class of regenerating codes with minimum storage is introduced that can simultaneously operate at the optimal repair bandwidth, for a wide range of exact repair mechanisms, based on different numbers of helper nodes.

Submitted 28 December, 2017; v1 submitted 20 August, 2017; originally announced August 2017.

Comments: IEEE Information Theory Workshop (ITW) 2017

arXiv:1708.03402 [pdf, ps, other]

Product Matrix MSR Codes with Bandwidth Adaptive Exact Repair

Authors: Kaveh Mahdaviani, Soheil Mohajer, Ashish Khisti

Abstract: In a distributed storage system (DSS) with $k$ systematic nodes, robustness against node failure is commonly provided by storing redundancy in a number of other nodes and performing a repair mechanism to reproduce the content of the failed nodes. Efficiency is then achieved by minimizing the storage overhead and the amount of data transmission required for data reconstruction and repair, provided by coding solutions such as regenerating codes [1]. Common explicit regenerating code constructions enable efficient repair through accessing a predefined number, $d$, of arbitrarily chosen available nodes, namely helpers. In practice, however, the state of the system dynamically changes based on the request load, the link traffic, etc., and the parameters which optimize the system's performance vary accordingly. It is then desirable to have coding schemes which are able to operate optimally under a range of different parameters simultaneously. Specifically, adaptivity in the number of helper nodes for repair is of interest. While robustness requires the capability of performing repair with a small number of helpers, it is desirable to use as many helpers as available to reduce the transmission delay and total repair traffic. In this work we focus on the minimum storage regenerating (MSR) codes, where each node is supposed to store $\alpha$ information units, and the source data of size $k\alpha$ could be recovered from any arbitrary set of $k$ nodes. We introduce a class of MSR codes that realize optimal repair bandwidth simultaneously with a set of different choices for the number of helpers, namely $D=\{d_{1}, \cdots, d_{\delta}\}$. Our coding scheme follows the Product Matrix (PM) framework introduced in [2], and could be considered as a generalization of the PM MSR code presented in [2], such that any $d_{i} = (i+1)(k-1)$ helpers can perform an optimal repair. ...

Submitted 28 December, 2017; v1 submitted 10 August, 2017; originally announced August 2017.

arXiv:1604.04961 [pdf, other]

Role of a Relay in Bursty Multiple Access Channels

Authors: Sunghyun Kim, Soheil Mohajer, Changho Suh

Abstract: We investigate the role of a relay in multiple access channels (MACs) with bursty user traffic, where intermittent data traffic restricts the users to bursty transmissions. As our main result, we characterize the degrees of freedom (DoF) region of a $K$-user bursty multi-input multi-output (MIMO) Gaussian MAC with a relay, where Bernoulli random states are introduced to govern bursty user transmissions. To that end, we extend the noisy network coding scheme to achieve the cut-set bound. Our main contribution is in exploring the role of a relay from various perspectives. First, we show that a relay can provide a DoF gain in bursty channels, unlike in conventional non-bursty channels. Interestingly, we find that the relaying gain can scale with additional antennas at the relay to some extent. Moreover, observing that a relay can help achieve collision-free performance, we establish the necessary and sufficient condition for attaining collision-free DoF. Lastly, we consider scenarios in which some physical perturbation shared around the users may generate data traffic simultaneously, causing transmission patterns across them to be correlated. We demonstrate that for most cases in such scenarios, the relaying gain is greater when the users' transmission patterns are more correlated, hence when more severe collisions take place. Our results have practical implications in various scenarios of wireless networks such as device-to-device systems and random media access control protocols.

Submitted 17 April, 2016; originally announced April 2016.

Comments: 26 pages, 13 figures, submitted to the IEEE Transactions on Information Theory

arXiv:1510.02032 [pdf, other]

A Probabilistic Approach Towards Exact-Repair Regeneration Codes

Authors: Mehran Elyasi, Soheil Mohajer

Abstract: Regeneration codes with the exact-repair property for distributed storage systems are studied in this paper. For the exact-repair problem, the achievable points of the (α,β) tradeoff match the outer bound only for minimum storage regenerating (MSR), minimum bandwidth regenerating (MBR), and some specific values of n, k, and d. Such a tradeoff is characterized in this work for general (n, k, k) systems (i.e., k = d) for some range of per-node storage (α) and repair-bandwidth (β). Rather than explicit code construction, achievability of these tradeoff points is shown by proving the existence of exact-repair regeneration codes for any (n, k, k). More precisely, it is shown that an (n, k, k) system can be extended by adding a new node, which is randomly picked from some ensemble, and it is proved that, with high probability, the existing nodes together with the newly added one maintain the properties of exact-repair regeneration codes. The new achievable region improves upon the existing code constructions. In particular, this result provides a complete tradeoff characterization for an (n, 3, 3) distributed storage system for any value of n.

Submitted 7 October, 2015; originally announced October 2015.

arXiv:1201.6313 [pdf, ps, other]

doi 10.1109/ISIT.2012.6283623

On X-Channels with Feedback and Delayed CSI

Authors: Ravi Tandon, Soheil Mohajer, H. Vincent Poor, Shlomo Shamai

Abstract: The sum degrees of freedom (DoF) of the two-user MIMO X-channel is characterized in the presence of output feedback and delayed channel state information (CSI). The number of antennas at each transmitter is assumed to be M and the number of antennas at each of the receivers is assumed to be N. It is shown that the sum DoF of the two-user MIMO X-channel is the same as the sum DoF of a two-user MIMO broadcast channel with 2M transmit antennas, and N antennas at each receiver. Hence, for this symmetric antenna configuration, there is no performance loss in the sum degrees of freedom due to the distributed nature of the transmitters. This result highlights the usefulness of feedback and delayed CSI for the MIMO X-channel. The K-user X-channel with a single antenna at each transmitter and each receiver is also studied. In this network, each transmitter has a message intended for each receiver. For this network, it is shown that the sum DoF with partial output feedback alone is at least 2K/(K+1). This lower bound is strictly better than the best lower bound known under the delayed CSI assumption for all values of K.

Submitted 30 January, 2012; originally announced January 2012.

Comments: Submitted to IEEE ISIT 2012 on Jan 22, 2012

arXiv:1110.6487 [pdf, ps, other]

On the Feedback Capacity of the Fully Connected $K$-User Interference Channel

Authors: Soheil Mohajer, Ravi Tandon, H. Vincent Poor

Abstract: The symmetric K-user interference channel with fully connected topology is considered, in which (a) each receiver suffers interference from all other (K-1) transmitters, and (b) each transmitter has causal and noiseless feedback from its respective receiver. The number of generalized degrees of freedom (GDoF) is characterized in terms of α, where the interference-to-noise ratio (INR) is given by INR=SNR^α. It is shown that the per-user GDoF of this network is the same as that of the 2-user interference channel with feedback, except for α=1, for which the existence of feedback does not help in terms of GDoF. The coding scheme proposed for this network, termed cooperative interference alignment, is based on two key ingredients, namely, interference alignment and interference decoding. Moreover, an approximate characterization is provided for the symmetric feedback capacity of the network, when the SNR and INR are far apart from each other.

Submitted 21 December, 2012; v1 submitted 28 October, 2011; originally announced October 2011.

Comments: 20 pages, 4 figures, to appear in IEEE Transactions on Information Theory

arXiv:1109.5373 [pdf, ps, other]

doi 10.1109/TIT.2012.2226700

Degrees of Freedom Region of the MIMO Interference Channel with Output Feedback and Delayed CSIT

Authors: Ravi Tandon, Soheil Mohajer, H. Vincent Poor, Shlomo Shamai

Abstract: The two-user multiple-input multiple-output (MIMO) interference channel (IC) with an arbitrary number of antennas at each terminal is considered and the degrees of freedom (DoF) region is characterized in the presence of noiseless channel output feedback from each receiver to its respective transmitter and availability of delayed channel state information at the transmitters (CSIT). It is shown that having output feedback and delayed CSIT can strictly enlarge the DoF region of the MIMO IC when compared to the case in which only delayed CSIT is present. The proposed coding schemes that achieve the corresponding DoF region with feedback and delayed CSIT utilize both resources, i.e., feedback and delayed CSIT, in a non-trivial manner. It is also shown that the DoF region with local feedback and delayed CSIT is equal to the DoF region with global feedback and delayed CSIT, i.e., local feedback and delayed CSIT is equivalent to global feedback and delayed CSIT from the perspective of the degrees of freedom region. The converse is proved for a stronger setting in which the channels to the two receivers need not be statistically equivalent.

Submitted 25 October, 2012; v1 submitted 25 September, 2011; originally announced September 2011.

Comments: Accepted for publication in IEEE Transactions on Information Theory

arXiv:1109.1507 [pdf, other]

doi 10.1109/Allerton.2011.6120256

On the Symmetric Feedback Capacity of the K-user Cyclic Z-Interference Channel

Authors: Ravi Tandon, Soheil Mohajer, H. Vincent Poor

Abstract: The K-user cyclic Z-interference channel models a situation in which the kth transmitter causes interference only to the (k-1)th receiver in a cyclic manner, e.g., the first transmitter causes interference only to the Kth receiver. The impact of noiseless feedback on the capacity of this channel is studied by focusing on the Gaussian cyclic Z-interference channel. To this end, the symmetric feedback capacity of the linear shift deterministic cyclic Z-interference channel (LD-CZIC) is completely characterized for all interference regimes. Using insights from the linear deterministic channel model, the symmetric feedback capacity of the Gaussian cyclic Z-interference channel is characterized up to within a constant number of bits. As a byproduct of the constant gap result, the symmetric generalized degrees of freedom with feedback for the Gaussian cyclic Z-interference channel are also characterized. These results highlight that the symmetric feedback capacities for both linear and Gaussian channel models are in general functions of K, the number of users. Furthermore, the capacity gain obtained due to feedback decreases as K increases.

Submitted 21 December, 2012; v1 submitted 7 September, 2011; originally announced September 2011.

Comments: Accepted for publication in IEEE Transactions on Information Theory

arXiv:1108.2234 [pdf, ps, other]

doi 10.1109/SmartGridComm.2011.6102315

Smart Meter Privacy: A Utility-Privacy Framework

Authors: S. Raj Rajagopalan, Lalitha Sankar, Soheil Mohajer, H. Vincent Poor

Abstract: End-user privacy in smart meter measurements is a well-known challenge in the smart grid. The solutions offered thus far have been tied to specific technologies such as batteries or assumptions on data usage. Existing solutions have also not quantified the loss of benefit (utility) that results from any such privacy-preserving approach. Using tools from information theory, a new framework is presented that abstracts both the privacy and the utility requirements of smart meter data. This leads to a novel privacy-utility tradeoff problem with minimal assumptions that is tractable. Specifically, for a stationary Gaussian Markov model of the electricity load, it is shown that the optimal utility-and-privacy preserving solution requires filtering out frequency components that are low in power, and this approach appears to encompass most of the proposed privacy approaches.

Submitted 10 August, 2011; originally announced August 2011.

Comments: Accepted for publication and presentation at the IEEE SmartGridComm. 2011

arXiv:1005.0404 [pdf, ps, other]

Approximate Capacity of Gaussian Interference-Relay Networks with Weak Cross Links

Authors: Soheil Mohajer, Suhas N. Diggavi, Christina Fragouli, David N. C. Tse

Abstract: In this paper we study a Gaussian relay-interference network, in which relay (helper) nodes are to facilitate competing information flows over a wireless network. We focus on a two-stage relay-interference network where there are weak cross-links, causing the networks to behave like a chain of Z Gaussian channels. For these Gaussian ZZ and ZS networks, we establish an approximate characterization of the rate region. The outer bounds to the capacity region are established using genie-aided techniques that yield bounds sharper than the traditional cut-set outer bounds. For the inner bound of the ZZ network, we propose a new interference management scheme, termed interference neutralization, which is implemented using structured lattice codes. This technique allows for over-the-air interference removal, without the transmitters having complete access to the interfering signals. For both the ZZ and ZS networks, we establish a new network decomposition technique that (approximately) achieves the capacity region. We use insights gained from an exact characterization of the corresponding linear deterministic version of the problems, in order to establish the approximate characterization for Gaussian networks.

Submitted 3 May, 2010; originally announced May 2010.

Comments: 66 pages, 19 figures, submitted to IEEE Trans. on Information Theory

arXiv:1001.1658 [pdf, ps, other]

On the Capacity of Non-Coherent Network Coding

Authors: Mahdi Jafari Siavoshani, Soheil Mohajer, Christina Fragouli, Suhas Diggavi

Abstract: We consider the problem of multicasting information from a source to a set of receivers over a network where intermediate network nodes perform randomized network coding operations on the source packets. We propose a channel model for the non-coherent network coding introduced by Koetter and Kschischang in [6], that captures the essence of such a network operation, and calculate the capacity as a function of network parameters. We prove that the use of subspace coding is optimal, and show that, in some cases, the capacity-achieving distribution uses subspaces of several dimensions, where the employed dimensions depend on the packet length. This model and the results also allow us to give guidelines on when subspace coding is beneficial for the proposed model and by how much, in comparison to a coding vector approach, from a capacity viewpoint. We extend our results to the case of multiple-source multicast that creates a virtual multiple access channel.

Submitted 16 November, 2010; v1 submitted 11 January, 2010; originally announced January 2010.

arXiv:1001.1445 [pdf, other]

Graph-Constrained Group Testing

Authors: Mahdi Cheraghchi, Amin Karbasi, Soheil Mohajer, Venkatesh Saligrama

Abstract: Non-adaptive group testing involves grouping arbitrary subsets of $n$ items into different pools. Each pool is then tested and defective items are identified. A fundamental question involves minimizing the number of pools required to identify at most $d$ defective items. Motivated by applications in network tomography, sensor networks and infection propagation, a variation of group testing problems on graphs is formulated. Unlike conventional group testing problems, each group here must conform to the constraints imposed by a graph. For instance, items can be associated with vertices and each pool is any set of nodes that must be path connected. In this paper, a test is associated with a random walk. In this context, conventional group testing corresponds to the special case of a complete graph on $n$ vertices. For interesting classes of graphs a rather surprising result is obtained, namely, that the number of tests required to identify $d$ defective items is substantially similar to what is required in conventional group testing problems, where no such constraints on pooling are imposed. Specifically, if $T(n)$ corresponds to the mixing time of the graph $G$, it is shown that with $m=O(d^2T^2(n)\log(n/d))$ non-adaptive tests, one can identify the defective items. Consequently, for the Erdős-Rényi random graph $G(n,p)$, as well as expander graphs with constant spectral gap, it follows that $m=O(d^2\log^3n)$ non-adaptive tests are sufficient to identify $d$ defective items.
Next, a specific scenario is considered that arises in network tomography, for which it is shown that $m=O(d^3\log^3n)$ non-adaptive tests are sufficient to identify $d$ defective items. Noisy counterparts of the graph-constrained group testing problem are considered, for which parallel results are developed. We also briefly discuss extensions to compressive sensing on graphs.

Submitted 22 July, 2011; v1 submitted 9 January, 2010; originally announced January 2010.

Comments: Full version to appear in IEEE Transactions on Information Theory. A preliminary summary of this work appeared (under the same title) in proceedings of the 2010 IEEE International Symposium on Information Theory
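
The pooling idea in the abstract above (each test is the set of vertices visited by a random walk) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the cycle graph, the walk length, the number of pools, and the simple elimination decoder (clear every item that appears in a negative pool) are illustrative choices made here, not the construction or analysis of the paper. The names `random_walk_pool`, `run_tests`, and `eliminate` are hypothetical.

```python
import random

def random_walk_pool(adj, length, rng):
    """Return the set of vertices visited by a random walk of `length` steps.

    `adj` maps each vertex to a non-empty list of neighbours. The start
    vertex is chosen uniformly at random (an assumption of this sketch).
    """
    v = rng.choice(sorted(adj))
    visited = {v}
    for _ in range(length):
        v = rng.choice(adj[v])
        visited.add(v)
    return visited

def run_tests(pools, defectives):
    """Noiseless test outcomes: a pool is positive iff it contains a defective."""
    return [bool(pool & defectives) for pool in pools]

def eliminate(vertices, pools, outcomes):
    """Elimination decoder (assumed here): drop every item seen in a negative pool."""
    candidates = set(vertices)
    for pool, positive in zip(pools, outcomes):
        if not positive:
            candidates -= pool
    return candidates

# Illustrative setup: a cycle on 30 vertices with 2 defective items.
n = 30
adj = {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}
rng = random.Random(0)
defectives = {3, 17}
pools = [random_walk_pool(adj, length=6, rng=rng) for _ in range(200)]
outcomes = run_tests(pools, defectives)
candidates = eliminate(range(n), pools, outcomes)
```

In the noiseless setting the true defectives always survive elimination, so `candidates` is a superset of `defectives`; how many walks are needed for the two sets to coincide is exactly the kind of question the bounds in the abstract address.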

arXiv:0911.4880 [pdf, ps, other]

An Estimation Theoretic Approach for Sparsity Pattern Recovery in the Noisy Setting

Authors: Ali Hormati, Amin Karbasi, Soheil Mohajer, Martin Vetterli

Abstract: Compressed sensing deals with the reconstruction of sparse signals using a small number of linear measurements. One of the main challenges in compressed sensing is to find the support of a sparse signal. In the literature, several bounds on the scaling law of the number of measurements for successful support recovery have been derived where the main focus is on random Gaussian measurement matrices. In this paper, we investigate the noisy support recovery problem from an estimation theoretic point of view, where no specific assumption is made on the underlying measurement matrix. The linear measurements are perturbed by additive white Gaussian noise. We define the output of a support estimator to be a set of position values in increasing order. We set the error between the true and estimated supports as the $\ell_2$-norm of their difference. On the one hand, this choice allows us to use the machinery behind the $\ell_2$-norm error metric and on the other hand, converts the support recovery into a more intuitive and geometrical problem. First, by using the Hammersley-Chapman-Robbins (HCR) bound, we derive a fundamental lower bound on the performance of any unbiased estimator of the support set. This lower bound provides us with necessary conditions on the number of measurements for reliable $\ell_2$-norm support recovery, which we specifically evaluate for uniform Gaussian measurement matrices. Then, we analyze the maximum likelihood estimator and derive conditions under which the HCR bound is achievable.
This leads us to the number of measurements for the optimum decoder which is sufficient for reliable $\ell_2$-norm support recovery. Using this framework, we specifically evaluate sufficient conditions for uniform Gaussian measurement matrices.

Submitted 25 November, 2009; originally announced November 2009.

arXiv:0911.2346 [pdf, ps, other]

Asymmetric Multilevel Diversity Coding and Asymmetric Gaussian Multiple Descriptions

Authors: Soheil Mohajer, Chao Tian, Suhas N. Diggavi

Abstract: We consider the asymmetric multilevel diversity (A-MLD) coding problem, where a set of $2^K-1$ information sources, ordered in a decreasing level of importance, is encoded into $K$ messages (or descriptions). There are $2^K-1$ decoders, each of which has access to a non-empty subset of the encoded messages. Each decoder is required to reproduce the information sources up to a certain importance level depending on the combination of descriptions available to it. We obtain a single-letter characterization of the achievable rate region for the 3-description problem. In contrast to symmetric multilevel diversity coding, source-separation coding is not sufficient in the asymmetric case, and ideas akin to network coding need to be used strategically. Based on the intuitions gained in treating the A-MLD problem, we derive inner and outer bounds for the rate region of the asymmetric Gaussian multiple description (MD) problem with three descriptions. Both the inner and outer bounds have a similar geometric structure to the rate region template of the A-MLD coding problem, and moreover, we show that the gap between them is small, which results in an approximate characterization of the asymmetric Gaussian three-description rate region.

Submitted 12 November, 2009; originally announced November 2009.

Comments: 42 pages, 9 figures, submitted to IEEE Transactions on Information Theory

arXiv:0812.1597 [pdf, ps, other]

Transmission Techniques for Relay-Interference Networks

Authors: Soheil Mohajer, Suhas N. Diggavi, Christina Fragouli, David N. C. Tse

Abstract: In this paper we study the relay-interference wireless network, in which relay (helper) nodes are to facilitate competing information flows over a wireless network. We examine this in the context of a deterministic wireless interaction model, which eliminates the channel noise and focuses on the signal interactions. Using this model, we show that almost all the known schemes such as interference suppression, interference alignment and interference separation are necessary for relay-interference networks. In addition, we discover a new interference management technique, which we call interference neutralization, which allows for over-the-air interference removal, without the transmitters having complete access to the interfering signals. We show that interference separation, suppression, and neutralization arise in a fundamental manner, since we show complete characterizations for special configurations of the relay-interference network.

Submitted 8 December, 2008; originally announced December 2008.

Comments: 8 pages, 8 figures, presented at the 46th Allerton Conf. on Comm., Control, and Computing, 2008

arXiv:0810.3631 [pdf, ps, other]

doi 10.1109/TIT.2009.2023704

Approximating the Gaussian Multiple Description Rate Region Under Symmetric Distortion Constraints

Authors: Chao Tian, Soheil Mohajer, Suhas N. Diggavi

Abstract: We consider multiple description coding for the Gaussian source with K descriptions under the symmetric mean squared error distortion constraints, and provide an approximate characterization of the rate region. We show that the rate region can be sandwiched between two polytopes, between which the gap can be upper bounded by constants dependent on the number of descriptions, but independent of the exact distortion constraints. Underlying this result is an exact characterization of the lossless multi-level diversity source coding problem: a lossless counterpart of the MD problem. This connection provides a polytopic template for the inner and outer bounds to the rate region. In order to establish the outer bound, we generalize Ozarow's technique to introduce a strategic expansion of the original probability space by more than one random variable. For the symmetric rate case with any number of descriptions, we show that the gap between the upper bound and the lower bound for the individual description rate is no larger than 0.92 bit. The results developed in this work also suggest that the "separation" approach of combining successive refinement quantization and lossless multi-level diversity coding is a competitive one, since it is only a constant away from the optimum. The results are further extended to general sources under the mean squared error distortion measure, where a similar but looser bound on the gap holds.

Submitted 20 October, 2008; originally announced October 2008.

Comments: 46 pages, 5 figures, submitted to IEEE Trans. on Information Theory

arXiv:0706.3480 [pdf, ps, other]

Tight Bounds on the Average Length, Entropy, and Redundancy of Anti-Uniform Huffman Codes

Authors: Soheil Mohajer, Ali Kakhbod

Abstract: In this paper we consider the class of anti-uniform Huffman (AUH) codes and derive tight lower and upper bounds on the average length, entropy, and redundancy of such codes in terms of the alphabet size of the source. The Fibonacci distributions, which play a fundamental role in AUH codes, are introduced. It is shown that such distributions maximize the average length and the entropy of the code for a given alphabet size. Another previously known bound on the entropy for a given average length follows immediately from our results.

Submitted 23 June, 2007; originally announced June 2007.

Comments: 9 pages, 2 figures

Journal ref: IET Communications, vol. 5, no. 9, pp. 1213-1219, 2011

arXiv:cs/0508039 [pdf, ps, other]

doi 10.1109/ITW.2006.1633796

Tight Bounds on the Redundancy of Huffman Codes

Authors: Soheil Mohajer, Payam Pakzad, Ali Kakhbod

Abstract: In this paper we study the redundancy of Huffman codes. In particular, we consider sources for which the probability of one of the source symbols is known. We prove a conjecture of Ye and Yeung regarding the upper bound on the redundancy of such Huffman codes, which yields a tight upper bound. We also derive a tight lower bound for the redundancy under the same assumption. We further apply the method introduced in this paper to other related problems. It is shown that several other previously known bounds with different constraints follow immediately from our results.

Submitted 14 June, 2012; v1 submitted 4 August, 2005; originally announced August 2005.

Comments: 23 pages, 7 figures, accepted for publication in IEEE Transactions on Information Theory
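
For readers unfamiliar with the quantity bounded in the entry above, the redundancy of a Huffman code is its average codeword length minus the source entropy. The following is a minimal sketch of that definition for a binary Huffman code; it does not reproduce any bound from the paper, and the function names `huffman_lengths` and `redundancy` are hypothetical.

```python
import heapq
import math

def huffman_lengths(probs):
    """Codeword lengths of a binary Huffman code for the given probabilities."""
    if len(probs) == 1:
        return [1]  # convention for a one-symbol source (assumption of this sketch)
    # Heap entries: (subtree probability, unique tie-breaker, symbols in subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every symbol beneath them.
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths

def redundancy(probs):
    """Average Huffman codeword length minus the entropy, in bits."""
    lengths = huffman_lengths(probs)
    avg_len = sum(p * l for p, l in zip(probs, lengths))
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return avg_len - entropy

dyadic = [0.5, 0.25, 0.125, 0.125]   # powers of 1/2: Huffman coding is exactly optimal
skewed = [0.9, 0.1]                  # one dominant symbol: redundancy is large
r_dyadic = redundancy(dyadic)
r_skewed = redundancy(skewed)
```

For the dyadic source the redundancy is zero, while for the skewed two-symbol source it is one bit minus the binary entropy of 0.9; the redundancy of a Huffman code always lies in the interval [0, 1).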

Showing 1–33 of 33 results for author: Mohajer, S