
Memory Complexity of Entropy Estimation

Authors: Tomer Berg, Or Ordentlich, Ofer Shayevitz

Abstract: We observe an infinite sequence of independent identically distributed random variables $X_1,X_2,\ldots$ drawn from an unknown distribution $p$ over $[n]$, and our goal is to estimate the entropy $H(p)=-\mathbb{E}[\log p(X)]$ within an $\varepsilon$-additive error. To that end, at each time point we are allowed to update a finite-state machine with $S$ states, using a possibly randomized but time-invariant rule, where each state of the machine is assigned an entropy estimate. Our goal is to characterize the minimax memory complexity $S^*$ of this problem, which is the minimal number of states for which the estimation task is feasible with probability at least $1-δ$ asymptotically, uniformly in $p$. Specifically, we show that there exist universal constants $C_1$ and $C_2$ such that $S^* \leq C_1\cdot\frac{n (\log n)^4}{\varepsilon^2δ}$ for $\varepsilon$ not too small, and $S^* \geq C_2 \cdot \max \{n, \frac{\log n}{\varepsilon}\}$ for $\varepsilon$ not too large. The upper bound is proved using approximate counting to estimate the logarithm of $p$, and a finite memory bias estimation machine to estimate the expectation operation. The lower bound is proved via a reduction of entropy estimation to uniformity testing. We also apply these results to derive bounds on the memory complexity of mutual information estimation.
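The abstract above mentions approximate counting without spelling out the construction. As a rough illustration only, the sketch below shows the classic Morris counter, which tracks roughly the logarithm of a count in a small state. It is not claimed to be the authors' machine; the function names `morris_count` and `morris_estimate` are hypothetical and chosen for this example.

```python
import random


def morris_count(num_events, rng):
    """Run a Morris approximate counter over num_events increments.

    Only the exponent c is stored, so the state grows like log2 of the count.
    """
    c = 0
    for _ in range(num_events):
        # Increment the stored exponent with probability 2**(-c).
        if rng.random() < 2.0 ** (-c):
            c += 1
    return c


def morris_estimate(c):
    """Unbiased estimate of the number of events from the stored exponent."""
    return 2 ** c - 1


if __name__ == "__main__":
    rng = random.Random(0)
    runs = [morris_estimate(morris_count(500, rng)) for _ in range(1000)]
    print("average estimate over 1000 runs:", sum(runs) / len(runs))
```

A single run is noisy; averaging many independent runs, as in the demo, is one simple way to see that the estimate is centered on the true count.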

Submitted 10 June, 2024; originally announced June 2024.

arXiv:2406.03427 [pdf, ps, other]

The strong data processing inequality under the heat flow

Authors: Bo'az Klartag, Or Ordentlich

Abstract: Let $ν$ and $μ$ be probability distributions on $\mathbb{R}^n$, and $ν_s,μ_s$ be their evolution under the heat flow, that is, the probability distributions resulting from convolving their density with the density of an isotropic Gaussian random vector with variance $s$ in each entry. This paper studies the rate of decay of $s\mapsto D(ν_s\|μ_s)$ for various divergences, including the $χ^2$ and Kullback-Leibler (KL) divergences. We prove upper and lower bounds on the strong data-processing inequality (SDPI) coefficients corresponding to the source $μ$ and the Gaussian channel. We also prove generalizations of de Bruijn's identity, and Costa's result on the concavity in $s$ of the differential entropy of $ν_s$. As a byproduct of our analysis, we obtain new lower bounds on the mutual information between $X$ and $Y=X+\sqrt{s} Z$, where $Z$ is a standard Gaussian vector in $\mathbb{R}^n$, independent of $X$, and on the minimum mean-square error (MMSE) in estimating $X$ from $Y$, in terms of the Poincaré constant of $X$.

Submitted 5 June, 2024; originally announced June 2024.

arXiv:2401.14710 [pdf, ps, other]

Lower Bounds on Mutual Information for Linear Codes Transmitted over Binary Input Channels, and for Information Combining

Authors: Uri Erez, Or Ordentlich, Shlomo Shamai

Abstract: It has been known for a long time that the mutual information between the input sequence and output of a binary symmetric channel (BSC) is upper bounded by the mutual information between the same input sequence and the output of a binary erasure channel (BEC) with the same capacity. Recently, Samorodnitsky discovered that one may also lower bound the BSC mutual information in terms of the mutual information between the same input sequence and a more capable BEC. In this paper, we strengthen Samorodnitsky's bound for the special case where the input to the channel is distributed uniformly over a linear code. Furthermore, for a general (not necessarily binary) input distribution $P_X$ and channel $W_{Y|X}$, we derive a new lower bound on the mutual information $I(X;Y^n)$ for $n$ transmissions of $X\sim P_X$ through the channel $W_{Y|X}$.

Submitted 26 January, 2024; originally announced January 2024.

arXiv:2312.15225 [pdf, ps, other]

Statistical Inference with Limited Memory: A Survey

Authors: Tomer Berg, Or Ordentlich, Ofer Shayevitz

Abstract: The problem of statistical inference in its various forms has been the subject of decades-long extensive research. Most of the effort has been focused on characterizing the behavior as a function of the number of available samples, with far less attention given to the effect of memory limitations on performance. Recently, this latter topic has drawn much interest in the engineering and computer science literature. In this survey paper, we attempt to review the state-of-the-art of statistical inference under memory constraints in several canonical problems, including hypothesis testing, parameter estimation, and distribution property testing/estimation. We discuss the main results in this developing field, and by identifying recurrent themes, we extract some fundamental building blocks for algorithmic construction, as well as useful techniques for lower bound derivations.

Submitted 23 December, 2023; originally announced December 2023.

Comments: Submitted to JSAIT Special Issue

arXiv:2311.04644 [pdf, ps, other]

Bounds on the density of smooth lattice coverings

Authors: Or Ordentlich, Oded Regev, Barak Weiss

Abstract: Let $K$ be a convex body in $\mathbb{R}^n$, let $L$ be a lattice with covolume one, and let $η>0$. We say that $K$ and $L$ form an $η$-smooth cover if each point $x \in \mathbb{R}^n$ is covered by $(1 \pm η) vol(K)$ translates of $K$ by $L$. We prove that for any positive $σ, η$, asymptotically as $n \to \infty$, for any $K$ of volume $n^{3+σ}$, one can find a lattice $L$ for which $L, K$ form an $η$-smooth cover. Moreover, this property is satisfied with high probability for a lattice chosen randomly, according to the Haar-Siegel measure on the space of lattices. Similar results hold for random construction A lattices, albeit with a worse power law, provided the ratio between the covering and packing radii of $\mathbb{Z}^n$ with respect to $K$ is at most polynomial in $n$. Our proofs rely on a recent breakthrough by Dhar and Dvir on the discrete Kakeya problem.

Submitted 8 November, 2023; originally announced November 2023.

arXiv:2206.09395 [pdf, ps, other]

On The Memory Complexity of Uniformity Testing

Authors: Tomer Berg, Or Ordentlich, Ofer Shayevitz

Abstract: In this paper we consider the problem of uniformity testing with limited memory. We observe a sequence of independent identically distributed random variables drawn from a distribution $p$ over $[n]$, which is either uniform or is $\varepsilon$-far from uniform under the total variation distance, and our goal is to determine the correct hypothesis. At each time point we are allowed to update the state of a finite-memory machine with $S$ states, where each state of the machine is assigned one of the hypotheses, and we are interested in obtaining an asymptotic probability of error at most $0<δ<1/2$ uniformly under both hypotheses. The main contribution of this paper is deriving upper and lower bounds on the number of states $S$ needed in order to achieve a constant error probability $δ$, as a function of $n$ and $\varepsilon$, where our upper bound is $O(\frac{n\log n}{\varepsilon})$ and our lower bound is $Ω(n+\frac{1}{\varepsilon})$. Prior works in the field have almost exclusively used collision counting for upper bounds, and the Paninski mixture for lower bounds. Somewhat surprisingly, in the limited memory with unlimited samples setup, the optimal solution does not involve counting collisions, and the Paninski prior is not hard. Thus, different proof techniques are needed in order to attain our bounds.

Submitted 19 June, 2022; originally announced June 2022.

Comments: To be presented in COLT 2022

arXiv:2206.09390 [pdf, ps, other]

Deterministic Finite-Memory Bias Estimation

Authors: Tomer Berg, Or Ordentlich, Ofer Shayevitz

Abstract: In this paper we consider the problem of estimating a Bernoulli parameter using finite memory. Let $X_1,X_2,\ldots$ be a sequence of independent identically distributed Bernoulli random variables with expectation $θ$, where $θ\in [0,1]$. Consider a finite-memory deterministic machine with $S$ states, that updates its state $M_n \in \{1,2,\ldots,S\}$ at each time according to the rule $M_n = f(M_{n-1},X_n)$, where $f$ is a deterministic time-invariant function. Assume that the machine outputs an estimate at each time point according to some fixed mapping from the state space to the unit interval. The quality of the estimation procedure is measured by the asymptotic risk, which is the long-term average of the instantaneous quadratic risk. The main contribution of this paper is an upper bound on the smallest worst-case asymptotic risk any such machine can attain. This bound coincides with a lower bound derived by Leighton and Rivest, to imply that $Θ(1/S)$ is the minimax asymptotic risk for deterministic $S$-state machines. In particular, our result disproves a longstanding $Θ(\log S/S)$ conjecture for this quantity, also posed by Leighton and Rivest.

Submitted 19 June, 2022; originally announced June 2022.

Comments: Presented in COLT 2021

arXiv:2202.07707 [pdf, ps, other]

On the Role of Channel Capacity in Learning Gaussian Mixture Models

Authors: Elad Romanov, Tamir Bendory, Or Ordentlich

Abstract: This paper studies the sample complexity of learning the $k$ unknown centers of a balanced Gaussian mixture model (GMM) in $\mathbb{R}^d$ with spherical covariance matrix $σ^2\mathbf{I}$. In particular, we are interested in the following question: what is the maximal noise level $σ^2$, for which the sample complexity is essentially the same as when estimating the centers from labeled measurements? To that end, we restrict attention to a Bayesian formulation of the problem, where the centers are uniformly distributed on the sphere $\sqrt{d}\mathcal{S}^{d-1}$. Our main results characterize the exact noise threshold $σ^2$ below which the GMM learning problem, in the large system limit $d,k\to\infty$, is as easy as learning from labeled observations, and above which it is substantially harder. The threshold occurs at $\frac{\log k}{d} = \frac12\log\left( 1+\frac{1}{σ^2} \right)$, which is the capacity of the additive white Gaussian noise (AWGN) channel. Thinking of the set of $k$ centers as a code, this noise threshold can be interpreted as the largest noise level for which the error probability of the code over the AWGN channel is small. Previous works on the GMM learning problem have identified the minimum distance between the centers as a key parameter in determining the statistical difficulty of learning the corresponding GMM.
While our results are only proved for GMMs whose centers are uniformly distributed over the sphere, they hint that perhaps it is the decoding error probability associated with the center constellation as a channel code that determines the statistical difficulty of learning the corresponding GMM, rather than just the minimum distance.
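The threshold condition $\frac{\log k}{d} = \frac12\log(1+\frac{1}{σ^2})$ stated in the abstract above can be solved for the noise level, giving $σ^2 = 1/(k^{2/d}-1)$ whenever the same logarithm base is used on both sides. The sketch below only evaluates that formula for finite $k$ and $d$; the abstract's result is asymptotic, so these numbers are illustrative, and the function names are hypothetical.

```python
import math


def awgn_capacity_nats(sigma2):
    """Capacity 0.5*log(1 + 1/sigma2) of the AWGN channel, in nats per dimension."""
    return 0.5 * math.log(1.0 + 1.0 / sigma2)


def threshold_noise_level(k, d):
    """Solve log(k)/d = 0.5*log(1 + 1/sigma2) for sigma2 (requires k > 1)."""
    rate = math.log(k) / d
    # expm1(2*rate) equals k**(2/d) - 1, computed stably for small rates.
    return 1.0 / math.expm1(2.0 * rate)


if __name__ == "__main__":
    k, d = 1000, 50
    sigma2 = threshold_noise_level(k, d)
    print("threshold sigma^2:", sigma2)
    print("rate log(k)/d    :", math.log(k) / d)
    print("capacity at sigma^2:", awgn_capacity_nats(sigma2))
```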

Submitted 14 June, 2022; v1 submitted 15 February, 2022; originally announced February 2022.

Comments: COLT 2022

arXiv:2110.06183 [pdf, other]

Blind Modulo Analog-to-Digital Conversion of Vector Processes

Authors: Amir Weiss, Everest Huang, Or Ordentlich, Gregory W. Wornell

Abstract: In a growing number of applications, there is a need to digitize a (possibly high) number of correlated signals whose spectral characteristics are challenging for traditional analog-to-digital converters (ADCs). Examples, among others, include multiple-input multiple-output systems where the ADCs must acquire at once several signals at a very wide but sparsely and dynamically occupied bandwidth supporting diverse services. In such scenarios, the resolution requirements can be prohibitively high. As an alternative, the recently proposed modulo-ADC architecture can in principle require dramatically fewer bits in the conversion to obtain the target fidelity, but requires that spatiotemporal information be known and explicitly taken into account by the analog and digital processing in the converter, which is frequently impractical. Building on our recent work, we address this limitation and develop a blind version of the architecture that requires no such knowledge in the converter. In particular, it features an automatic modulo-level adjustment and a fully adaptive modulo-decoding mechanism, allowing it to asymptotically match the characteristics of the unknown input signal. Simulation results demonstrate the successful operation of the proposed algorithm.

Submitted 12 October, 2021; originally announced October 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2108.08937

arXiv:2110.01150 [pdf, other]

Spiked Covariance Estimation from Modulo-Reduced Measurements

Authors: Elad Romanov, Or Ordentlich

Abstract: Consider the rank-1 spiked model: $\bf{X}=\sqrt{ν}ξ\bf{u}+ \bf{Z}$, where $ν$ is the spike intensity, $\bf{u}\in\mathbb{S}^{k-1}$ is an unknown direction and $ξ\sim \mathcal{N}(0,1),\bf{Z}\sim \mathcal{N}(\bf{0},\bf{I})$. Motivated by recent advances in analog-to-digital conversion, we study the problem of recovering $\bf{u}\in \mathbb{S}^{k-1}$ from $n$ i.i.d. modulo-reduced measurements $\bf{Y}=[\bf{X}]\mod Δ$, focusing on the high-dimensional regime ($k\gg 1$). We develop and analyze an algorithm that, for most directions $\bf{u}$ and $ν=\mathrm{poly}(k)$, estimates $\bf{u}$ to high accuracy using $n=\mathrm{poly}(k)$ measurements, provided that $Δ\gtrsim \sqrt{\log k}$. Up to constants, our algorithm accurately estimates $\bf{u}$ at the smallest possible $Δ$ that allows (in an information-theoretic sense) to recover $\bf{X}$ from $\bf{Y}$. A key step in our analysis involves estimating the probability that a line segment of length $\approx\sqrt{ν}$ in a random direction $\bf{u}$ passes near a point in the lattice $Δ\mathbb{Z}^k$. Numerical experiments show that the developed algorithm performs well even in a non-asymptotic setting.

Submitted 19 May, 2022; v1 submitted 3 October, 2021; originally announced October 2021.

Comments: AISTATS, 2022

Journal ref: Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:1298-1320, 2022

arXiv:2105.05350 [pdf, other]

doi 10.3390/e23050605

On Compressed Sensing of Binary Signals for the Unsourced Random Access Channel

Authors: Elad Romanov, Or Ordentlich

Abstract: Motivated by applications in unsourced random access, this paper develops a novel scheme for the problem of compressed sensing of binary signals. In this problem, the goal is to design a sensing matrix $A$ and a recovery algorithm, such that the sparse binary vector $\mathbf{x}$ can be recovered reliably from the measurements $\mathbf{y}=A\mathbf{x}+σ\mathbf{z}$, where $\mathbf{z}$ is additive white Gaussian noise. We propose to design $A$ as a parity check matrix of a low-density parity-check code (LDPC), and to recover $\mathbf{x}$ from the measurements $\mathbf{y}$ using a Markov chain Monte Carlo algorithm, which runs relatively fast due to the sparse structure of $A$. The performance of our scheme is comparable to state-of-the-art schemes, which use dense sensing matrices, while enjoying the advantages of using a sparse sensing matrix.

Submitted 11 May, 2021; originally announced May 2021.

Comments: Accepted to Entropy Special Issue on "Information-Theoretic Aspects of Non-Orthogonal and Massive Access for Future Wireless Networks"

Journal ref: Entropy 23.5 (2021): 605

arXiv:2103.02646 [pdf, other]

doi 10.1109/ISIT45174.2021.9517956

Critical Slowing Down Near Topological Transitions in Rate-Distortion Problems

Authors: Shlomi Agmon, Etam Benger, Or Ordentlich, Naftali Tishby

Abstract: In rate-distortion (RD) problems one seeks reduced representations of a source that meet a target distortion constraint. Such optimal representations undergo topological transitions at some critical rate values, when their cardinality or dimensionality change. We study the convergence time of the Arimoto-Blahut alternating projection algorithms, used to solve such problems, near those critical points, both for the rate-distortion and information bottleneck settings. We argue that they suffer from critical slowing down -- a diverging number of iterations for convergence -- near the critical points. This phenomenon can have theoretical and practical implications for both machine learning and data compression problems.
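To make the alternating iterations mentioned in the abstract above concrete, here is a minimal sketch of the textbook Blahut-Arimoto iteration for rate-distortion at a fixed slope parameter `beta`. It is a toy under stated assumptions, not the authors' experimental setup: the source is a uniform binary source with Hamming distortion, the reproduction marginal starts from a deliberately asymmetric guess, and the function name, stopping rule and tolerance are all hypothetical choices for this example.

```python
import math


def blahut_arimoto_rd(p, dist, beta, tol=1e-10, max_iter=100000):
    """Textbook Blahut-Arimoto for rate-distortion at slope parameter beta.

    p: source probabilities; dist[x][y]: distortion between x and reproduction y.
    Returns (rate in nats, distortion, number of iterations used).
    """
    n_x, n_y = len(p), len(dist[0])

    def conditional(q):
        # Q[x][y] is proportional to q[y] * exp(-beta * dist[x][y]).
        rows = []
        for x in range(n_x):
            w = [q[y] * math.exp(-beta * dist[x][y]) for y in range(n_y)]
            z = sum(w)
            rows.append([v / z for v in w])
        return rows

    # Asymmetric starting marginal so that the iteration has work to do.
    total = n_y * (n_y + 1) / 2
    q = [(y + 1) / total for y in range(n_y)]
    iterations = 0
    for iterations in range(1, max_iter + 1):
        Q = conditional(q)
        q_new = [sum(p[x] * Q[x][y] for x in range(n_x)) for y in range(n_y)]
        change = max(abs(a - b) for a, b in zip(q, q_new))
        q = q_new
        if change < tol:
            break

    Q = conditional(q)
    distortion = sum(p[x] * Q[x][y] * dist[x][y]
                     for x in range(n_x) for y in range(n_y))
    rate = sum(p[x] * Q[x][y] * math.log(Q[x][y] / q[y])
               for x in range(n_x) for y in range(n_y) if Q[x][y] > 0)
    return rate, distortion, iterations


if __name__ == "__main__":
    p = [0.5, 0.5]
    hamming = [[0, 1], [1, 0]]
    for beta in (2.0, 0.5, 0.1):
        r, d, it = blahut_arimoto_rd(p, hamming, beta)
        print(f"beta={beta}: rate={r:.6f} nats, distortion={d:.6f}, iterations={it}")
```

In this toy, the two-symbol reproduction collapses to a single symbol as `beta` approaches 0 (distortion 1/2), and the iteration count reported by the demo grows as `beta` shrinks toward that point, which is in the spirit of the slowing down the abstract describes.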

Submitted 9 May, 2021; v1 submitted 3 March, 2021; originally announced March 2021.

Comments: 10 pages, 2 figures, ISIT 2021 submission

Journal ref: 2021 IEEE International Symposium on Information Theory (ISIT), Melbourne, Australia, 2021, pp. 2625-2630

arXiv:2102.08184 [pdf, other]

Constructing Multiclass Classifiers using Binary Classifiers Under Log-Loss

Authors: Assaf Ben-Yishai, Or Ordentlich

Abstract: The construction of multiclass classifiers from binary elements is studied in this paper, and performance is quantified by the regret, defined with respect to the Bayes optimal log-loss. We discuss two known methods. The first is one vs. all (OVA), for which we prove that the multiclass regret is upper bounded by the sum of binary regrets of the constituent classifiers. The second is hierarchical classification, based on a binary tree. For this method we prove that the multiclass regret is exactly a weighted sum of constituent binary regrets where the weighting is determined by the tree structure. We also introduce a leverage-hierarchical classification method, which potentially yields smaller log-loss and regret. The advantages of these classification methods are demonstrated by simulation on both synthetic and real-life datasets.

Submitted 12 August, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

Comments: A shorter version of this contribution was presented in ISIT 2021

arXiv:2010.01987 [pdf, ps, other]

Strong data processing constant is achieved by binary inputs

Authors: Or Ordentlich, Yury Polyanskiy

Abstract: For any channel $P_{Y|X}$ the strong data processing constant is defined as the smallest number $η_{KL}\in[0,1]$ such that $I(U;Y)\le η_{KL} I(U;X)$ holds for any Markov chain $U-X-Y$. It is shown that the value of $η_{KL}$ is given by that of the best binary-input subchannel of $P_{Y|X}$. The same result holds for any $f$-divergence, verifying a conjecture of Cohen, Kemperman and Zbaganu (1998).

Submitted 3 June, 2021; v1 submitted 15 September, 2020; originally announced October 2020.

Comments: 1 page

arXiv:2007.11482 [pdf, other]

doi 10.1137/20M1354994

Multi-reference alignment in high dimensions: sample complexity and phase transition

Authors: Elad Romanov, Tamir Bendory, Or Ordentlich

Abstract: Multi-reference alignment entails estimating a signal in $\mathbb{R}^L$ from its circularly-shifted and noisy copies. This problem has been studied thoroughly in recent years, focusing on the finite-dimensional setting (fixed $L$). Motivated by single-particle cryo-electron microscopy, we analyze the sample complexity of the problem in the high-dimensional regime $L\to\infty$. Our analysis uncovers a phase transition phenomenon governed by the parameter $α= L/(σ^2\log L)$, where $σ^2$ is the variance of the noise. When $α>2$, the impact of the unknown circular shifts on the sample complexity is minor. Namely, the number of measurements required to achieve a desired accuracy $\varepsilon$ approaches $σ^2/\varepsilon$ for small $\varepsilon$; this is the sample complexity of estimating a signal in additive white Gaussian noise, which does not involve shifts. In sharp contrast, when $α\leq 2$, the problem is significantly harder and the sample complexity grows substantially quicker with $σ^2$.

Submitted 30 September, 2021; v1 submitted 22 July, 2020; originally announced July 2020.

Journal ref: SIAM Journal on Mathematics of Data Science 3.2 (2021): 494-523

arXiv:2006.00340 [pdf, ps, other]

New bounds on the density of lattice coverings

Authors: Or Ordentlich, Oded Regev, Barak Weiss

Abstract: We obtain new upper bounds on the minimal density of lattice coverings of Euclidean space by dilates of a convex body K. We also obtain bounds on the probability (with respect to the natural Haar-Siegel measure on the space of lattices) that a randomly chosen lattice L satisfies that L+K is all of space. As a step in the proof, we utilize and strengthen results on the discrete Kakeya problem.

Submitted 30 May, 2020; originally announced June 2020.

MSC Class: 11H31; 94B75; 11T30

arXiv:2005.07445 [pdf, ps, other]

Binary Hypothesis Testing with Deterministic Finite-Memory Decision Rules

Authors: Tomer Berg, Ofer Shayevitz, Or Ordentlich

Abstract: In this paper we consider the problem of binary hypothesis testing with finite memory systems. Let $X_1,X_2,\ldots$ be a sequence of independent identically distributed Bernoulli random variables, with expectation $p$ under $\mathcal{H}_0$ and $q$ under $\mathcal{H}_1$. Consider a finite-memory deterministic machine with $S$ states that updates its state $M_n \in \{1,2,\ldots,S\}$ at each time according to the rule $M_n = f(M_{n-1},X_n)$, where $f$ is a deterministic time-invariant function. Assume that we let the process run for a very long time ($n\rightarrow \infty$), and then make our decision according to some mapping from the state space to the hypothesis space. The main contribution of this paper is a lower bound on the Bayes error probability $P_e$ of any such machine. In particular, our findings show that the ratio between the maximal exponential decay rate of $P_e$ with $S$ for a deterministic machine and for a randomized one, can become unbounded, complementing a result by Hellman.

Submitted 15 May, 2020; originally announced May 2020.

Comments: To be presented at ISIT 2020

arXiv:2004.09935 [pdf, ps, other]

An Information-Theoretic Proof of the Streaming Switching Lemma for Symmetric Encryption

Authors: Ido Shahaf, Or Ordentlich, Gil Segev

Abstract: Motivated by a fundamental paradigm in cryptography, we consider a recent variant of the classic problem of bounding the distinguishing advantage between a random function and a random permutation. Specifically, we consider the problem of deciding whether a sequence of $q$ values was sampled uniformly with or without replacement from $[N]$, where the decision is made by a streaming algorithm restricted to using at most $s$ bits of internal memory. In this work, the distinguishing advantage of such an algorithm is measured by the KL divergence between the distributions of its output as induced under the two cases. We show that for any $s=Ω(\log N)$ the distinguishing advantage is upper bounded by $O(q \cdot s / N)$, and even by $O(q \cdot s / N \log N)$ when $q \leq N^{1 - ε}$ for any constant $ε> 0$ where it is nearly tight with respect to the KL divergence.

Submitted 21 April, 2020; originally announced April 2020.

arXiv:2004.00869 [pdf, other]

An Upgrading Algorithm with Optimal Power Law

Authors: Or Ordentlich, Ido Tal

Abstract: Consider a channel $W$ along with a given input distribution $P_X$. In certain settings, such as in the construction of polar codes, the output alphabet of $W$ is `too large', and hence we replace $W$ by a channel $Q$ having a smaller output alphabet. We say that $Q$ is upgraded with respect to $W$ if $W$ is obtained from $Q$ by processing its output. In this case, the mutual information $I(P_X,W)$ between the input and output of $W$ is upper-bounded by the mutual information $I(P_X,Q)$ between the input and output of $Q$. In this paper, we present an algorithm that produces an upgraded channel $Q$ from $W$, as a function of $P_X$ and the required output alphabet size of $Q$, denoted $L$. We show that the difference in mutual informations is `small'. Namely, it is $O(L^{-2/(|\mathcal{X}|-1)})$, where $|\mathcal{X}|$ is the size of the input alphabet. This power law of $L$ is optimal. We complement our analysis with numerical experiments which show that the developed algorithm improves upon the existing state-of-the-art algorithms also in non-asymptotic setups.

Submitted 28 May, 2021; v1 submitted 2 April, 2020; originally announced April 2020.

arXiv:1909.01221 [pdf, ps, other]

A Note on the Probability of Rectangles for Correlated Binary Strings

Authors: Or Ordentlich, Yury Polyanskiy, Ofer Shayevitz

Abstract: Consider two sequences of $n$ independent and identically distributed fair coin tosses, $X=(X_1,\ldots,X_n)$ and $Y=(Y_1,\ldots,Y_n)$, which are $ρ$-correlated for each $j$, i.e. $\mathbb{P}[X_j=Y_j] = {1+ρ\over 2}$. We study the question of how large (small) the probability $\mathbb{P}[X \in A, Y\in B]$ can be among all sets $A,B\subset\{0,1\}^n$ of a given cardinality. For sets $|A|,|B| = Θ(2^n)$ it is well known that the largest (smallest) probability is approximately attained by concentric (anti-concentric) Hamming balls, and this can be proved via the hypercontractive inequality (reverse hypercontractivity). Here we consider the case of $|A|,|B| = 2^{Θ(n)}$. By applying a recent extension of the hypercontractive inequality of Polyanskiy-Samorodnitsky (J. Functional Analysis, 2019), we show that Hamming balls of the same size approximately maximize $\mathbb{P}[X \in A, Y\in B]$ in the regime of $ρ\to 1$. We also prove a similar tight lower bound, i.e. show that for $ρ\to 0$ the pair of opposite Hamming balls approximately minimizes the probability $\mathbb{P}[X \in A, Y\in B]$.

Submitted 18 August, 2020; v1 submitted 3 September, 2019; originally announced September 2019.

arXiv:1908.07367 [pdf, ps, other]

A Lower Bound on the Essential Interactive Capacity of Binary Memoryless Symmetric Channels

Authors: Assaf Ben-Yishai, Young-Han Kim, Or Ordentlich, Ofer Shayevitz

Abstract: The essential interactive capacity of a discrete memoryless channel is defined in this paper as the maximal rate at which the transcript of any interactive protocol can be reliably simulated over the channel, using a deterministic coding scheme. In contrast to other interactive capacity definitions in the literature, this definition makes no assumptions on the order of speakers (which can be adaptive) and does not allow any use of private / public randomness; hence, the essential interactive capacity is a function of the channel model only. It is shown that the essential interactive capacity of any binary memoryless symmetric (BMS) channel is at least $0.0302$ its Shannon capacity. To that end, we present a simple coding scheme, based on extended-Hamming codes combined with error detection, that achieves the lower bound in the special case of the binary symmetric channel (BSC). We then adapt the scheme to the entire family of BMS channels, and show that it achieves the same lower bound using extremes of the Bhattacharyya parameter.

Submitted 12 August, 2021; v1 submitted 20 August, 2019; originally announced August 2019.

arXiv:1903.12289 [pdf, ps, other]

doi 10.1109/LSP.2019.2923835

Above the Nyquist Rate, Modulo Folding Does Not Hurt

Authors: Elad Romanov, Or Ordentlich

Abstract: We consider the problem of recovering a continuous-time bandlimited signal from the discrete-time signal obtained from sampling it every $T_s$ seconds and reducing the result modulo $Δ$, for some $Δ>0$. For $Δ=\infty$ the celebrated Shannon-Nyquist sampling theorem guarantees that perfect recovery is possible provided that the sampling rate $1/T_s$ exceeds the so-called Nyquist rate. Recent work by Bhandari et al. has shown that for any $Δ>0$ perfect reconstruction is still possible if the sampling rate exceeds the Nyquist rate by a factor of $πe$. In this letter we improve upon this result and show that for finite energy signals, perfect recovery is possible for any $Δ>0$ and any sampling rate above the Nyquist rate. Thus, modulo folding does not degrade the signal, provided that the sampling rate exceeds the Nyquist rate. This claim is proved by establishing a connection between the recovery problem of a discrete-time signal from its modulo reduced version and the problem of predicting the next sample of a discrete-time signal from its past, and leveraging the fact that for a bandlimited signal the prediction error can be made arbitrarily small.

Submitted 28 March, 2019; originally announced March 2019.

Journal ref: IEEE Signal Processing Letters 26.8 (2019): 1167-1171
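
The link between modulo recovery and prediction described in the abstract above can be illustrated with a toy sketch. This is not the construction from the letter: it uses the crudest possible predictor (the previous sample) and therefore assumes that consecutive samples differ by less than $Δ/2$, which requires sampling far above the Nyquist rate. The function names `fold` and `unwrap`, the test signal, and the assumption that the first sample is known are all illustrative choices.

```python
import math

def fold(x, delta):
    """Reduce x modulo delta into the interval [-delta/2, delta/2)."""
    return x - delta * math.floor(x / delta + 0.5)

def unwrap(folded, delta, first_sample):
    """Recover samples from folded values.

    Assumes |x[k] - x[k-1]| < delta/2 for all k (hypothetical condition for
    this toy predictor) and that the first unfolded sample is known.
    """
    recovered = [first_sample]
    for k in range(1, len(folded)):
        # The difference of folded samples equals the true difference modulo
        # delta, so folding it again yields the true difference.
        step = fold(folded[k] - folded[k - 1], delta)
        recovered.append(recovered[-1] + step)
    return recovered

# Heavily oversampled sinusoid whose amplitude exceeds delta several times.
delta = 1.0
signal = [3.0 * math.sin(2 * math.pi * 0.01 * k) for k in range(200)]
folded = [fold(x, delta) for x in signal]
recovered = unwrap(folded, delta, signal[0])
max_error = max(abs(r - x) for r, x in zip(recovered, signal))
print(max_error)
```

A better predictor tolerates larger sample-to-sample variation; the abstract's claim is that for bandlimited signals the prediction error can be made arbitrarily small at any rate above Nyquist.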

arXiv:1902.07979 [pdf, ps, other]

A Lower Bound on the Expected Distortion of Joint Source-Channel Coding

Authors: Yuval Kochman, Or Ordentlich, Yury Polyanskiy

Abstract: We consider the classic joint source-channel coding problem of transmitting a memoryless source over a memoryless channel. The focus of this work is on the long-standing open problem of finding the rate of convergence of the smallest attainable expected distortion to its asymptotic value, as a function of blocklength $n$. Our main result is that in general the convergence rate is not faster than $n^{-1/2}$. In particular, we show that for the problem of transmitting i.i.d. uniform bits over a binary symmetric channel with Hamming distortion, the smallest attainable distortion (bit error rate) is at least $Ω(n^{-1/2})$ above the asymptotic value, if the ``bandwidth expansion ratio'' is above $1$.

Submitted 26 August, 2019; v1 submitted 21 February, 2019; originally announced February 2019.

arXiv:1901.10396 [pdf, other]

doi 10.1109/TIT.2021.3053426

Blind Unwrapping of Modulo Reduced Gaussian Vectors: Recovering MSBs from LSBs

Authors: Elad Romanov, Or Ordentlich

Abstract: We consider the problem of recovering $n$ i.i.d. samples from a zero mean multivariate Gaussian distribution with an unknown covariance matrix, from their modulo wrapped measurements, i.e., measurements where each coordinate is reduced modulo $Δ$, for some $Δ>0$. For this setup, which is motivated by quantization and analog-to-digital conversion, we develop a low-complexity iterative decoding algorithm. We show that if a benchmark informed decoder that knows the covariance matrix can recover each sample with small error probability, and $n$ is large enough, the performance of the proposed blind recovery algorithm closely follows that of the informed one. We complement the analysis with numeric results that show that the algorithm performs well even in non-asymptotic conditions.

Submitted 18 September, 2021; v1 submitted 29 January, 2019; originally announced January 2019.

Journal ref: IEEE Transactions on Information Theory 67.3 (2021): 1897-1919

arXiv:1812.03031 [pdf, ps, other]

Information-Distilling Quantizers

Authors: Alankrita Bhatt, Bobak Nazer, Or Ordentlich, Yury Polyanskiy

Abstract: Let $X$ and $Y$ be dependent random variables. This paper considers the problem of designing a scalar quantizer for $Y$ to maximize the mutual information between the quantizer's output and $X$, and develops fundamental properties and bounds for this form of quantization, which is connected to the log-loss distortion criterion. The main focus is the regime of low $I(X;Y)$, where it is shown that, if $X$ is binary, a constant fraction of the mutual information can always be preserved using $\mathcal{O}(\log(1/I(X;Y)))$ quantization levels, and there exist distributions for which this many quantization levels are necessary. Furthermore, for larger finite alphabets $2 < |\mathcal{X}| < \infty$, it is established that an $η$-fraction of the mutual information can be preserved using roughly $(\log(| \mathcal{X} | /I(X;Y)))^{η\cdot(|\mathcal{X}| - 1)}$ quantization levels.

Submitted 29 October, 2019; v1 submitted 7 December, 2018; originally announced December 2018.

arXiv:1806.08968 [pdf, ps, other]

doi 10.1109/JSTSP.2018.2863189

A Modulo-Based Architecture for Analog-to-Digital Conversion

Authors: Or Ordentlich, Gizem Tabak, Pavan Kumar Hanumolu, Andrew C. Singer, Gregory W. Wornell

Abstract: Systems that capture and process analog signals must first acquire them through an analog-to-digital converter. While subsequent digital processing can remove statistical correlations present in the acquired data, the dynamic range of the converter is typically scaled to match that of the input analog signal. The present paper develops an approach for analog-to-digital conversion that aims at minimizing the number of bits per sample at the output of the converter. This is attained by reducing the dynamic range of the analog signal by performing a modulo operation on its amplitude, and then quantizing the result. While the converter itself is universal and agnostic of the statistics of the signal, the decoder operation on the output of the quantizer can exploit the statistical structure in order to unwrap the modulo folding. The performance of this method is shown to approach information theoretical limits, as captured by the rate-distortion function, in various settings. An architecture for modulo analog-to-digital conversion via ring oscillators is suggested, and its merits are numerically demonstrated.

Submitted 23 June, 2018; originally announced June 2018.

arXiv:1801.09481 [pdf, ps, other]

Almost Optimal Scaling of Reed-Muller Codes on BEC and BSC Channels

Authors: Hamed Hassani, Shrinivas Kudekar, Or Ordentlich, Yury Polyanskiy, Rüdiger Urbanke

Abstract: Consider a binary linear code of length $N$, minimum distance $d_{\text{min}}$, transmission over the binary erasure channel with parameter $0 < ε< 1$ or the binary symmetric channel with parameter $0 < ε< \frac12$, and block-MAP decoding. It was shown by Tillich and Zemor that in this case the error probability of the block-MAP decoder transitions "quickly" from $δ$ to $1-δ$ for any $δ>0$ if the minimum distance is large. In particular the width of the transition is of order $O(1/\sqrt{d_{\text{min}}})$. We strengthen this result by showing that under suitable conditions on the weight distribution of the code, the transition width can be as small as $Θ(1/N^{\frac12-κ})$, for any $κ>0$, even if the minimum distance of the code is not linear. This condition applies, e.g., to Reed-Muller codes. Since $Θ(1/N^{\frac12})$ is the smallest transition possible for any code, we speak of "almost" optimal scaling. We emphasize that the width of the transition says nothing about the location of the transition. Therefore this result has no bearing on whether a code is capacity-achieving or not. As a second contribution, we present a new estimate on the derivative of the EXIT function, the proof of which is based on the Blowing-Up Lemma.

Submitted 29 January, 2018; originally announced January 2018.

Comments: Submitted to ISIT 2018

arXiv:1701.03119 [pdf, ps, other]

How to Quantize $n$ Outputs of a Binary Symmetric Channel to $n-1$ Bits?

Authors: Wasim Huleihel, Or Ordentlich

Abstract: Suppose that $Y^n$ is obtained by observing a uniform Bernoulli random vector $X^n$ through a binary symmetric channel with crossover probability $α$. The "most informative Boolean function" conjecture postulates that the maximal mutual information between $Y^n$ and any Boolean function $\mathrm{b}(X^n)$ is attained by a dictator function. In this paper, we consider the "complementary" case in which the Boolean function is replaced by $f:\left\{0,1\right\}^n\to\left\{0,1\right\}^{n-1}$, namely, an $n-1$ bit quantizer, and show that $I(f(X^n);Y^n)\leq (n-1)\cdot\left(1-h(α)\right)$ for any such $f$. Thus, in this case, the optimal function is of the form $f(x^n)=(x_1,\ldots,x_{n-1})$.

Submitted 2 May, 2017; v1 submitted 11 January, 2017; originally announced January 2017.

Comments: 5 pages, accepted ISIT 2017
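
As a small numerical sanity check of the statement in the abstract above, the following sketch computes $I(f(X^n);Y^n)$ by brute-force enumeration for the prefix quantizer $f(x^n)=(x_1,\ldots,x_{n-1})$ and compares it with $(n-1)\cdot(1-h(α))$. The function names and the choice $n=3$, $α=0.1$ are illustrative assumptions, not taken from the paper, and the check covers only this one $f$, not the upper bound over all quantizers.

```python
import itertools
import math

def binary_entropy(a):
    """Binary entropy h(a) in bits."""
    if a in (0.0, 1.0):
        return 0.0
    return -a * math.log2(a) - (1 - a) * math.log2(1 - a)

def mutual_information_prefix_quantizer(n, alpha):
    """I(f(X^n); Y^n) in bits for f(x^n) = (x_1, ..., x_{n-1}), by enumeration."""
    points = list(itertools.product((0, 1), repeat=n))
    joint = {}  # joint pmf of (f(X^n), Y^n)
    for x in points:
        for y in points:
            flips = sum(xi != yi for xi, yi in zip(x, y))
            # X^n is uniform; the BSC flips each bit independently w.p. alpha.
            p = (0.5 ** n) * (alpha ** flips) * ((1 - alpha) ** (n - flips))
            key = (x[:-1], y)
            joint[key] = joint.get(key, 0.0) + p
    p_u, p_y = {}, {}  # marginals of f(X^n) and Y^n
    for (u, y), p in joint.items():
        p_u[u] = p_u.get(u, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return sum(
        p * math.log2(p / (p_u[u] * p_y[y]))
        for (u, y), p in joint.items()
        if p > 0
    )

n, alpha = 3, 0.1
computed = mutual_information_prefix_quantizer(n, alpha)
bound = (n - 1) * (1 - binary_entropy(alpha))
print(computed, bound)
```

The two values agree because $Y_n$ is independent of $(X_1,\ldots,X_{n-1})$ and of $Y^{n-1}$, so only the first $n-1$ channel uses contribute, each with $1-h(α)$ bits.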

arXiv:1601.06453 [pdf, ps, other]

Novel Lower Bounds on the Entropy Rate of Binary Hidden Markov Processes

Authors: Or Ordentlich

Abstract: Recently, Samorodnitsky proved a strengthened version of Mrs. Gerber's Lemma, where the output entropy of a binary symmetric channel is bounded in terms of the average entropy of the input projected on a random subset of coordinates. Here, this result is applied for deriving novel lower bounds on the entropy rate of binary hidden Markov processes. For symmetric underlying Markov processes, our bound improves upon the best known bound in the very noisy regime. The nonsymmetric case is also considered, and explicit bounds are derived for Markov processes that satisfy the $(1,\infty)$-RLL constraint.

Submitted 9 May, 2016; v1 submitted 24 January, 2016; originally announced January 2016.

arXiv:1507.06296 [pdf, ps, other]

doi 10.1109/TIT.2016.2609390

Mutual Information Bounds via Adjacency Events

Authors: Yanjun Han, Or Ordentlich, Ofer Shayevitz

Abstract: The mutual information between two jointly distributed random variables $X$ and $Y$ is a functional of the joint distribution $P_{XY},$ which is sometimes difficult to handle or estimate. A coarser description of the statistical behavior of $(X,Y)$ is given by the marginal distributions $P_X, P_Y$ and the adjacency relation induced by the joint distribution, where $x$ and $y$ are adjacent if $P(x,y)>0$. We derive a lower bound on the mutual information in terms of these entities. The bound is obtained by viewing the channel from $X$ to $Y$ as a probability distribution on a set of possible actions, where an action determines the output for any possible input, and is independently drawn. We also provide an alternative proof based on convex optimization, that yields a generally tighter bound. Finally, we derive an upper bound on the mutual information in terms of adjacency events between the action and the pair $(X,Y)$, where in this case an action $a$ and a pair $(x,y)$ are adjacent if $y=a(x)$. As an example, we apply our bounds to the binary deletion channel and show that for the special case of an i.i.d. input distribution and a range of deletion probabilities, our lower and upper bounds both outperform the best known bounds for the mutual information.

Submitted 8 September, 2016; v1 submitted 22 July, 2015; originally announced July 2015.

Comments: Accepted for publication in the IEEE Transactions on Information Theory

arXiv:1506.00253 [pdf, ps, other]

Minimum MS. E. Gerber's Lemma

Authors: Or Ordentlich, Ofer Shayevitz

Abstract: Mrs. Gerber's Lemma lower bounds the entropy at the output of a binary symmetric channel in terms of the entropy of the input process. In this paper, we lower bound the output entropy via a different measure of input uncertainty, pertaining to the minimum mean squared error (MMSE) prediction cost of the input process. We show that in many cases our bound is tighter than the one obtained from Mrs. Gerber's Lemma. As an application, we evaluate the bound for binary hidden Markov processes, and obtain new estimates for the entropy rate.

Submitted 31 May, 2015; originally announced June 2015.

arXiv:1505.05794 [pdf, ps, other]

An Improved Upper Bound for the Most Informative Boolean Function Conjecture

Authors: Or Ordentlich, Ofer Shayevitz, Omri Weinstein

Abstract: Suppose $X$ is a uniformly distributed $n$-dimensional binary vector and $Y$ is obtained by passing $X$ through a binary symmetric channel with crossover probability $α$. A recent conjecture by Courtade and Kumar postulates that $I(f(X);Y)\leq 1-h(α)$ for any Boolean function $f$. So far, the best known upper bound was $I(f(X);Y)\leq (1-2α)^2$. In this paper, we derive a new upper bound that holds for all balanced functions, and improves upon the best known bound for all $\tfrac{1}{3}<α<\tfrac{1}{2}$.

Submitted 31 May, 2015; v1 submitted 21 May, 2015; originally announced May 2015.

arXiv:1501.01829 [pdf, ps, other]

Performance Analysis and Optimal Filter Design for Sigma-Delta Modulation via Duality with DPCM

Authors: Or Ordentlich, Uri Erez

Abstract: Sampling above the Nyquist rate is at the heart of sigma-delta modulation, where the increase in sampling rate is translated to a reduction in the overall (mean-squared-error) reconstruction distortion. This is attained by using a feedback filter at the encoder, in conjunction with a low-pass filter at the decoder. The goal of this work is to characterize the optimal trade-off between the per-sample quantization rate and the resulting mean-squared-error distortion, under various restrictions on the feedback filter. To this end, we establish a duality relation between the performance of sigma-delta modulation, and that of differential pulse-code modulation when applied to (discrete-time) band-limited inputs. As the optimal trade-off for the latter scheme is fully understood, the full characterization for sigma-delta modulation, as well as the optimal feedback filters, immediately follow.

Submitted 9 June, 2015; v1 submitted 8 January, 2015; originally announced January 2015.

arXiv:1412.8670 [pdf, ps, other]

A VC-dimension-based Outer Bound on the Zero-Error Capacity of the Binary Adder Channel

Authors: Or Ordentlich, Ofer Shayevitz

Abstract: The binary adder is a two-user multiple access channel whose inputs are binary and whose output is the real sum of the inputs. While the Shannon capacity region of this channel is well known, little is known regarding its zero-error capacity region, and a large gap remains between the best inner and outer bounds. In this paper, we provide an improved outer bound for this problem. To that end, we introduce a soft variation of the Sauer-Perles-Shelah Lemma, that is then used in conjunction with an outer bound for the Shannon capacity region with an additional common message.

Submitted 30 December, 2014; originally announced December 2014.

Comments: Submitted to ISIT 2015. An extended version titled "An Upper Bound on the Sizes of Multiset-Union-Free Families" is available online at arXiv:1412.8415

arXiv:1412.8415 [pdf, ps, other]

An Upper Bound on the Sizes of Multiset-Union-Free Families

Authors: Or Ordentlich, Ofer Shayevitz

Abstract: Let $\mathcal{F}_1$ and $\mathcal{F}_2$ be two families of subsets of an $n$-element set. We say that $\mathcal{F}_1$ and $\mathcal{F}_2$ are multiset-union-free if for any $A,B\in \mathcal{F}_1$ and $C,D\in \mathcal{F}_2$ the multisets $A\uplus C$ and $B\uplus D$ are different, unless both $A = B$ and $C= D$. We derive a new upper bound on the maximal sizes of multiset-union-free pairs, improving a result of Urbanke and Li.

Submitted 29 December, 2014; originally announced December 2014.

Comments: A shorter ISIT conference version titled "VC-Dimension Based Outer Bound on the Zero-Error Capacity of the Binary Adder Channel" is available

arXiv:1411.0443 [pdf, ps, other]

Subset-Universal Lossy Compression

Authors: Or Ordentlich, Ofer Shayevitz

Abstract: A lossy source code $\mathcal{C}$ with rate $R$ for a discrete memoryless source $S$ is called subset-universal if for every $0<R'< R$, almost every subset of $2^{nR'}$ of its codewords achieves average distortion close to the source's distortion-rate function $D(R')$. In this paper we prove the asymptotic existence of such codes. Moreover, we show the asymptotic existence of a code that is subset-universal with respect to all sources with the same alphabet.

Submitted 12 March, 2015; v1 submitted 3 November, 2014; originally announced November 2014.

Comments: To be presented at the 2015 IEEE Information Theory Workshop

arXiv:1308.6552 [pdf, ps, other]

Integer-Forcing Source Coding

Authors: Or Ordentlich, Uri Erez

Abstract: Integer-Forcing (IF) is a new framework, based on compute-and-forward, for decoding multiple integer linear combinations from the output of a Gaussian multiple-input multiple-output channel. This work applies the IF approach to arrive at a new low-complexity scheme, IF source coding, for distributed lossy compression of correlated Gaussian sources under a minimum mean squared error distortion measure. All encoders use the same nested lattice codebook. Each encoder quantizes its observation using the fine lattice as a quantizer and reduces the result modulo the coarse lattice, which plays the role of binning. Rather than directly recovering the individual quantized signals, the decoder first recovers a full-rank set of judiciously chosen integer linear combinations of the quantized signals, and then inverts it. In general, the linear combinations have smaller average powers than the original signals. This allows increasing the density of the coarse lattice, which in turn translates to smaller compression rates. We also propose and analyze a one-shot version of IF source coding, that is simple enough to potentially lead to a new design principle for analog-to-digital converters that can exploit spatial correlations between the sampled signals.

Submitted 29 August, 2013; originally announced August 2013.

Comments: Submitted to IEEE Transactions on Information Theory

arXiv:1307.2105 [pdf, ps, other]

Successive Integer-Forcing and its Sum-Rate Optimality

Authors: Or Ordentlich, Uri Erez, Bobak Nazer

Abstract: Integer-forcing receivers generalize traditional linear receivers for the multiple-input multiple-output channel by decoding integer-linear combinations of the transmitted streams, rather than the streams themselves. Previous works have shown that the additional degree of freedom in choosing the integer coefficients enables this receiver to approach the performance of maximum-likelihood decoding in various scenarios. Nonetheless, even for the optimal choice of integer coefficients, the additive noise at the equalizer's output is still correlated. In this work we study a variant of integer-forcing, termed successive integer-forcing, that exploits these noise correlations to improve performance. This scheme is the integer-forcing counterpart of successive interference cancellation for traditional linear receivers. Similarly to the latter, we show that successive integer-forcing is capacity achieving when it is possible to optimize the rate allocation to the different streams. In comparison to standard successive interference cancellation receivers, the successive integer-forcing receiver offers more possibilities for capacity achieving rate tuples, and in particular, ones that are more balanced.

Submitted 8 July, 2013; originally announced July 2013.

Comments: A shorter version was submitted to the 51st Allerton Conference

arXiv:1301.6393 [pdf, ps, other]

Precoded Integer-Forcing Universally Achieves the MIMO Capacity to Within a Constant Gap

Authors: Or Ordentlich, Uri Erez

Abstract: An open-loop single-user multiple-input multiple-output communication scheme is considered where a transmitter, equipped with multiple antennas, encodes the data into independent streams all taken from the same linear code. The coded streams are then linearly precoded using the encoding matrix of a perfect linear dispersion space-time code. At the receiver side, integer-forcing equalization is applied, followed by standard single-stream decoding. It is shown that this communication architecture achieves the capacity of any Gaussian multiple-input multiple-output channel up to a gap that depends only on the number of transmit antennas.

Submitted 3 November, 2014; v1 submitted 27 January, 2013; originally announced January 2013.

Comments: to appear in the IEEE Transactions on Information Theory

arXiv:1209.5083 [pdf, ps, other]

A Simple Proof for the Existence of "Good" Pairs of Nested Lattices

Authors: Or Ordentlich, Uri Erez

Abstract: This paper provides a simplified proof for the existence of nested lattice codebooks that achieve the capacity of the additive white Gaussian noise channel, as well as the optimal rate-distortion trade-off for a Gaussian source. The proof is self-contained and relies only on basic probabilistic and geometrical arguments. An ensemble of nested lattices that is different from, and more elementary than, the one used in previous proofs is introduced. This ensemble is based on lifting different subcodes of a linear code to the Euclidean space using Construction A. In addition to being simpler, our analysis is less sensitive to the assumption that the additive noise is Gaussian. In particular, for additive ergodic noise channels it is shown that the achievable rates of the nested lattice coding scheme depend on the noise distribution only via its power. Similarly, the nested lattice source coding scheme attains the same rate-distortion trade-off for all ergodic sources with the same second moment.

Submitted 7 August, 2015; v1 submitted 23 September, 2012; originally announced September 2012.
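
The abstract above mentions lifting subcodes of a linear code to Euclidean space using Construction A. As a rough illustration only, the following minimal sketch builds a toy nested pair: an integer point belongs to the Construction A lattice of a code over Z_p exactly when its coordinate-wise reduction mod p is a codeword. The prime `P = 3`, the generator rows `G_FINE` and `G_COARSE`, and the function names are hypothetical choices made for this example, not taken from the paper, and the scaling and random-ensemble aspects of the actual proof are omitted.

```python
from itertools import product

P = 3  # hypothetical toy prime; the paper's ensemble is not reproduced here

def span_mod_p(rows, p=P):
    """Enumerate all codewords of the linear code over Z_p spanned by `rows`."""
    n = len(rows[0])
    words = set()
    for coeffs in product(range(p), repeat=len(rows)):
        word = tuple(
            sum(c * r[i] for c, r in zip(coeffs, rows)) % p for i in range(n)
        )
        words.add(word)
    return words

def in_construction_a(point, code, p=P):
    """An integer point lies in the lattice code + p*Z^n iff (point mod p) is a codeword."""
    return tuple(x % p for x in point) in code

# Hypothetical generator rows: the coarse code is a subcode of the fine code,
# so the coarse lattice is nested inside the fine lattice.
G_FINE = [(1, 0, 2), (0, 1, 1)]
G_COARSE = G_FINE[:1]

FINE = span_mod_p(G_FINE)
COARSE = span_mod_p(G_COARSE)
```

Under these assumptions, `in_construction_a((1, 1, 0), FINE)` holds while `in_construction_a((1, 1, 0), COARSE)` does not, which is the nesting the abstract refers to in miniature.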

arXiv:1206.0197 [pdf, ps, other]

doi 10.1109/ISIT.2012.6283726

The Approximate Sum Capacity of the Symmetric Gaussian K-User Interference Channel

Authors: Or Ordentlich, Uri Erez, Bobak Nazer

Abstract: Interference alignment has emerged as a powerful tool in the analysis of multi-user networks. Despite considerable recent progress, the capacity region of the Gaussian K-user interference channel is still unknown in general, in part due to the challenges associated with alignment on the signal scale using lattice codes. This paper develops a new framework for lattice interference alignment, based on the compute-and-forward approach. Within this framework, each receiver decodes by first recovering two or more linear combinations of the transmitted codewords with integer-valued coefficients and then solving these equations for its desired codeword. For the special case of symmetric channel gains, this framework is used to derive the approximate sum capacity of the Gaussian interference channel, up to an explicitly defined outage set of the channel gains. The key contributions are the capacity lower bounds for the weak through strong interference regimes, where each receiver should jointly decode its own codeword along with part of the interfering codewords. As part of the analysis, it is shown that decoding K linear combinations of the codewords can approach the sum capacity of the K-user Gaussian multiple-access channel up to a gap of no more than K log(K)/2 bits.

Submitted 19 March, 2014; v1 submitted 1 June, 2012; originally announced June 2012.

Comments: Accepted for publication in the IEEE Transactions on Information Theory

arXiv:1104.5456 [pdf, ps, other]

Interference Alignment at Finite SNR for Time-Invariant Channels

Authors: Or Ordentlich, Uri Erez

Abstract: An achievable rate region, based on lattice interference alignment, is derived for a class of time-invariant Gaussian interference channels with more than two users. The result is established via a new coding theorem for the two-user Gaussian multiple-access channel where both users use a single linear code. The class of interference channels treated is such that all interference channel gains are rational. For this class of interference channels, beyond recovering the known results on the degrees of freedom, an explicit rate region is derived for finite signal-to-noise ratios, shedding light on the nature of previously established asymptotic results.

Submitted 19 June, 2011; v1 submitted 28 April, 2011; originally announced April 2011.

arXiv:1012.5553 [pdf, ps, other]

Cyclic-Coded Integer-Forcing Equalization

Authors: Or Ordentlich, Uri Erez

Abstract: A discrete-time intersymbol interference channel with additive Gaussian noise is considered, where only the receiver has knowledge of the channel impulse response. An approach for combining decision-feedback equalization with channel coding is proposed, where decoding precedes the removal of intersymbol interference. This is accomplished by combining the recently proposed integer-forcing equalization approach with cyclic block codes. The channel impulse response is linearly equalized to an integer-valued response. This is then utilized by leveraging the property that a cyclic code is closed under (cyclic) integer-valued convolution. Explicit bounds on the performance of the proposed scheme are also derived.

Submitted 23 June, 2011; v1 submitted 26 December, 2010; originally announced December 2010.
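
The closure property named in the abstract above can be checked on a small case. The sketch below is an illustration under assumptions, not the paper's scheme: it uses the binary (7,4) cyclic Hamming code with generator polynomial g(x) = 1 + x + x^3, picked here only as a convenient example, and verifies by enumeration that cyclically convolving any codeword with an integer-valued filter and reducing mod 2 yields another codeword. The function names and the filter taps are hypothetical.

```python
from itertools import product

N = 7              # block length of the toy cyclic code
G = [1, 1, 0, 1]   # g(x) = 1 + x + x^3, lowest-degree coefficient first

def cyclic_convolve_mod2(a, b, n=N):
    """Cyclic (length-n) convolution of integer sequences a and b, reduced mod 2."""
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = (i + j) % n
            out[k] = (out[k] + ai * bj) % 2
    return tuple(out)

def codebook():
    """All 16 codewords m(x) * g(x) for 4-bit messages m."""
    return {cyclic_convolve_mod2(msg, G) for msg in product([0, 1], repeat=4)}

CODE = codebook()

def closed_under_integer_filter(h):
    """Check that every codeword stays in the code after cyclic convolution with h."""
    return all(cyclic_convolve_mod2(c, h) in CODE for c in CODE)
```

For instance, `closed_under_integer_filter([3, -2, 0, 5])` holds, since every output is still a multiple of g(x) modulo x^7 - 1 over GF(2).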

Showing 1–43 of 43 results for author: Ordentlich, O