Search | arXiv e-print repository

arXiv:2405.20782 [pdf, other]

Universal Exact Compression of Differentially Private Mechanisms

Authors: Yanxiao Liu, Wei-Ning Chen, Ayfer Özgür, Cheuk Ting Li

Abstract: To reduce the communication cost of differential privacy mechanisms, we introduce a novel construction, called Poisson private representation (PPR), designed to compress and simulate any local randomizer while ensuring local differential privacy. Unlike previous simulation-based local differential privacy mechanisms, PPR exactly preserves the joint distribution of the data and the output of the original local randomizer. Hence, the PPR-compressed privacy mechanism retains all desirable statistical properties of the original privacy mechanism such as unbiasedness and Gaussianity. Moreover, PPR achieves a compression size within a logarithmic gap from the theoretical lower bound. Using the PPR, we give a new order-wise trade-off between communication, accuracy, central and local differential privacy for distributed mean estimation. Experiment results on distributed mean estimation show that PPR consistently gives a better trade-off between communication, accuracy and central differential privacy compared to the coordinate subsampled Gaussian mechanism, while also providing local differential privacy.

Submitted 28 May, 2024; originally announced May 2024.

Comments: 30 pages, 3 figures

arXiv:2405.07493 [pdf, ps, other]

Variable-Length Secret Key Agreement via Random Stopping Time

Authors: Junda Zhou, Cheuk Ting Li

Abstract: We consider a key agreement setting where two parties observe correlated random sources, and want to agree on a secret key via public discussions. In order to allow the key length to adapt to the realizations of the random sources, we allow the key to be of variable length, subject to a novel variable-length version of the uniformity constraint based on random stopping time. We propose simple, computationally efficient key agreement schemes under the new constraint. The proposed scheme can be considered as the key agreement analogue of variable-length source coding via Huffman coding, and the Knuth-Yao random number generator.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 8 pages

arXiv:2405.07107 [pdf, other]

A Pair of Bayesian Network Structures has Undecidable Conditional Independencies

Authors: Cheuk Ting Li

Abstract: Given a Bayesian network structure (directed acyclic graph), the celebrated d-separation algorithm efficiently determines whether the network structure implies a given conditional independence relation. We show that this changes drastically when we consider two Bayesian network structures instead. It is undecidable to determine whether two given network structures imply a given conditional independency, that is, whether every collection of random variables satisfying both network structures must also satisfy the conditional independency. Although the approximate combination of two Bayesian networks is a well-studied topic, our result shows that it is fundamentally impossible to accurately combine the knowledge of two Bayesian network structures, in the sense that no algorithm can tell what conditional independencies are implied by the two network structures. We can also explicitly construct two Bayesian network structures, such that whether they imply a certain conditional independency is unprovable in the ZFC set theory, assuming ZFC is consistent.

Submitted 11 May, 2024; originally announced May 2024.

Comments: 13 pages, 2 figures

arXiv:2405.02700 [pdf, other]

Identification of Novel Modes in Generative Models via Fourier-based Differential Clustering

Authors: Jingwei Zhang, Mohammad Jalali, Cheuk Ting Li, Farzan Farnia

Abstract: An interpretable comparison of generative models requires the identification of sample types produced more frequently by each of the involved models. While several quantitative scores have been proposed in the literature to rank different generative models, such score-based evaluations do not reveal the nuanced differences between the generative models in capturing various sample types. In this work, we attempt to solve a differential clustering problem to detect sample types expressed differently by two generative models. To solve the differential clustering problem, we propose a method called Fourier-based Identification of Novel Clusters (FINC) to identify modes produced by a generative model with a higher frequency in comparison to a reference distribution. FINC provides a scalable stochastic algorithm based on random Fourier features to estimate the eigenspace of kernel covariance matrices of two generative models and utilize the principal eigendirections to detect the sample types present more dominantly in each model. We demonstrate the application of the FINC method to large-scale computer vision datasets and generative model frameworks. Our numerical results suggest the scalability of the developed Fourier-based method in highlighting the sample types produced with different frequencies by widely-used generative models. Code is available at https://github.com/buyeah1109/FINC

Submitted 4 July, 2024; v1 submitted 4 May, 2024; originally announced May 2024.

arXiv:2402.17287 [pdf, other]

An Interpretable Evaluation of Entropy-based Novelty of Generative Models

Authors: Jingwei Zhang, Cheuk Ting Li, Farzan Farnia

Abstract: The massive developments of generative model frameworks require principled methods for the evaluation of a model's novelty compared to a reference dataset. While the literature has extensively studied the evaluation of the quality, diversity, and generalizability of generative models, the assessment of a model's novelty compared to a reference model has not been adequately explored in the machine learning community. In this work, we focus on the novelty assessment for multi-modal distributions and attempt to address the following differential clustering task: Given samples of a generative model $P_\mathcal{G}$ and a reference model $P_\mathrm{ref}$, how can we discover the sample types expressed by $P_\mathcal{G}$ more frequently than in $P_\mathrm{ref}$? We introduce a spectral approach to the differential clustering task and propose the Kernel-based Entropic Novelty (KEN) score to quantify the mode-based novelty of $P_\mathcal{G}$ with respect to $P_\mathrm{ref}$. We analyze the KEN score for mixture distributions with well-separable components and develop a kernel-based method to compute the KEN score from empirical data. We support the KEN framework by presenting numerical results on synthetic and real image datasets, indicating the framework's effectiveness in detecting novel modes and comparing generative models. The paper's code is available at: www.github.com/buyeah1109/KEN

Submitted 13 June, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

arXiv:2402.06021 [pdf, other]

One-Shot Coding over General Noisy Networks

Authors: Yanxiao Liu, Cheuk Ting Li

Abstract: We present a unified one-shot coding framework designed for communication and compression of messages among multiple nodes across a general acyclic noisy network. Our setting can be seen as a one-shot version of the acyclic discrete memoryless network studied by Lee and Chung, and noisy network coding studied by Lim, Kim, El Gamal and Chung. We design a proof technique, called the exponential process refinement lemma, that is rooted in the Poisson matching lemma by Li and Anantharam, and can significantly simplify the analyses of one-shot coding over multi-hop networks. Our one-shot coding theorem not only recovers a wide range of existing asymptotic results, but also yields novel one-shot achievability results in different multi-hop network information theory problems. In a broader context, our framework provides a unified one-shot bound applicable to any combination of source coding, channel coding and coding for computing problems.

Submitted 8 February, 2024; originally announced February 2024.

arXiv:2402.03030 [pdf, other]

Rejection-Sampled Universal Quantization for Smaller Quantization Errors

Authors: Chih Wei Ling, Cheuk Ting Li

Abstract: We construct a randomized vector quantizer which has a smaller maximum error compared to all known lattice quantizers with the same entropy for dimensions 5, 6, ..., 48, and also has a smaller mean squared error compared to known lattice quantizers with the same entropy for dimensions 35, ..., 48, in the high resolution limit. Moreover, our randomized quantizer has a desirable property that the quantization error is always uniform over the ball and independent of the input. Our construction is based on applying rejection sampling on universal quantization, which allows us to shape the error distribution to be any continuous distribution, not only uniform distributions over basic cells of a lattice as in conventional dithered quantization. We also characterize the high SNR limit of one-shot channel simulation for any additive noise channel under a mild assumption (e.g., the AWGN channel), up to an additive constant of 1.45 bits.

Submitted 5 February, 2024; originally announced February 2024.

Comments: 15 pages, 2 figures

arXiv:2401.14805 [pdf, ps, other]

Pointwise Redundancy in One-Shot Lossy Compression via Poisson Functional Representation

Authors: Cheuk Ting Li

Abstract: We study different notions of pointwise redundancy in variable-length lossy source coding. We present a construction of one-shot variable-length lossy source coding schemes using the Poisson functional representation, and give bounds on its pointwise redundancy for various definitions of pointwise redundancy. This allows us to describe the distribution of the encoding length in a precise manner. We also generalize the result to the one-shot lossy Gray-Wyner system.

Submitted 26 January, 2024; originally announced January 2024.

Comments: 9 pages, short version to be presented at 2024 International Zurich Seminar on Information and Communication

arXiv:2310.20682 [pdf, other]

Compression with Exact Error Distribution for Federated Learning

Authors: Mahmoud Hegazy, Rémi Leluc, Cheuk Ting Li, Aymeric Dieuleveut

Abstract: Compression schemes have been extensively used in Federated Learning (FL) to reduce the communication cost of distributed learning. While most approaches rely on a bounded variance assumption of the noise produced by the compressor, this paper investigates the use of compression and aggregation schemes that produce a specific error distribution, e.g., Gaussian or Laplace, on the aggregated data. We present and analyze different aggregation schemes based on layered quantizers achieving exact error distribution. We provide different methods to leverage the proposed compression schemes to obtain compression-for-free in differential privacy applications. Our general compression methods can recover and improve standard FL schemes with Gaussian perturbations such as Langevin dynamics and randomized smoothing.

Submitted 31 October, 2023; originally announced October 2023.

arXiv:2309.06982 [pdf, other]

Communication-Efficient Laplace Mechanism for Differential Privacy via Random Quantization

Authors: Ali Moradi Shahmiri, Chih Wei Ling, Cheuk Ting Li

Abstract: We propose the first method that realizes the Laplace mechanism exactly (i.e., a Laplace noise is added to the data) that requires only a finite amount of communication (whereas the original Laplace mechanism requires the transmission of a real number) while guaranteeing privacy against the server and database. Our mechanism can serve as a drop-in replacement for local or centralized differential privacy applications where the Laplace mechanism is used. Our mechanism is constructed using a random quantization technique. Unlike the simple and prevalent Laplace-mechanism-then-quantize approach, the quantization in our mechanism does not result in any distortion or degradation of utility. Unlike existing dithered quantization and channel simulation schemes for simulating additive Laplacian noise, our mechanism guarantees privacy not only against the database and downstream, but also against the honest but curious server which attempts to decode the data using the dither signals.

Submitted 13 September, 2023; originally announced September 2023.

Comments: 11 pages, 3 figures, short version to be submitted at 2024 IEEE International Conference on Acoustics, Speech and Signal Processing

arXiv:2308.05742 [pdf, other]

A Characterization of Entropy as a Universal Monoidal Natural Transformation

Authors: Cheuk Ting Li

Abstract: We show that the essential properties of entropy (monotonicity, additivity and subadditivity) are consequences of entropy being a monoidal natural transformation from the under category functor $-/\mathsf{LProb}_ρ$ (where $\mathsf{LProb}_ρ$ is the category of $ρ$-th-power-summable probability distributions, $0<ρ<1$) to $Δ_{\mathbb{R}}$. Moreover, the Shannon entropy can be characterized as the universal monoidal natural transformation from $-/\mathsf{LProb}_ρ$ to the category of integrally closed partially ordered abelian groups (a reflective subcategory of the lax-slice 2-category over $\mathsf{MonCat}_{\ell}$ in the 2-category of monoidal categories), providing a succinct characterization of Shannon entropy as a reflection arrow. We can likewise define entropy for every monoidal category with a monoidal structure on its under categories (e.g. the category of finite abelian groups, the category of finite inhabited sets, the category of finite dimensional vector spaces, and the augmented simplex category) via the reflection arrow. This implies that all these entropies over different categories are components of a single natural transformation (the unit of the idempotent monad), allowing us to connect these entropies in a natural manner. We also provide a universal characterization of the conditional Shannon entropy based on the chain rule which, unlike the characterization of information loss by Baez, Fritz and Leinster, does not require any continuity assumption.

Submitted 14 April, 2024; v1 submitted 10 August, 2023; originally announced August 2023.

Comments: 55 pages, 2 figures

arXiv:2307.07506 [pdf, other]

A Poisson Decomposition for Information and the Information-Event Diagram

Authors: Cheuk Ting Li

Abstract: Information diagram and the I-measure are useful mnemonics where random variables are treated as sets, and entropy and mutual information are treated as a signed measure. Although the I-measure has been successful in machine proofs of entropy inequalities, the theoretical underpinning of the "random variables as sets" analogy has been unclear until the recent works on mappings from random variables to sets by Ellerman (recovering order-$2$ Tsallis entropy over general probability space), and Down and Mediano (recovering Shannon entropy over discrete probability space). We generalize these constructions by designing a mapping which recovers the Shannon entropy (and the information density) over general probability space. Moreover, it has an intuitive interpretation based on the arrival time in a Poisson process, allowing us to understand the union, intersection and difference between (sets corresponding to) random variables and events. Cross entropy, KL divergence, and conditional entropy given an event, can be obtained as set intersections. We propose a generalization of the information diagram that also includes events, and demonstrate its usage by a diagrammatic proof of Fano's inequality.

Submitted 14 July, 2023; originally announced July 2023.

Comments: 18 pages, 6 figures

arXiv:2305.07593 [pdf, ps, other]

Unconditionally Secure Access Control Encryption

Authors: Cheuk Ting Li, Sherman S. M. Chow

Abstract: Access control encryption (ACE) enforces, through a sanitizer as the mediator, that only legitimate sender-receiver pairs can communicate, without the sanitizer knowing the communication metadata, including its sender and recipient identity, the policy over them, and the underlying plaintext. Any illegitimate transmission is indistinguishable from pure noise. Existing works focused on computational security and require trapdoor functions and possibly other heavyweight primitives. We present the first ACE scheme with information-theoretic security (unconditionally against unbounded adversaries). Our novel randomization techniques over matrices realize sanitization (traditionally via homomorphism over a fixed randomness space) such that the secret message in the hidden message subspace remains intact if and only if there is no illegitimate transmission.

Submitted 12 May, 2023; originally announced May 2023.

Comments: 10 pages. This is the long version of a paper to be presented at 2023 IEEE International Symposium on Information Theory

arXiv:2305.06788 [pdf, other]

Vector Quantization with Error Uniformly Distributed over an Arbitrary Set

Authors: Chih Wei Ling, Cheuk Ting Li

Abstract: For uniform scalar quantization, the error distribution is approximately a uniform distribution over an interval (which is also a 1-dimensional ball). Nevertheless, for lattice vector quantization, the error distribution is uniform not over a ball, but over the basic cell of the quantization lattice. In this paper, we construct vector quantizers with periodic properties, where the error is uniformly distributed over the n-ball, or any other prescribed set. We then prove upper and lower bounds on the entropy of the quantized signals. We also discuss how our construction can be applied to give a randomized quantization scheme with a nonuniform error distribution.

Submitted 24 January, 2024; v1 submitted 11 May, 2023; originally announced May 2023.

Comments: 22 pages, 3 figures. Short version presented at 2023 IEEE International Symposium on Information Theory

arXiv:2205.11461 [pdf, other]

doi 10.1109/TIT.2023.3247570

Undecidability of Network Coding, Conditional Information Inequalities, and Conditional Independence Implication

Authors: Cheuk Ting Li

Abstract: We resolve three long-standing open problems, namely the (algorithmic) decidability of network coding, the decidability of conditional information inequalities, and the decidability of conditional independence implication among random variables, by showing that these problems are undecidable. The proof utilizes a construction inspired by Herrmann's arguments on embedded multivalued database dependencies, a network studied by Dougherty, Freiling and Zeger, together with a novel construction to represent group automorphisms on top of the network.

Submitted 29 May, 2022; v1 submitted 23 May, 2022; originally announced May 2022.

Comments: 20 pages, 8 figures

Journal ref: Published in IEEE Transactions on Information Theory (Volume: 69, Issue: 6, June 2023)

arXiv:2201.10171 [pdf, other]

Weighted Parity-Check Codes for Channels with State and Asymmetric Channels

Authors: Chih Wei Ling, Yanxiao Liu, Cheuk Ting Li

Abstract: In this paper, we introduce a new class of codes, called weighted parity-check codes, where each parity-check bit has a weight that indicates its likelihood to be one (instead of fixing each parity-check bit to be zero). It is applicable to a wide range of settings, e.g. asymmetric channels, channels with state and/or cost constraints, and the Wyner-Ziv problem, and can provably achieve the capacity. For the channels with state (Gelfand-Pinsker) setting, the proposed coding scheme has two advantages compared to the nested linear code. First, it achieves the capacity of any channel with state (e.g. asymmetric channels). Second, simulation results show that the proposed code achieves a smaller error rate compared to the nested linear code. We also discuss a sparse construction where the belief propagation algorithm can be applied to improve the coding efficiency.

Submitted 30 May, 2023; v1 submitted 25 January, 2022; originally announced January 2022.

Comments: 17 pages, 4 figures. This is the full version of a paper presented at 2022 IEEE International Symposium on Information Theory (ISIT)

arXiv:2201.03032 [pdf, other]

Arithmetic Network Coding for Secret Sum Computation

Authors: Sijie Li, Cheuk Ting Li

Abstract: We consider a network coding problem where the destination wants to recover the sum of the signals (Gaussian random variables or random finite field elements) at all the source nodes, but the sum must be kept secret from an eavesdropper that can wiretap on a subset of edges. This setting arises naturally in sensor networks and federated learning, where the secrecy of the sum of the signals (e.g. weights, gradients) may be desired. While the case for finite field can be solved, the case for Gaussian random variables is surprisingly difficult. We give a simple conjecture on the necessary and sufficient condition under which such secret computation is possible for the Gaussian case, and prove the conjecture when the number of wiretapped edges is at most 2.

Submitted 9 January, 2022; originally announced January 2022.

arXiv:2109.08991 [pdf, other]

The Undecidability of Network Coding with some Fixed-Size Messages and Edges

Authors: Cheuk Ting Li

Abstract: We consider a network coding setting where some of the messages and edges have fixed alphabet sizes that do not change when we increase the common alphabet size of the rest of the messages and edges. We prove that the problem of deciding whether such a network admits a coding scheme is undecidable. This can be considered as a partial solution to the conjecture that network coding (without fixed-size messages/edges) is undecidable. The proof, which makes heavy use of analogies with digital circuits, is essentially constructing a digital circuit of logic gates and flip-flops within a network coding model that is capable of simulating an arbitrary Turing machine.

Submitted 10 February, 2022; v1 submitted 18 September, 2021; originally announced September 2021.

Comments: 12 pages, 8 figures

arXiv:2108.07324 [pdf, ps, other]

First-Order Theory of Probabilistic Independence and Single-Letter Characterizations of Capacity Regions

Authors: Cheuk Ting Li

Abstract: We consider the first-order theory of random variables with the probabilistic independence relation, which concerns statements consisting of random variables, the probabilistic independence symbol, logical operators, and existential and universal quantifiers. Although probabilistic independence is the only non-logical relation included, this theory is surprisingly expressive, and is able to interpret the true first-order arithmetic over natural numbers (and hence is undecidable). We also construct a single-letter characterization of the capacity region for a general class of multiuser coding settings (including broadcast channel, interference channel and relay channel), using a first-order formula. We then introduce the linear entropy hierarchy to classify single-letter characterizations according to their complexity.

Submitted 16 August, 2021; originally announced August 2021.

Comments: 23 pages

arXiv:2105.01045 [pdf, ps, other]

Multiple-Output Channel Simulation and Lossy Compression of Probability Distributions

Authors: Chak Fung Choi, Cheuk Ting Li

Abstract: We consider a variant of the channel simulation problem with a single input and multiple outputs, where Alice observes a probability distribution $P$ from a set of prescribed probability distributions $\mathcal{P}$, and sends a prefix-free codeword $W$ to Bob to allow him to generate $n$ i.i.d. random variables $X_{1},X_{2},\ldots,X_{n}$ which follow the distribution $P$. This can also be regarded as a lossy compression setting for probability distributions. This paper describes encoding schemes for three cases of $P$: $P$ is a distribution over positive integers, $P$ is a continuous distribution over $[0,1]$ with a non-increasing pdf, and $P$ is a continuous distribution over $[0,\infty)$ with a non-increasing pdf. We show that the growth rate of the expected codeword length is sub-linear in $n$ when a power law bound is satisfied. An application of multiple-outputs channel simulation is the compression of probability distributions.

Submitted 4 September, 2021; v1 submitted 3 May, 2021; originally announced May 2021.

Comments: 11 pages, 3 figures

arXiv:2104.05634 [pdf, other]

The Undecidability of Conditional Affine Information Inequalities and Conditional Independence Implication with a Binary Constraint

Authors: Cheuk Ting Li

Abstract: We establish the undecidability of conditional affine information inequalities, the undecidability of the conditional independence implication problem with a constraint that one random variable is binary, and the undecidability of the problem of deciding whether the intersection of the entropic region and a given affine subspace is empty. This is a step towards the conjecture on the undecidability of conditional independence implication. The undecidability is proved via a reduction from the periodic tiling problem (a variant of the domino problem). Hence, one can construct examples of the aforementioned problems that are independent of ZFC (assuming ZFC is consistent).

Submitted 10 February, 2022; v1 submitted 12 April, 2021; originally announced April 2021.

Comments: 19 pages, 7 figures, presented in part at the 2021 IEEE Information Theory Workshop

arXiv:2101.12370 [pdf, ps, other]

An Automated Theorem Proving Framework for Information-Theoretic Results

Authors: Cheuk Ting Li

Abstract: We present a versatile automated theorem proving framework capable of automated discovery, simplification and proofs of inner and outer bounds in network information theory, deduction of properties of information-theoretic quantities (e.g. Wyner and Gács-Körner common information), and discovery of non-Shannon-type inequalities, under a unified framework. Our implementation successfully generated proofs for 32 out of 56 theorems in Chapters 1-14 of the book Network Information Theory by El Gamal and Kim. Our framework is based on the concept of existential information inequalities, which provides an axiomatic framework for a wide range of problems in information theory.

Submitted 11 July, 2022; v1 submitted 28 January, 2021; originally announced January 2021.

Comments: 27 pages, presented in part at the IEEE International Symposium on Information Theory 2021

arXiv:2011.07270 [pdf, other]

doi 10.1214/22-EJS2072

Species Abundance Distribution and Species Accumulation Curve: A General Framework and Results

Authors: Cheuk Ting Li, Kim-Hung Li

Abstract: We build a general framework which establishes a one-to-one correspondence between species abundance distribution (SAD) and species accumulation curve (SAC). The appearance rates of the species and the appearance times of individuals of each species are modeled as Poisson processes. The number of species can be finite or infinite. Hill numbers are extended to the framework. We introduce a linear derivative ratio family of models, $\mathrm{LDR}_1$, of which the ratio of the first and the second derivatives of the expected SAC is a linear function. A D1/D2 plot is proposed to detect this linear pattern in the data. By extrapolation of the curve in the D1/D2 plot, a species richness estimator that extends Chao1 estimator is introduced. The SAD of $\mathrm{LDR}_1$ is the Engen's extended negative binomial distribution, and the SAC encompasses several popular parametric forms including the power law. Family $\mathrm{LDR}_1$ is extended in two ways: $\mathrm{LDR}_2$ which allows species with zero detection probability, and $\mathrm{RDR}_1$ where the derivative ratio is a rational function. Real data are analyzed to demonstrate the proposed methods. We also consider the scenario where we record only a few leading appearance times of each species. We show how maximum likelihood inference can be performed when only the empirical SAC is observed, and elucidate its advantages over the traditional curve-fitting method.

Submitted 23 October, 2022; v1 submitted 14 November, 2020; originally announced November 2020.

Comments: 49 pages, 5 figures

Journal ref: Electron. J. Statist. 16 (2) 5488 - 5533, 2022

arXiv:2008.06092 [pdf, ps, other]

doi 10.1109/TMTT.2020.3008784

Infinite Divisibility of Information

Authors: Cheuk Ting Li

Abstract: We study an information analogue of infinitely divisible probability distributions, where the i.i.d. sum is replaced by the joint distribution of an i.i.d. sequence. A random variable $X$ is called informationally infinitely divisible if, for any $n\ge1$, there exists an i.i.d. sequence of random variables $Z_{1},\ldots,Z_{n}$ that contains the same information as $X$, i.e., there exists an injective function $f$ such that $X=f(Z_{1},\ldots,Z_{n})$. While there does not exist an informationally infinitely divisible discrete random variable, we show that any discrete random variable $X$ has a bounded multiplicative gap to infinite divisibility, that is, if we remove the injectivity requirement on $f$, then there exist i.i.d. $Z_{1},\ldots,Z_{n}$ and $f$ satisfying $X=f(Z_{1},\ldots,Z_{n})$, and the entropy satisfies $H(X)/n\le H(Z_{1})\le1.59H(X)/n+2.43$. We also study a new class of discrete probability distributions, called spectral infinitely divisible distributions, where we can remove the multiplicative gap $1.59$. Furthermore, we study the case where $X=(Y_{1},\ldots,Y_{m})$ is itself an i.i.d. sequence, $m\ge2$, for which the multiplicative gap $1.59$ can be replaced by $1+5\sqrt{(\log m)/m}$. This means that as $m$ increases, $(Y_{1},\ldots,Y_{m})$ becomes closer to being spectral infinitely divisible in a uniform manner. This can be regarded as an information analogue of Kolmogorov's uniform theorem.
Applications of our result include independent component analysis, distributed storage with a secrecy constraint, and distributed random number generation.

Submitted 13 August, 2020; originally announced August 2020.

Comments: 22 pages

MSC Class: 94A15; 60F05

Journal ref: in IEEE Transactions on Information Theory, vol. 68, no. 7, pp. 4257-4271, July 2022

arXiv:2006.07955 [pdf, other]

doi 10.1109/TIT.2021.3076986

Efficient Approximate Minimum Entropy Coupling of Multiple Probability Distributions

Authors: Cheuk Ting Li

Abstract: Given a collection of probability distributions $p_{1},\ldots,p_{m}$, the minimum entropy coupling is the coupling $X_{1},\ldots,X_{m}$ ($X_{i}\sim p_{i}$) with the smallest entropy $H(X_{1},\ldots,X_{m})$. While this problem is known to be NP-hard, we present an efficient algorithm for computing a coupling with entropy within 2 bits from the optimal value. More precisely, we construct a coupling with entropy within 2 bits from the entropy of the greatest lower bound of $p_{1},\ldots,p_{m}$ with respect to majorization. This construction is also valid when the collection of distributions is infinite, and when the supports of the distributions are infinite. Potential applications of our results include random number generation, entropic causal inference, and functional representation of random variables.

Submitted 17 January, 2021; v1 submitted 14 June, 2020; originally announced June 2020.

Comments: 13 pages, 1 figure, 1 table

Journal ref: Published in IEEE Transactions on Information Theory (Volume: 67, Issue: 8, Aug. 2021)

arXiv:2006.01949 [pdf, other]

doi 10.1109/TIT.2021.3107217

Asymptotically Scale-invariant Multi-resolution Quantization

Authors: Cheuk Ting Li

Abstract: A multi-resolution quantizer is a sequence of quantizers where the output of a coarser quantizer can be deduced from the output of a finer quantizer. In this paper, we propose an asymptotically scale-invariant multi-resolution quantizer, which performs uniformly across any choice of average quantization step, when the length of the range of input numbers is large. Scale invariance is especially useful in worst case or adversarial settings, ensuring that the performance of the quantizer would not be affected greatly by small changes of storage or error requirements. We also show that the proposed quantizer achieves a tradeoff between rate and error that is arbitrarily close to the optimum.

Submitted 2 June, 2020; originally announced June 2020.

Comments: 12 pages, 2 figures. This paper is the extended version of a paper submitted to the IEEE International Symposium on Information Theory 2020

Journal ref: in IEEE Transactions on Information Theory, vol. 67, no. 11, pp. 7616-7626, Nov. 2021

arXiv:1912.06956 [pdf, other]

Pairwise Near-maximal Grand Coupling of Brownian Motions

Authors: Cheuk Ting Li, Venkat Anantharam

Abstract: The well-known reflection coupling gives a maximal coupling of two one-dimensional Brownian motions with different starting points. Nevertheless, the reflection coupling does not generalize to more than two Brownian motions. In this paper, we construct a coupling of all Brownian motions with all possible starting points (i.e., a grand coupling), such that the coupling for any pair of the coupled processes is close to being maximal, that is, the distribution of the coupling time of the pair approaches that of the maximal coupling as the time tends to $0$ or $\infty$, and the coupling time of the pair is always within a multiplicative factor $2e^{2}$ from the maximal one. We also show that a grand coupling that is pairwise exactly maximal does not exist.

Submitted 14 December, 2019; originally announced December 2019.

Comments: 18 pages, 3 figures

Journal ref: Ann. Inst. H. Poincaré Probab. Statist. 58 (3) 1621 - 1639, August 2022

arXiv:1908.01388 [pdf, other]

Pairwise Multi-marginal Optimal Transport and Embedding for Earth Mover's Distance

Authors: Cheuk Ting Li, Venkat Anantharam

Abstract: We investigate the problem of pairwise multi-marginal optimal transport, that is, given a collection of probability distributions $\{P_α\}$ on a Polish space $\mathcal{X}$, to find a coupling $\{X_α\}$, $X_α\sim P_α$, such that $\mathbf{E}[c(X_α,X_β)]\le r\inf_{X\sim P_α,Y\sim P_β}\mathbf{E}[c(X,Y)]$ for all $α,β$, where $c$ is a cost function and $r\ge1$. In other words, every pair $(X_α,X_β)$ has an expected cost at most a factor of $r$ from its lowest possible value. This can be regarded as a locality sensitive hash function for probability distributions, and has applications such as robust and distributed computation of transport plans. It can also be considered as a bi-Lipschitz embedding of the collection of probability distributions into the space of random variables taking values on $\mathcal{X}$. For $c(x,y)=\Vert x-y\Vert_2^q$ on $\mathbb{R}^n$, where $q>0$, we show that a finite $r$ is attainable if and only if either $n=1$ or $0<q<1$. As $n\to\infty$, the growth rate of the smallest possible $r$ is exactly $Θ(n^{q/2})$ if $0<q<1$. Hence, the metric space of probability distributions on $\mathbb{R}^n$ with finite $q$-th absolute moments, $0<q<1$, with the earth mover's distance (or 1-Wasserstein distance) with respect to the snowflake metric $c(x,y)=\Vert x-y\Vert_2^q$, is bi-Lipschitz embeddable into $L_1$ with distortion $O(n^{q/2})$.
If we consider $c(x,y)=\Vert x-y\Vert_2$ (i.e., $q=1$) on the grid $[0..s]^n$ instead of $\mathbb{R}^n$, then $r=O(\sqrt{n}\log s)$ is attainable, which implies the embeddability of the space of probability distributions on $[0..s]^n$ into $L_1$ with distortion $O(\sqrt{n}\log s)$, and improves upon the $O(n\log s)$ result by Indyk and Thaper. The case of the discrete metric cost $c(x,y)=\mathbf{1}\{x\neq y\}$ and more general metric and ultrametric costs are also investigated.

Submitted 20 October, 2019; v1 submitted 4 August, 2019; originally announced August 2019.

Comments: 91 pages, 3 figures

arXiv:1812.03616 [pdf, other]

doi 10.1109/TIT.2021.3058842

A Unified Framework for One-shot Achievability via the Poisson Matching Lemma

Authors: Cheuk Ting Li, Venkat Anantharam

Abstract: We introduce a fundamental lemma called the Poisson matching lemma, and apply it to prove one-shot achievability results for various settings, namely channels with state information at the encoder, lossy source coding with side information at the decoder, joint source-channel coding, broadcast channels, distributed lossy source coding, multiple access channels, channel resolvability and wiretap channels. Our one-shot bounds improve upon the best known one-shot bounds in most of the aforementioned settings (except multiple access channels, channel resolvability and wiretap channels, where we recover bounds comparable to the best known bounds), with shorter proofs in some settings even when compared to the conventional asymptotic approach using typicality. The Poisson matching lemma replaces both the packing and covering lemmas, greatly simplifying the error analysis. This paper extends the work of Li and El Gamal on Poisson functional representation, which mainly considered variable-length source coding settings, whereas this paper studies fixed-length settings, and is not limited to source coding, showing that the Poisson functional representation is a viable alternative to typicality for most problems in network information theory.

Submitted 22 February, 2019; v1 submitted 9 December, 2018; originally announced December 2018.

Comments: 28 pages, 2 figures

Journal ref: Published in IEEE Transactions on Information Theory (Volume: 67, Issue: 5, May 2021)

arXiv:1809.01793 [pdf, ps, other]

doi 10.1109/TIT.2021.3087963

One-Shot Variable-Length Secret Key Agreement Approaching Mutual Information

Authors: Cheuk Ting Li, Venkat Anantharam

Abstract: This paper studies an information-theoretic one-shot variable-length secret key agreement problem with public discussion. Let $X$ and $Y$ be jointly distributed random variables, each taking values in some measurable space. Alice and Bob observe $X$ and $Y$ respectively, can communicate interactively through a public noiseless channel, and want to agree on a key length and a key that is approximately uniformly distributed over all bit sequences with the agreed key length. The public discussion is observed by an eavesdropper, Eve. The key should be approximately independent of the public discussion, conditional on the key length. We show that the optimal expected key length is close to the mutual information $I(X;Y)$ within a logarithmic gap. Moreover, an upper bound and a lower bound on the optimal expected key length can be written down in terms of $I(X;Y)$ only. This means that the optimal one-shot performance is always within a small gap of the optimal asymptotic performance regardless of the distribution of the pair $(X,Y)$. This one-shot result may find applications in situations where the components of an i.i.d. pair source $(X^{n},Y^{n})$ are observed sequentially and the key is output bit by bit with small delay, or in situations where the random source is not an i.i.d. or ergodic process.

Submitted 20 September, 2018; v1 submitted 5 September, 2018; originally announced September 2018.

Comments: 16 pages, to be presented in part at 56th Annual Allerton Conference on Communication, Control, and Computing

Journal ref: Published in IEEE Transactions on Information Theory (Volume: 67, Issue: 8, Aug. 2021)

arXiv:1806.00071 [pdf, other]

Minimax Learning for Remote Prediction

Authors: Cheuk Ting Li, Xiugang Wu, Ayfer Ozgur, Abbas El Gamal

Abstract: The classical problem of supervised learning is to infer an accurate predictor of a target variable $Y$ from a measured variable $X$ by using a finite number of labeled training samples. Motivated by the increasingly distributed nature of data and decision making, in this paper we consider a variation of this classical problem in which the prediction is performed remotely based on a rate-constrained description $M$ of $X$. Upon receiving $M$, the remote node computes an estimate $\hat Y$ of $Y$. We follow the recent minimax approach to study this learning problem and show that it corresponds to a one-shot minimax noisy source coding problem. We then establish information theoretic bounds on the risk-rate Lagrangian cost and a general method to design a near-optimal descriptor-estimator pair, which can be viewed as a rate-constrained analog to the maximum conditional entropy principle used in the classical minimax learning problem. Our results show that a naive estimate-compress scheme for rate-constrained prediction is not in general optimal.

Submitted 4 November, 2018; v1 submitted 31 May, 2018; originally announced June 2018.

Comments: 10 pages, 4 figures, presented in part at ISIT 2018

Journal ref: IEEE Transactions on Information Theory (Volume: 66, Issue: 12, Dec. 2020)

arXiv:1701.03207 [pdf, ps, other]

Extended Gray-Wyner System with Complementary Causal Side Information

Authors: Cheuk Ting Li, Abbas El Gamal

Abstract: We establish the rate region of an extended Gray-Wyner system for 2-DMS $(X,Y)$ with two additional decoders having complementary causal side information. This extension is interesting because in addition to the operationally significant extreme points of the Gray-Wyner rate region, which include Wyner's common information, Gács-Körner common information and information bottleneck, the rate region for the extended system also includes the Körner graph entropy, the privacy funnel and excess functional information, as well as three new quantities of potential interest, as extreme points. To simplify the investigation of the 5-dimensional rate region of the extended Gray-Wyner system, we establish an equivalence of this region to a 3-dimensional mutual information region that consists of the set of all triples of the form $(I(X;U),\,I(Y;U),\,I(X,Y;U))$ for some $p_{U|X,Y}$. We further show that projections of this mutual information region yield the rate regions for many settings involving a 2-DMS, including lossless source coding with causal side information, distributed channel synthesis, and lossless source coding with a helper.

Submitted 11 January, 2017; originally announced January 2017.

Comments: 18 pages, 3 figures

Journal ref: IEEE Transactions on Information Theory, vol. 64, no. 8, pp. 5862-5878, Aug. 2018

arXiv:1701.02827 [pdf, other]

Strong Functional Representation Lemma and Applications to Coding Theorems

Authors: Cheuk Ting Li, Abbas El Gamal

Abstract: This paper shows that for any random variables $X$ and $Y$, it is possible to represent $Y$ as a function of $(X,Z)$ such that $Z$ is independent of $X$ and $I(X;Z|Y)\le\log(I(X;Y)+1)+4$ bits. We use this strong functional representation lemma (SFRL) to establish a bound on the rate needed for one-shot exact channel simulation for general (discrete or continuous) random variables, strengthening the results by Harsha et al. and Braverman and Garg, and to establish new and simple achievability results for one-shot variable-length lossy source coding, multiple description coding and Gray-Wyner system. We also show that the SFRL can be used to reduce the channel with state noncausally known at the encoder to a point-to-point channel, which provides a simple achievability proof of the Gelfand-Pinsker theorem.

Submitted 19 January, 2018; v1 submitted 10 January, 2017; originally announced January 2017.

Comments: 15 pages, 1 figure, presented in part at the IEEE International Symposium on Information Theory, Aachen, Germany, June 2017

Journal ref: IEEE Transactions on Information Theory, vol. 64, no. 11, pp. 6967-6978, Nov. 2018

arXiv:1603.05238 [pdf, ps, other]

A Universal Coding Scheme for Remote Generation of Continuous Random Variables

Authors: Cheuk Ting Li, Abbas El Gamal

Abstract: We consider a setup in which Alice selects a pdf $f$ from a set of prescribed pdfs $\mathscr{P}$ and sends a prefix-free codeword $W$ to Bob in order to allow him to generate a single instance of the random variable $X\sim f$. We describe a universal coding scheme for this setup and establish an upper bound on the expected codeword length when the pdf $f$ is bounded, orthogonally concave (which includes quasiconcave pdf), and has a finite first absolute moment. A dyadic decomposition scheme is used to express the pdf as a mixture of uniform pdfs over hypercubes. Alice randomly selects a hypercube according to its weight, encodes its position and size into $W$, and sends it to Bob who generates $X$ uniformly over the hypercube. Compared to previous results on channel simulation, our coding scheme applies to any continuous distribution and does not require two-way communication or shared randomness. We apply our coding scheme to classical simulation of quantum entanglement and obtain a better bound on the average codeword length than previously known.

Submitted 16 March, 2016; originally announced March 2016.

Comments: 13 pages, 5 figures

Journal ref: IEEE Transactions on Information Theory, vol. 64, no. 4, pp. 2583-2592, April 2018

arXiv:1601.05875 [pdf, ps, other]

Distributed Simulation of Continuous Random Variables

Authors: Cheuk Ting Li, Abbas El Gamal

Abstract: We establish the first known upper bound on the exact and Wyner's common information of $n$ continuous random variables in terms of the dual total correlation between them (which is a generalization of mutual information). In particular, we show that when the pdf of the random variables is log-concave, there is a constant gap of $n^{2}\log e+9n\log n$ between this upper bound and the dual total correlation lower bound that does not depend on the distribution. The upper bound is obtained using a computationally efficient dyadic decomposition scheme for constructing a discrete common randomness variable $W$ from which the $n$ random variables can be simulated in a distributed manner. We then bound the entropy of $W$ using a new measure, which we refer to as the erosion entropy.

Submitted 20 November, 2016; v1 submitted 21 January, 2016; originally announced January 2016.

Comments: 21 pages, 6 figures, presented in part at IEEE International Symposium on Information Theory, Barcelona, July 2016

Journal ref: IEEE Transactions on Information Theory, vol. 63, no. 10, pp. 6329-6343, Oct. 2017

arXiv:1412.6741 [pdf, ps, other]

Locally Weighted Learning for Naive Bayes Classifier

Authors: Kim-Hung Li, Cheuk Ting Li

Abstract: As a consequence of the strong and usually violated conditional independence assumption (CIA) of naive Bayes (NB) classifier, the performance of NB becomes less and less favorable compared to sophisticated classifiers when the sample size increases. We learn from this phenomenon that when the size of the training data is large, we should either relax the assumption or apply NB to a "reduced" data set, say for example use NB as a local model. The latter approach trades the ignored information for the robustness to the model assumption. In this paper, we consider using NB as a model for locally weighted data. A special weighting function is designed so that if CIA holds for the unweighted data, it also holds for the weighted data. The new method is intuitive and capable of handling class imbalance. It is theoretically more sound than the locally weighted learners of naive Bayes that base classification only on the $k$ nearest neighbors. Empirical study shows that the new method with appropriate choice of parameter outperforms seven existing classifiers of similar nature.

Submitted 21 December, 2014; originally announced December 2014.

arXiv:1412.5374 [pdf, other]

Maximal Correlation Secrecy

Authors: Cheuk Ting Li, Abbas El Gamal

Abstract: This paper shows that the Hirschfeld-Gebelein-Rényi maximal correlation between the message and the ciphertext provides good secrecy guarantees for cryptosystems that use short keys. We first establish a bound on the eavesdropper's advantage in guessing functions of the message in terms of maximal correlation and the Rényi entropy of the message. This result implies that maximal correlation is stronger than the notion of entropic security introduced by Russell and Wang. We then show that a small maximal correlation $ρ$ can be achieved via a randomly generated cipher with key length $\approx2\log(1/ρ)$, independent of the message length, and by a stream cipher with key length $2\log(1/ρ)+\log n+2$ for a message of length $n$. We establish a converse showing that these ciphers are close to optimal. This is in contrast to entropic security for which there is a gap between the lower and upper bounds. Finally, we show that a small maximal correlation implies secrecy with respect to several mutual information based criteria but is not necessarily implied by them. Hence, maximal correlation is a stronger and more practically relevant measure of secrecy than mutual information.

Submitted 6 October, 2016; v1 submitted 17 December, 2014; originally announced December 2014.

Comments: 15 pages, 2 figures, presented in part at IEEE International Symposium on Information Theory 2015

Journal ref: IEEE Transactions on Information Theory, vol. 64, no. 5, pp. 3916-3926, May 2018

arXiv:1402.5326 [pdf, other]

doi 10.1109/TIT.2016.2529844

Channel Diversity needed for Vector Space Interference Alignment

Authors: Cheuk Ting Li, Ayfer Özgür

Abstract: We consider vector space interference alignment strategies over the $K$-user interference channel and derive an upper bound on the achievable degrees of freedom as a function of the channel diversity $L$, where the channel diversity is modeled by $L$ real-valued parallel channels with coefficients drawn from a non-degenerate joint distribution. The seminal work of Cadambe and Jafar shows that when $L$ is unbounded, vector space interference alignment can achieve $1/2$ degrees of freedom per user independent of the number of users $K$. However, wireless channels have limited diversity in practice, dictated by their coherence time and bandwidth, and an important question is the number of degrees of freedom achievable at finite $L$. When $K=3$ and if $L$ is finite, Bresler et al. show that the number of degrees of freedom achievable with vector space interference alignment is bounded away from $1/2$, and the gap decreases inversely proportional to $L$. In this paper, we show that when $K\geq4$, the gap is significantly larger. In particular, the gap to the optimal $1/2$ degrees of freedom per user can decrease at most like $1/\sqrt{L}$, and when $L$ is smaller than the order of $2^{(K-2)(K-3)}$, it decays at most like $1/\sqrt[4]{L}$.

Submitted 7 September, 2016; v1 submitted 21 February, 2014; originally announced February 2014.

Comments: 22 pages, 4 figures. Presented in part at the IEEE International Symposium on Information Theory, Honolulu, USA, June 2014. Published in IEEE Transactions on Information Theory (Volume: 62, Issue: 4, April 2016)

Journal ref: IEEE Transactions on Information Theory (Volume: 62, Issue: 4, April 2016)

arXiv:1402.0062 [pdf, ps, other]

Exact Common Information

Authors: Gowtham Ramani Kumar, Cheuk Ting Li, Abbas El Gamal

Abstract: This paper introduces the notion of exact common information, which is the minimum description length of the common randomness needed for the exact distributed generation of two correlated random variables $(X,Y)$. We introduce the quantity $G(X;Y)=\min_{X\to W \to Y} H(W)$ as a natural bound on the exact common information and study its properties and computation. We then introduce the exact common information rate, which is the minimum description rate of the common randomness for the exact generation of a 2-DMS $(X,Y)$. We give a multiletter characterization for it as the limit $\bar{G}(X;Y)=\lim_{n\to \infty}(1/n)G(X^n;Y^n)$. While in general $\bar{G}(X;Y)$ is greater than or equal to the Wyner common information, we show that they are equal for the Symmetric Binary Erasure Source. We do not know, however, if the exact common information rate has a single letter characterization in general.

Submitted 1 February, 2014; originally announced February 2014.

arXiv:1311.0100 [pdf, other]

doi 10.1109/TIT.2015.2428234

An Efficient Feedback Coding Scheme with Low Error Probability for Discrete Memoryless Channels

Authors: Cheuk Ting Li, Abbas El Gamal

Abstract: Existing fixed-length feedback communication schemes are either specialized to particular channels (Schalkwijk--Kailath, Horstein), or apply to general channels but either have high coding complexity (block feedback schemes) or are difficult to analyze (posterior matching). This paper introduces a new fixed-length feedback coding scheme which achieves the capacity for all discrete memoryless channels, has an error exponent that approaches the sphere packing bound as the rate approaches the capacity, and has $O(n\log n)$ coding complexity. These benefits are achieved by judiciously combining features from previous schemes with a new randomization technique and encoding/decoding rule. These new features make the analysis of the error probability for the new scheme easier than for posterior matching.

Submitted 7 September, 2016; v1 submitted 1 November, 2013; originally announced November 2013.

Comments: 16 pages, 7 figures. Presented in part at the IEEE International Symposium on Information Theory, Honolulu, USA, June 2014. Published in IEEE Transactions on Information Theory (Volume: 61, Issue: 6, June 2015)

Journal ref: IEEE Trans. Info. Theory, vol.61, no.6, pp.2953-2963, June 2015

arXiv:1210.3427 [pdf, other]

On Multi-rate Sequential Data Transmission

Authors: Cheuk Ting LI

Abstract: In this report, we investigate the data transmission model in which a sequence of data is broadcast to a number of receivers. The receivers, which have different channel capacities, wish to decode the data sequentially at different rates. Our results are applicable to a wide range of scenarios. For instance, they can be employed in the broadcast streaming of a video clip through the internet, so that receivers with different bandwidths can play the video at different speeds. Receivers with greater bandwidths can provide a smooth playback, while receivers with smaller bandwidths can play the video at a slower speed, or with short pauses or rebuffering.

Submitted 12 October, 2012; originally announced October 2012.

Comments: Final year project report for the degree of bachelor of Information Engineering, The Chinese University of Hong Kong

arXiv:chem-ph/9603006 [pdf, ps, other]

doi 10.1103/PhysRevA.54.1820

From Heisenberg matrix mechanics to EBK quantization: theory and first applications

Authors: Wm. R. Greenberg, Abraham Klein, Ivalyo Zlatev, C. T. Li

Abstract: Despite the seminal connection between classical multiply-periodic motion and Heisenberg matrix mechanics and the massive amount of work done on the associated problem of semiclassical (EBK) quantization of bound states, we show that there are, nevertheless, a number of previously unexploited aspects of this relationship that bear on the quantum-classical correspondence. In particular, we emphasize a quantum variational principle that implies the classical variational principle for invariant tori. We also expose the more indirect connection between commutation relations and quantization of action variables. With the help of several standard models with one or two degrees of freedom, we then illustrate how the methods of Heisenberg matrix mechanics described in this paper may be used to obtain quantum solutions with a modest increase in effort compared to semiclassical calculations. We also describe and apply a method for obtaining leading quantum corrections to EBK results. Finally, we suggest several new or modified applications of EBK quantization.

Submitted 29 March, 1996; originally announced March 1996.

Comments: 37 pages including 3 postscript figures, submitted to Phys. Rev. A

Showing 1–42 of 42 results for author: LI, C T