Efficient Near-Optimal Codes for General Repeat Channels

Pernice, Francisco; Li, Ray; Wootters, Mary

arXiv:2201.12746 (cs)

[Submitted on 30 Jan 2022 (v1), last revised 4 Feb 2022 (this version, v2)]

Abstract: Given a probability distribution $\mathcal{D}$ over the non-negative integers, a $\mathcal{D}$-repeat channel acts on an input symbol by repeating it a number of times distributed as $\mathcal{D}$. For example, the binary deletion channel ($\mathcal{D}=\mathrm{Bernoulli}$) and the Poisson repeat channel ($\mathcal{D}=\mathrm{Poisson}$) are special cases. We say a $\mathcal{D}$-repeat channel is square-integrable if $\mathcal{D}$ has finite first and second moments. In this paper, we construct explicit codes for all square-integrable $\mathcal{D}$-repeat channels with rate arbitrarily close to capacity that are encodable in linear time and decodable in quasi-linear time. We also consider possible extensions to the repeat channel model, and illustrate how our construction can be extended to an even broader class of channels capturing insertions, deletions, and substitutions.
Our work offers an alternative, simplified, and more general construction to the recent work of Rubinstein (arXiv:2111.00261), who attains results similar to ours in the cases of the deletion channel and the Poisson repeat channel. It also slightly improves the runtime and decoding failure probability of the polar code constructions of Tal et al. (ISIT 2019) and of Pfister and Tal (arXiv:2102.02155) for the deletion channel and certain insertion/deletion/substitution channels. Our techniques closely follow the approaches of Guruswami and Li (IEEEToIT 2019) and Con and Shpilka (IEEEToIT 2020); what sets our work apart is that we show that a capacity-achieving code can be assumed to have an "approximate balance" in the frequency of zeros and ones of all sufficiently long substrings of all codewords. This allows us to attain near-capacity-achieving codes in a general setting. We consider this "approximate balance" result to be of independent interest, as it can be cast in much greater generality than repeat channels.
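
To make the channel model concrete, here is a minimal Python sketch of a $\mathcal{D}$-repeat channel as the abstract defines it: each input symbol is independently replaced by a number of copies drawn from $\mathcal{D}$. It simulates the channel only and is not the paper's code construction. The names `repeat_channel`, `bernoulli`, and `poisson` are hypothetical, and writing the deletion channel in terms of a keep probability is an assumption made for illustration.

```python
import math
import random
from typing import Callable, List

# A sampler draws one repeat count (a non-negative integer) from D.
Sampler = Callable[[random.Random], int]


def repeat_channel(bits: List[int], sample_count: Sampler,
                   rng: random.Random) -> List[int]:
    """Pass `bits` through a D-repeat channel: each symbol is replaced,
    independently, by `sample_count(rng)` copies of itself."""
    out: List[int] = []
    for b in bits:
        out.extend([b] * sample_count(rng))
    return out


def bernoulli(keep_prob: float) -> Sampler:
    """D = Bernoulli(keep_prob): the count is 1 with probability keep_prob
    and 0 otherwise, which gives the binary deletion channel
    (assumed parameterization: deletion probability 1 - keep_prob)."""
    def sample(rng: random.Random) -> int:
        return 1 if rng.random() < keep_prob else 0
    return sample


def poisson(lam: float) -> Sampler:
    """D = Poisson(lam), sampled with Knuth's multiplication method.
    This is adequate for small lam; it is slow for large lam."""
    threshold = math.exp(-lam)

    def sample(rng: random.Random) -> int:
        k, p = 0, rng.random()
        while p > threshold:
            k += 1
            p *= rng.random()
        return k
    return sample


if __name__ == "__main__":
    rng = random.Random(2022)  # fixed seed so runs are reproducible
    message = [0, 1, 1, 0, 1, 0, 0, 1]
    print("deletion channel:", repeat_channel(message, bernoulli(0.7), rng))
    print("Poisson repeat:  ", repeat_channel(message, poisson(1.0), rng))
```

In both cases the receiver sees only the concatenated output and not the individual repeat counts, so it cannot tell directly which input positions were deleted or duplicated.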

Comments:	The prior version incorrectly stated that the constructions of Tal, Pfister and Fazeli require $O(n^4)$ runtime complexity to achieve $e^{-\Theta(n)}$ decoding failure probability; the corrected result, now included, achieves a better tradeoff and is more subtle to state
Subjects:	Information Theory (cs.IT)
Cite as:	arXiv:2201.12746 [cs.IT]
	(or arXiv:2201.12746v2 [cs.IT] for this version)
	https://doi.org/10.48550/arXiv.2201.12746

Submission history

From: Francisco Pernice
[v1] Sun, 30 Jan 2022 07:47:08 UTC (25 KB)
[v2] Fri, 4 Feb 2022 23:12:14 UTC (25 KB)
