Competitive Analysis of Arbitrary Varying Channels
Abstract
Arbitrary varying channels (AVC) are used to model communication settings in which a channel state may vary arbitrarily over time. Their primary objective is to circumvent statistical assumptions on channel variation. Traditional studies on AVCs optimize rate subject to the worst-case state sequence. While this approach is resilient to channel variations, it may result in low rates for state sequences that are associated with relatively good channels. This paper addresses the analysis of AVCs through the lens of competitive analysis, where solution quality is measured with respect to the optimal solution had the state sequence been known in advance. Our main result demonstrates that codes constructed by a single input distribution do not achieve optimal competitive performance over AVCs. This stands in contrast to the single-letter capacity formulae for AVCs, and it indicates, in our setting, that even though the encoder cannot predict the subsequent channel states, it benefits from varying its input distribution as time proceeds.
I Introduction
The arbitrarily varying channel (AVC) model, introduced by Blackwell, Breiman, and Thomasian [1], captures communication over a collection of memoryless channels, , where in each time instance the channel state that determines the channel in use may be arbitrarily chosen. The AVC model is broad in nature, and in its full generality can capture the setting in which the state sequence may depend on the transmitted codeword or may have a cost/type constraint. As such, the AVC model captures both adversarial and random noise models. The majority of previous studies on AVCs address the case in which the state does not depend on the transmitted codeword, e.g., [2, 3, 4, 5]. Traditionally, success criteria for communication in the context of AVCs require the design of a single coding scheme that allows communication at a fixed rate no matter which state sequence is realized; this approach forces code rates to be matched to the worst-case channel conditions. The fixed-rate setting is complemented by variable-rate performance criteria that no longer guarantee the delivery of a fixed rate under all channel state sequences; instead, they allow the rate to vary with the channel states in operation. Rateless codes, introduced in [6, 7, 8], achieve variable-rate coding by allowing the effective blocklength to vary with the channel state.
In this work, we consider rateless codes for AVCs. In particular, the message length is fixed in advance and the communication length is a random variable that depends on channel outputs, i.e., the block length is determined by a stopping time based on the channel output filtration. Rateless codes for the AVC model have seen a number of studies over the last decade. The majority of studies involve feedback [9, 10, 11, 12, 13, 14, 15, 16, 17, 18] and are less relevant to the work at hand. In the context of rateless codes for AVCs without feedback, prior works include [19, 20, 21], which study coding solutions and effective rate in the setting in which the decoder has full or partial access to state information. Beyond the assumptions on decoder state information (which is also central to our study), the major difference between the works above and the work at hand lies in the quality measures of the solutions suggested: the former, for a fixed input distribution, seek decoding rules that minimize the expected decoding time given the state sequence at hand, while the work at hand seeks coding solutions with a competitive quality guarantee.
Namely, in this work, we study rateless coding technologies for communication over AVCs through the lens of competitive analysis. In competitive analysis, one compares the achievable rates of solutions in which the state sequence is not available to the encoder and decoder with those in which state information is known in advance to all parties. The objective is to design communication schemes that achieve rates close to those achievable when the state sequence is known in advance. Common metrics to compare these two rates include the competitive ratio, which measures the ratio between the (expected) rates achievable in the case of limited state knowledge and that with full state knowledge, and the regret, which measures the difference between the former and latter rates described above. The design of communication schemes with a competitive ratio approaching 1 (or with regret approaching 0) guarantees that even in the face of uncertainty, the quality of communication matches the best possible under the given conditions. A competitive ratio that is bounded away from acts as a quality measure for the communication scheme at hand, guaranteeing that no matter what state sequence is realized, the achievable rate is guaranteed to be within an multiplicative ratio of the best possible.
II Model and problem definition
Let , , and denote finite alphabets of the channel input, output and state, respectively. Consider a message uniformly distributed over and communication over channels taken from a family of discrete memoryless channels . The channel at time is determined by the state as . The state sequence does not depend on the message and should be viewed as a deterministic sequence that is chosen arbitrarily.
The message length is not parameterized with a blocklength or rate. Instead, we consider rateless communication where the fixed-size message should be decoded with the least number of channel uses. That is, the decoder observes the stream of channel outputs and decides at each time if it wants to proceed with communication or to abort it and to decode the message. We proceed to formally define rateless codes.
Rateless codes over AVCs: For a fixed message length , a rateless code is defined by three deterministic mappings. The encoder is defined by the mapping . The second mapping is a sequence of decoder-decision functions for ; if the decoder decodes at time , it sets and then the message is decoded with the last mapping . In all time instances prior to the decoding, the decoder simply sets .
The stopping time is defined as . For a given code and a state sequence , the average probability of error is , and denotes the maximal error among all state sequences. The expected decoding time for is (here, expectation is taken over the channel and message). For rateless codes, the effective rate is typically measured as , e.g., [22].
Competitive analysis: To define our competitive metrics, as our baseline, we consider codes where the encoder and the decoder mappings have access to the state sequence . Specifically, it is a rateless code as above, but the mappings depend on the state sequence. We denote such codes with .
The main idea is to compare the stopping time of a code with the stopping time of an optimal, clairvoyant, scheme specifically designed for . Setting , the competitive ratio is defined as
(1) |
The competitive ratio guarantees the largest multiplicative measure of quality with respect to the optimal code. That is, for any , whether this sequence implies low or high , the resulting stopping time of the oblivious code (which is not designed with knowledge of ) should be as close as possible to the optimal stopping time.
For some applications, e.g., delay-sensitive systems [23, 24, 25, 26, 27, 28, 29], one may be interested in additive bounds on the normalized delay or the (effective) rate when compared to an optimal code given state knowledge, defined here as the regret:
(2) |
A family of channels is said to be -competitive if . Similarly for the regret objective, allows a regret of if .
In the asymptotic setting, which is the focus of this work, a family of channels is said to be -competitive if there exists a sequence such that The supremum over all such is defined as the optimal competitive ratio (also referred to as the competitive AVC capacity) and is denoted by . Namely,
(3) |
Similarly, a regret of is achievable if there exists such that , and the infimum over such is denoted by . In the remainder of this work, we focus on the quality measure; however, the results presented here apply also to .
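The two metrics of this section can be illustrated with a small numerical sketch. It follows only the verbal description given in the Introduction (a ratio and a difference of rates, taken in the worst case over state sequences); the function names, the dictionary layout, the sign convention for the regret, and the rate values are all assumptions made for illustration and are not taken from the formal definitions.

```python
def competitive_ratio(rate_pairs):
    """Worst-case ratio of the oblivious rate to the clairvoyant rate.

    rate_pairs maps a label of a state sequence to a pair
    (oblivious_rate, clairvoyant_rate). Both the layout and the
    labels are hypothetical.
    """
    return min(obl / clair for obl, clair in rate_pairs.values())


def regret(rate_pairs):
    """Worst-case rate loss, assuming regret = clairvoyant minus oblivious."""
    return max(clair - obl for obl, clair in rate_pairs.values())


# Hypothetical rates (bits per channel use) for two state sequences.
TOY_RATES = {
    "mostly good states": (0.75, 1.0),
    "mostly bad states": (0.25, 0.5),
}

print(competitive_ratio(TOY_RATES))  # 0.5
print(regret(TOY_RATES))             # 0.25
```

In this toy instance the second sequence determines the competitive ratio, while both sequences give the same regret, which shows that the two metrics can rank state sequences differently.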
III Main question
The main objective towards the design of practical codes that achieve the optimal competitive ratio is a single-letter characterization of . Single-letter characterizations provide efficient means to determine fundamental quality limits and, more importantly, they often provide a simple structure for optimal code design. In a previous work of the authors, [30], a single-letter characterization for the competitive analysis of the compound channel (in which the state sequence is unknown but does not change over time) was presented. In this work, in which we address AVCs, we do not derive a single-letter characterization for ; however, we make significant steps towards understanding this ultimate goal.
In this work, we seek to understand the structure of optimal codes for competitive metrics. Specifically, we ask if, similar to classical results in the context of communication, coding schemes designed and governed by a single optimizing distribution are optimal in the competitive setting as well. We note that, in the traditional study of AVCs, the capacity (up to symmetrizability) is characterized as , e.g., [31], by a single input distribution. Here are distributed according to . The same holds for the rateless study of such AVCs when one wishes to minimize the worst-case decoding time, with [19] and without [32] decoder state information.
For the compound setting, [30] show that single-distribution codes are suboptimal in the context of competitive analysis. Namely, for competitive quality measures in the compound setting, the encoder, even without additional feedback knowledge, may be required to modify encoding statistics as time goes by. This follows, roughly stated, since a competitive encoder at any given time must optimize performance over a monotonically shrinking set of channel states. Initially, the encoder must act with all potential states in mind. However, as time goes by, certain channel statistics, if realized, would have already allowed successful decoding, and thus no longer need to be considered in the encoder's optimization. In this work, we ask whether this sub-optimality applies to the competitive analysis of the time-varying AVC model as well. In AVCs, the encoder can no longer rule out future channel states based on prior assumptions. Why then should an AVC encoder change its encoding statistics as time goes by? Rather surprisingly, we show that, even without additional feedback knowledge, the optimal AVC encoder may be required to modify its statistics as time goes by. In what follows, we introduce additional notation that allows us to formally state and then prove our main result.
We address the number of input distributions needed in code design to achieve an optimal competitive ratio. Formally, a uniform message and an encoder mapping induce a distribution on the infinite ensemble of channel inputs . Computing the marginal distribution at each time step, we obtain a product distribution . The product distribution does not characterize the code at hand (as different inputs may depend on each other), but it will suffice for our purposes. We ask how many different distributions in are needed to achieve an optimal competitive ratio.
We define product distributions that alternate at points as follows:
Definition 1
The set includes all product distributions with where and are non-negative integers.
For example, if , the channel input has the same marginal distribution on for all . Complementing Definition 1, the competitive ratio is defined similarly to but when the codes in use are restricted to have product distributions close to (slackness is added to allow the slight variability implied by random code design). Specifically, for and a given , a codebook is said to be -close to if the marginal distributions corresponding to the 'th codeword entry satisfy for every integer that . Here, is the total variation distance between distributions. A codebook is said to be -close to if it is -close to some . Let be defined similarly to when restricted to codes with product distributions -close to for . We give a formal definition of in Appendix A. We now have that
(4) |
As stated earlier, traditional results on communication over memoryless channels and AVCs show that there exist optimal encoders (with corresponding product distributions) that lie in . For the compound channel, [30] show, on the one hand, that codes in are sub-optimal for competitive measures, but, on the other, that codes in are sufficient for competitive optimality. The latter, for finite , enables a single-letter characterization of in the compound setting (as an optimization over input distributions).
The work at hand asks whether suffices to achieve optimal competitive measures on AVCs. We show through an example that single-distribution codes do not suffice to achieve the competitive capacity for AVCs. That is, . Even given the worst-case flavor of the AVC model, namely, given the fact that at every time step any channel can come next, the fact that for certain channel sequences, decoding can be done before others, leads to the realization that the encoder may benefit from different behaviors as time passes.
IV Main Result
In this section, we present our main result and the main steps to establish it.
Theorem 1
Single input distributions do not achieve the optimal competitive ratio in AVCs. That is, .
The assertion implies that there exist AVCs for which any single-letter characterization of the optimal competitive ratio should include at least two channel input distributions. Whether there is an upper bound to the number of input distributions sufficient for optimal competitive codes, i.e., whether there exists such that , is left open in this work.
To prove Theorem 1, we show for a particular channel family the following chain of inequalities.
(5) |
To prove Theorem 1, it is sufficient to prove steps in (5). These steps are proved in Section V. Furthermore, we prove step since this bound is quite close to the lower bound of step , and its proof method may be of independent interest.
The channel family we study to establish (5) is depicted in Fig. 1, and is discussed in detail in Section V. In the particular example we study, the channel state can be determined by the channel output. Thus, in our achievability schemes for Theorem 1, we consider rateless codes for the AVC setting in which the decoder has full state information (DSI); that is, the decoder mappings depend causally on the channel outputs and states. Consequently, we employ Theorem 2 stated below in the proof of Theorem 1. While the proof of Theorem 2 may follow from various modifications in prior works that study rateless AVC achievability in the presence of DSI [19, 20, 21], as our model is slightly different, a self-contained proof of Theorem 2 appears in Appendix B.
Theorem 2
Let . Let be sufficiently large. Let . Let . Let be a channel family for which , where for state and index , is distributed according to . For let
Then, there exists a code that is -close to with decoding error probability and competitive ratio at least
where with .
The intuition behind Theorem 2 is that a decoder with state information can successfully decode the message once the accumulated mutual information between the channel input and output surpasses the message entropy. The optimal stop** time, , follows in a similar way when the encoder can use the distribution achieving capacity for each state .
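The accumulation rule behind this intuition can be sketched in a few lines. The sketch below is an illustration under stated assumptions and not the construction of Theorem 2: the function name `stopping_time`, the per-state information tables, the message length, and the state sequence are all hypothetical, and sub-linear correction terms are ignored.

```python
def stopping_time(states, info_per_state, message_bits):
    """First time at which the accumulated information reaches message_bits.

    states: sequence of channel states, one per channel use.
    info_per_state: dict mapping a state to the bits gained per channel use
        (mutual information for an oblivious code, capacity for the
        clairvoyant benchmark). The values used below are hypothetical.
    Returns None if the threshold is not reached within the given sequence.
    """
    total = 0.0
    for n, state in enumerate(states, start=1):
        total += info_per_state[state]
        if total >= message_bits:
            return n
    return None


# Hypothetical two-state family: per-use capacities, and the mutual
# information obtained with one fixed input distribution.
CAPACITY = {1: 1.0, 2: 0.5}
FIXED_INPUT_INFO = {1: 0.5, 2: 0.25}

STATE_SEQUENCE = [1, 1, 2, 2] * 4   # an arbitrary illustrative state sequence
MESSAGE_BITS = 4.0

optimal = stopping_time(STATE_SEQUENCE, CAPACITY, MESSAGE_BITS)
oblivious = stopping_time(STATE_SEQUENCE, FIXED_INPUT_INFO, MESSAGE_BITS)
print(optimal, oblivious, optimal / oblivious)  # 5 10 0.5
```

For this particular sequence the ratio of the two stopping times is 0.5; the competitive ratio of a code is obtained by taking the worst such ratio over all state sequences.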
V Proof of Theorem 1
In this section, we present a family of channels such that the best competitive ratio with a single input distribution is , where one can obtain if two input distributions are allowed. This establishes the proof of Theorem 1. The proof of step in (5), , appears in the appendix.
The channel family consists of two channels and is depicted in Figure 1. Let and . In the first channel, , if the output is , and if the output has a uniform distribution on .
In the second channel, , if the output is with probability and with probability . For , channel sets to be uniform in with probability and with probability . The channel is represented as the concatenation of two channels, a first channel followed by an erasure channel (EC). We will use the fact that the channel capacities are and , respectively.
The example is a modified version of the "bilingual speaker" example from [33, 34], and was also studied in our prior work on the competitive analysis of compound channels [30]. Note that the encoder is uncertain about the state sequence, but the decoder can deduce the channel state from the channel output since if and only if the first channel is used. Thus, we can utilize Theorem 2 that addresses code analysis in the setting of DSI.
For code design, by symmetry, it is sufficient to consider for input distributions that satisfy
(6) |
The achievable rates for and for single-distribution codes governed by such equal and as depicted in Figure 1.
By Theorem 2, there exists a communication scheme that allows successful decoding once the cumulative mutual information exceeds the message length. Roughly speaking, decoding is successful after
(7) |
where corresponds to the distribution of the input at time and can be parameterized by . We will also utilize the optimal decoding time defined in Theorem 2.
V-A Proof of :
A lower bound is achieved by setting the channel input distribution in (V) with . This implies that the mutual information under both channels is and therefore for all . The competitive ratio is thus bounded, for any , by where the equality follows by setting the prefix of to since it minimizes the optimal decoding time.
To prove the upper bound on , it suffices to consider the inequality
(8) |
for of our choice. Bounding by the left expression in (8) roughly follows from standard capacity bounds, e.g., [30, Section VI]. For the right expression, we analyze the setting .
For the first sequence , the optimal decoding time is . For this sequence, utilizing a code with a single distribution , we have (We can assume since otherwise the decoding time for is infinite). For the second sequence, , we have and, for , . Combining the two ratios that correspond to the different sequences and taking limits over we obtain
(9) |
V-B Proof of :
To prove the assertion, we present a family of codes with two input distributions and analyze their performance. We consider codes that utilize until time and afterwards. Without loss of optimality, we consider since the minimal and maximal optimal stopping times are within this range. The optimal parameters will be shown to be and .
For any state sequence , let its optimal decoding time be where and depends on the underlying state sequence. (With some abuse of notation, throughout the analysis, we omit sub-linear terms of that do not affect the competitive ratio analysis.) Let be the fraction of in , and be the fraction of 2's in . By the accumulation of mutual information, it holds that where are the capacities of the channels, respectively; or in other words . Let be a constant that is independent of and satisfies for all . Our objective is to find the least that satisfies the inequality for all since it directly implies .
We proceed to the analysis, broken into sub-cases based on the value of .
Case 1: . The range of implies the ordering . (The case is not possible; this can only occur if , but this contradicts our assumption that combined with our upper bound .) That is, the point at which the input distribution alternates lies between the stopping time and the optimal stopping time.
Let be the prefix of up to length . By our assumption that , we can consider instead of since this sequence has the largest stopping time. We have by the accumulation of the mutual information in (7)
which implies . This inequality holds for all state sequences such that their corresponding satisfies , so we proceed to maximize the upper bound by maximizing it over .
The bound is increasing (in ) if and is decreasing otherwise. That means that if the condition is satisfied, the bound is maximized at resulting in . If the condition is not satisfied, the upper bound is maximized at resulting in . It is easy to verify that the condition is not satisfied for , and so we will use the second bound.
Case 2: . The range of implies that the number of 2's can be contained within . That is, . This now implies that the worst sequence (for this case) is the one starting with 's, which in turn implies that:
(10) |
which implies .
The bound is increasing if implying that and , and if the condition is not satisfied, is the maximizer and we have .
Case 3: . In this range, we have so that, in the worst state-sequence for this case, the number of 2's occupies :
(11) |
This implies and is minimized at so that .
Summary: We need to combine the different cases and choose the largest upper bound among these. If the condition of Case is satisfied, we obtain
(12) |
The second term is dominated by the maximum between the first and third terms for all in the range. By comparing the first and third terms we obtain giving
The upper bound is minimized at , implying in turn that . To conclude, the optimized bound is and proves . Note that if the condition of Case is not satisfied, we necessarily have a greater bound since we have the same optimization as before but the minimum over is constrained.
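The effect of switching the input distribution once can be explored numerically on a toy instance. The sketch below is not the channel family of Figure 1: the per-state information tables, the message length, the horizon, and all names are hypothetical values chosen so that the arithmetic is exact, and decoding is modeled by the accumulation rule of (7) with sub-linear terms ignored.

```python
from itertools import product


def decode_time(states, info_before, info_after, switch_time, message_bits):
    """Stopping time of a code that uses one input distribution up to and
    including switch_time, and another one afterwards."""
    total = 0.0
    for n, state in enumerate(states, start=1):
        table = info_before if n <= switch_time else info_after
        total += table[state]
        if total >= message_bits:
            return n
    raise ValueError("horizon too short for decoding")


def worst_case_ratio(info_before, info_after, switch_time,
                     capacity, message_bits, horizon):
    """Minimum over all state sequences of optimal time / actual time."""
    ratios = []
    for states in product(capacity.keys(), repeat=horizon):
        optimal = decode_time(states, capacity, capacity, 0, message_bits)
        actual = decode_time(states, info_before, info_after,
                             switch_time, message_bits)
        ratios.append(optimal / actual)
    return min(ratios)


# Hypothetical per-use information (bits); not derived from Figure 1.
CAPACITY = {1: 1.0, 2: 0.5}
INFO_A = {1: 1.0, 2: 0.25}   # input distribution tuned to state 1
INFO_B = {1: 0.5, 2: 0.5}    # input distribution tuned to state 2
K, HORIZON = 2.0, 8

print(worst_case_ratio(INFO_A, INFO_A, 0, CAPACITY, K, HORIZON))  # 0.5
print(worst_case_ratio(INFO_B, INFO_B, 0, CAPACITY, K, HORIZON))  # 0.5
print(worst_case_ratio(INFO_A, INFO_B, 2, CAPACITY, K, HORIZON))  # 0.6
```

In this toy instance, either fixed distribution has a worst-case ratio of 0.5, while using the first distribution for two channel uses and the second afterwards raises it to 0.6. The numbers carry no meaning for the channel family studied above; they only illustrate why a time-varying input distribution can help even though future states are unconstrained.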
VI Conclusions
In this work, we study the competitive analysis of AVCs in the rateless setting. Unlike traditional solutions for AVCs, we find that codes using a single input distribution fall short of achieving optimal competitive performance. This emphasizes the necessity of encoding technologies that adapt the input distribution over time, even in the absence of feedback or any knowledge about subsequent channel states. A single-letter expression for the optimal competitive ratio is left open in this work. In particular, an upper bound to the number of input distributions sufficient for optimal competitive codes is subject to future studies.
References
- [1] D. Blackwell, L. Breiman, and A. J. Thomasian. The capacities of certain channel classes under random coding. The Annals of Mathematical Statistics, pages 558–567, 1960.
- [2] R. Ahlswede. Elimination of correlation in random codes for arbitrarily varying channels. Z. Wahrsch. Verw. Gebiete, 33:159–175, 1978.
- [3] I. Csiszár and P. Narayan. Arbitrarily varying channels with constrained inputs and states. IEEE Transactions on Information Theory, 34(1):27–34, 1988.
- [4] I. Csiszár and P. Narayan. Capacity of the Gaussian arbitrarily varying channel. IEEE Transactions on Information Theory, 37(1):18–26, January 1991.
- [5] I. Csiszár and P. Narayan. The capacity of the arbitrarily varying channel revisited: Positivity, constraints. IEEE Transactions on Information Theory, 34(2):181–193, 1988.
- [6] M. Luby. LT codes. In Proceedings of the 43rd Annual IEEE Symposium on Foundations of Computer Science, pages 271–280, 2002.
- [7] D. J. C. MacKay. Fountain codes. IEE Proceedings-Communications, 152(6):1062–1068, 2005.
- [8] A. Shokrollahi. Raptor codes. IEEE Transactions on Information Theory, 52(6):2551–2567, 2006.
- [9] O. Shayevitz and M. Feder. Communicating using feedback over a binary channel with arbitrary noise sequence. In IEEE International Symposium on Information Theory (ISIT), pages 1516–1520, 2005.
- [10] K. Eswaran, A. D. Sarwate, A. Sahai, and M. Gastpar. Using zero-rate feedback on binary additive channels with individual noise sequences. In IEEE International Symposium on Information Theory (ISIT), pages 1431–1435, 2007.
- [11] K. Eswaran, A. D. Sarwate, A. Sahai, and M. C. Gastpar. Zero-rate feedback can achieve the empirical capacity. IEEE Transactions on Information Theory, 56(1):25–39, 2009.
- [12] O. Shayevitz and M. Feder. Achieving the empirical capacity using feedback: Memoryless additive models. IEEE Transactions on Information Theory, 55(3):1269–1295, 2009.
- [13] K. Woyach, K. Harrison, G. Ranade, and A. Sahai. Comments on unknown channels. In Information Theory Workshop, pages 172–176. IEEE, 2012.
- [14] Y. Lomnitz and M. Feder. Communication over individual channels. IEEE Transactions on Information Theory, 57(11):7333–7358, 2011.
- [15] Y. Lomnitz and M. Feder. Communication over individual channels - a general framework. arXiv preprint, arXiv:1203.1406, 2012.
- [16] N. Blits. Rateless codes for finite message set. M.Sc. dissertation, Tel-Aviv University, 2012.
- [17] Y. Lomnitz and M. Feder. Universal communication over arbitrarily varying channels. IEEE Transactions on Information Theory, 59(6):3720–3752, 2013.
- [18] P. Joshi, A. Purkayastha, Y. Zhang, A. J. Budkuley, and S. Jaggi. On the Capacity of Additive AVCs with Feedback. In IEEE International Symposium on Information Theory (ISIT), pages 504–509, 2022.
- [19] S. C. Draper, F. R. Kschischang, and B. Frey. Rateless coding for arbitrary channel mixtures with decoder channel state information. IEEE Transactions on Information Theory, 55(9):4119–4133, 2009.
- [20] A. D. Sarwate. Robust and adaptive communication under uncertain interference. PhD thesis, University of California, Berkeley, 2008.
- [21] A. D. Sarwate and M. Gastpar. Rateless codes for AVC models. IEEE Transactions on Information Theory, 56(7):3105–3114, 2010.
- [22] M. V. Burnashev. Data transmission over a discrete channel with feedback. Random transmission time. Probl. Inf. Transm., 12(4):250–265, 1976.
- [23] Y. Hu, Y. Zhu, M. C. Gursoy, and A. Schmeink. SWIPT-enabled relaying in IoT networks operating with finite blocklength codes. IEEE Journal on Selected Areas in Communications, 37(1):74–88, 2018.
- [24] Y. Hu, Y. Li, M. Gursoy, S. Velipasalar, and A. Schmeink. Throughput analysis of low-latency IoT systems with QoS constraints and finite blocklength codes. IEEE Transactions on Vehicular Technology, 69(3):3093–3104, 2020.
- [25] F. Ghanami, G. A. Hodtani, B. Vucetic, and M. Shirvanimoghaddam. Performance analysis and optimization of NOMA with HARQ for short packet communications in massive IoT. IEEE Internet of Things Journal, 8(6):4736–4748, 2020.
- [26] N. Agrawal, A. Bansal, K. Singh, C.P. Li, and S. Mumtaz. Finite block length analysis of RIS-assisted UAV-based multiuser IoT communication system with non-linear EH. IEEE Transactions on Communications, 70(5):3542–3557, 2022.
- [27] P. Popovski, J. J. Nielsen, C. Stefanovic, E. De Carvalho, E. Strom, K. F. Trillingsgaard, A. S. Bana, D. M. Kim, R. Kotaba, J. Park, and R. B. Sorensen. Wireless access for ultra-reliable low-latency communication: Principles and building blocks. IEEE Network, 32(2):16–23, 2018.
- [28] C. She, C. Pan, T. Q. Duong, T. Q. S. Quek, R. Schober, M. Simsek, and P. Zhu. Guest Editorial xURLLC in 6G: Next Generation Ultra-Reliable and Low-Latency Communications. IEEE Journal on Selected Areas in Communications, 41(7):1963–1968, 2023.
- [29] N. H. Mahmood, I. Atzeni, E. A. Jorswieck, and O. L. A. López. Ultra-Reliable Low-Latency Communications: Foundations, Enablers, System Design, and Evolution Towards 6G. Foundations and Trends® in Communications and Information Theory, 20(5-6):512–747, 2023.
- [30] M. Langberg and O. Sabag. Competitive channel-capacity. IEEE Transactions on Information Theory, 2024.
- [31] A. Lapidoth and P. Narayan. Reliable communication under channel uncertainty. IEEE Transactions on Information Theory, 44(10):2148–2177, 1998.
- [32] O. Kosut and J. Kliewer. Finite blocklength and dispersion bounds for the arbitrarily-varying channel. In IEEE International Symposium on Information Theory (ISIT), pages 2007–2011, 2018.
- [33] N. Shulman and M. Feder. Static broadcasting. In IEEE International Symposium on Information Theory, page 23, 2000.
- [34] N. Shulman. Communication over an unknown channel via common broadcasting. Ph.D. dissertation, Tel Aviv University, 2003.
- [35] C. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27(3):379–423, 623–656, July 1948.
- [36] I. Csiszár and J. Korner. Information Theory: Coding Theorems for Discrete Memoryless Systems, 2nd edition. Akademiai Kiado, New York, NY, 1997.
- [37] S. Z. Stambler. Shannon theorems for a full class of channels with state known at the output. Problems of Information Transmission, 11(4):3–12, (In Russian). 1975.
- [38] R. W. Yeung. Information theory and network coding. Springer Science & Business Media, 2008.
Appendix A Definition of
Let . Let be a positive integer. Let be the set of all rateless codes with message size that correspond to product distributions that are close to . Let,
(13) |
Let,
(14) |
Finally, let
(15) |
Appendix B Achievability (Proof of Theorem 2)
Let . Let be sufficiently large. Consider any in . Here, may depend on . In what follows, we show that a rateless code designed at random, i.e., in which the 'th codeword entries are drawn independently from the 'th entry in , has (with high probability) competitive ratio at least
B-A Overview
Before diving into the technical proof, we give a rough overview. In general, in the setting of DSI for finite , traditional analysis of random code design for memoryless channels [35] shows for any given fixed that with overwhelming probability over code construction, the resulting code guarantees a vanishing (in ) decoding error once the decoder waits for an appropriate amount of time. The main question at hand is the decay of as grows. For several applications, a polynomial or exponential decay of in is sufficient. However, in our case the code should have a vanishing decoding error, no matter which is in use. As the number of possible states is exponential, we will require to have double-exponential dependence on . Analysis using double-exponential (or super-exponential) concentration on code design is rather common in the AVC literature. Once established in our setting, combining a few additional ideas with a union bound over all possible suffices to conclude that a single code achieves the expression for all state sequences. We proceed to establish the stated concentration on .
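The union-bound step mentioned in this overview can be written out as a short calculation. The symbols below are assumptions introduced only for this illustration and are not the notation of the proof: $n$ is the relevant time horizon, $\mathcal{S}$ the state alphabet, $\delta_n$ a bound on the probability (over code design) that the random code fails on one fixed state sequence, and $c > 0$ a hypothetical constant.

```latex
\Pr\bigl[\exists\, \mathbf{s} \in \mathcal{S}^{n} : \text{the code fails on } \mathbf{s}\bigr]
\;\le\; \sum_{\mathbf{s} \in \mathcal{S}^{n}} \Pr\bigl[\text{the code fails on } \mathbf{s}\bigr]
\;\le\; |\mathcal{S}|^{n}\, \delta_n ,
\qquad
\delta_n \le 2^{-2^{cn}}
\;\Longrightarrow\;
|\mathcal{S}|^{n}\, \delta_n \;\le\; 2^{\,n \log_2 |\mathcal{S}| \,-\, 2^{cn}}
\xrightarrow[n \to \infty]{} 0 .
```

Under these assumptions, the double-exponential term dominates the exponential number of state sequences, so a single code drawn at random is good for all state sequences simultaneously with probability approaching 1.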
Let be a fixed state vector. We divide into consecutive chunks according to the chunks of such that where for the length of is . Here are defined by , and will be defined shortly. Let be the corresponding chunk decomposition of in which, for , , and . Let
where is distributed according to the 'th entry of , and is the outcome of where is the 'th entry of . Assume that and set . Namely, . We remove this assumption later at the end of the proof. We employ the method of types in our decoding according to the definitions in [36]. For , let be the type of . In what follows we assume a function such that, for , we have and for any . This assumption is also removed at the end of the proof.
To simplify our presentation, for and , we further partition each chunk to sub-chunks corresponding to entries in which equals . We assume without loss of generality that, for each , the sub-chunk consists of consecutive entries of . This follows from the fact that the decoder, knowing , can reorder the received information accordingly and from the fact that the random encoding rule is consistent over chunks. Thus, we have, for and , that .
Let be the vector of random variables corresponding to in which, for any positive integer , is distributed according to the โth entry in . We divide into chunks as well corresponding to the decomposition of and . Namely, let where, for , the length of is . Let be a random code as discussed above. Namely, for each message , the codeword corresponding to is independently distributed according to and will be denoted by , where, for , the length of is and each entry in is distributed according to . Moreover, for , we divide into sub-chunks according to those of . Namely, where, for , the entries of correspond to those in . Finally, let be the channel output at the receiver, and, for , let be the chunk and sub-chunk decomposition respectively, both according to the chunk and sub-chunks of and . It now holds, for , , and that each entry of is independently distributed according to and that is distributed according to . Here, is the length of the sub-chunk . Note that, by our assumption on the type of and on the chunk size , it holds that . Namely, for the sub-chunk corresponding to our analysis reduces to the performance of the fixed memoryless channel of blocklength in the presence of a code generated randomly according to the fixed distribution . Let and for distributed according to . Recall from the theorem statement that . Notice, by our definition of , and , that . We are now ready to define our encoder and decoder.
Encoding: Let with encoder be the randomly constructed code as defined above. The encoder picks a uniform and transmits . We show below, for sufficiently large , that with high probability the resulting code has marginals that are -close to . Let be the 'th entry of (we use a superscript here to distinguish it from , which governs the 'th chunk of ), and let be the marginal distribution of the 'th entry of the code . Using Sanov's theorem and Pinsker's inequality, for any entry in the codebook it holds with probability at most that . Thus, using standard concentration bounds and the independence between entries of , it is not hard to verify that . In both bounds above, we use sufficiently large. This implies that, for sufficiently large , with probability at least over code-design, the resulting code has corresponding marginal distributions that are -close to .
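As a numerical sanity check of the inequality invoked above, the following minimal sketch evaluates Pinsker's inequality, which bounds the total variation distance between two distributions by the square root of half their Kullback-Leibler divergence (in nats). The function names and the example distributions are illustrative assumptions and are not taken from the paper.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in nats.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def total_variation(p, q):
    """Total variation distance, i.e. half the L1 distance."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def pinsker_holds(p, q, tol=1e-12):
    """Check TV(p, q) <= sqrt(D(p || q) / 2), up to a numerical tolerance."""
    return total_variation(p, q) <= math.sqrt(kl_divergence(p, q) / 2) + tol


# Hypothetical empirical marginal versus a target input distribution.
empirical = [0.5, 0.5]
target = [0.9, 0.1]
check = pinsker_holds(empirical, target)
```

In a concentration argument of this kind, the inequality is what converts a divergence bound on the empirical type into a bound on its distance from the target distribution.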
Decoding: Let be the received transmission and the state sequence (that is available to the decoder). Decoding follows standard typicality decoding outlined in the context of AVCs with DSI in, e.g., [37] and Exercise 12.16(b) of [36]. Let be a function of to be defined later. Let . The decoder iterates over between and , and decodes to the first such that for all and the joint type of the pair is of distance at most from the distribution over . If no message passes the test above, the decoder decodes arbitrarily to .
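The typicality test applied on each sub-chunk can be sketched as follows. This is a minimal illustration under stated assumptions: the distance between the joint type and the target distribution is taken to be the L1 distance (the choice of metric is an assumption here), and the names `joint_type`, `l1_distance`, and `passes_typicality` are hypothetical.

```python
from collections import Counter


def joint_type(xs, ys):
    """Empirical joint distribution (joint type) of the pair (xs, ys)."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    counts = Counter(zip(xs, ys))
    return {pair: count / n for pair, count in counts.items()}


def l1_distance(p, q):
    """L1 distance between two distributions given as dicts."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def passes_typicality(xs, ys, target, eps):
    """True if the joint type of (xs, ys) is within eps of target (L1)."""
    return l1_distance(joint_type(xs, ys), target) <= eps


# Toy target: uniform input over a noiseless binary channel (assumed).
target = {(0, 0): 0.5, (1, 1): 0.5}
accepted = passes_typicality([0, 0, 1, 1], [0, 0, 1, 1], target, eps=0.1)
rejected = passes_typicality([0, 0, 1, 1], [1, 1, 0, 0], target, eps=0.1)
```

A decoder in this style would run the test for each candidate codeword on every sub-chunk and accept the first candidate that passes all of them.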
We now prove the following concentration on for decoding with average error .
Claim 1
Let . Let and . With probability over code design, it holds that
Proof:
For , let be the indicator of the error event that either such that or such that . Let be any fixed values for the codewords . We first analyze the expected value of conditioned on .
Standard analysis, appearing for example in Chapter 7 of [38], implies the existence of that tends to zero when tends to zero, such that for and any ,
and for any and any ,
Here are distributed according to . As the events in every sub-chunk are independent, we have for any that,
Thus, by the union bound (over and more),
As, it holds that . We now have that
for and thus sufficiently small such that and sufficiently large.
Let . To conclude the claim assertion, using the Chernoff-type Lemma A.1 of [5] and noting that depends only on , we have
∎
Consider any . Let . Thus, using a union bound over all of length at most , Claim 1 implies the existence of a code that is -close to and a DSI-decoder that for any decodes at time with average error at most . We conclude that the competitive ratio obtained by the suggested scheme is
(16)
To conclude our proof, we revisit the functions , and alongside our assumptions. Throughout, we fix to be a sufficiently small constant and asymptotically large. We chose as . We chose as to be a sufficiently small function of to satisfy the requirement stated above. If we remove the assumptions that for all , and for all , then the decoder can neglect the entries corresponding to bad pairs that violate (one of) the assumptions. The cumulative mutual information (summed over bad pairs) lost at the decoder is bounded by , which for sufficiently small and suitable leaves the remaining cumulative mutual information (summed over good pairs) at the decoder to be at least , which replaces in the proof presented above. If the assumption that does not hold, and instead it holds that , then one need only consider the prefix of consisting of the first chunks and replace in the analysis by .
Appendix C Proof of
In this section, we prove that for the family of channels in Fig. 1 we have . The main idea is to identify from the code analysis of our lower bound a collection of state sequences that constrain the optimization defining the competitive ratio . Namely,
(17)
Bounding by the top expression in (C) follows from standard capacity bounds, e.g., [30, Section VI]. The upper bound in the bottom inequality of (C) holds for any subset of state sequences. Here, we consider consisting of two sets. The first is corresponding to the two sequences whose prefix of length is , followed by a constant-state sequence with . The second set is defined as where is any sequence of length whose number of states of type and satisfy a ratio of , e.g., . Simply put, we fix the type of the sequence in this interval.
The particular choice of these sets is based on their optimal decoding times. We have for all since their prefix is . For the second set, we have for all since the location of 's in the interval has no impact on the optimal decoding time.
The remainder of the proof consists of two main steps. The first step is to show that the optimization of input distributions in (C) can be limited to input distributions that are constant within the intervals . This step will follow from the structure of the chosen state-sequence sets and . Then, to complete the proof of the upper bound, in the second step we compute the optimization in (C) under the restricted domain of input distributions.
Our first claim is that during the time interval the optimizing input distribution is fixed and need not change. This follows from the fact that no matter which state sequence is realized, the optimal stopping time is at least . Thus, by the concavity of mutual information, any collection of time-varying input distributions in this interval is sub-optimal. The fixed distribution in this interval is denoted by .
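The concavity step can be illustrated numerically. The sketch below computes the mutual information of a discrete memoryless channel as a function of the input distribution and checks that averaging two input distributions does not decrease it. The Z-channel matrix and the two input distributions are hypothetical choices made for illustration; they are not the channels of Fig. 1.

```python
import math


def mutual_information(p, W):
    """I(X;Y) in bits for input distribution p and channel matrix W[x][y]."""
    num_outputs = len(W[0])
    # Output distribution induced by p through the channel W.
    q = [sum(p[x] * W[x][y] for x in range(len(p))) for y in range(num_outputs)]
    total = 0.0
    for x, px in enumerate(p):
        for y in range(num_outputs):
            if px > 0 and W[x][y] > 0:
                total += px * W[x][y] * math.log2(W[x][y] / q[y])
    return total


# Hypothetical Z-channel: input 0 is noiseless, input 1 flips w.p. 1/2.
W = [[1.0, 0.0], [0.5, 0.5]]
p1, p2 = [0.2, 0.8], [0.8, 0.2]
p_avg = [(a + b) / 2 for a, b in zip(p1, p2)]

time_varying = 0.5 * (mutual_information(p1, W) + mutual_information(p2, W))
fixed = mutual_information(p_avg, W)
```

For a fixed channel, `fixed` is at least `time_varying`, which matches the claim that alternating between input distributions within an interval of constant channel behavior cannot outperform their time average.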
For the second interval, , we also claim that a fixed input distribution optimizes the competitive ratio. The argument follows from symmetry: the type of the state sequence is fixed, thus we can take an expected value over the state sequence to upper bound the minimal cumulative mutual information, i.e.,
(18)
where in step we bound the minimum over sequences by the average over a uniformly random sequence of the corresponding fixed type. In step , we define a uniform random variable , independent of . The latter expectation converges for large to with and distributed according to . We note that where the joint distribution on the right-hand side is with . The inequality follows from the Markov chain . To conclude, an i.i.d. distribution of channel inputs achieves the upper bound on the cumulative mutual information for large . A similar result holds in the interval for as well since the state is constant during this interval. In particular, this follows from the concavity of mutual information.
Finally, we note that since for any , , and since our sets of sequences are symmetric after time , the best rate after time is achieved when the rates of both channels are equal, i.e., the best rate is . This completes the first step of the proof, which shows that the optimal input distributions can be restricted to be constant in the specified intervals.
We proceed with the second step of our proof in which we compute the stopping times induced by each set of sequences. From the stopping times we obtain values for competitive ratios of the form ; then, optimizing over these ratios according to (C) will result in our upper bound on the competitive ratio.
For sequences in , we have
(19)
(20)
Note that the stopping time is independent of . This is due to our choice of the ratio in the definition of . We obtain that the stopping time is and that the competitive ratio for this set is .
For the set , we know that the stopping time is greater than but we need to consider different cases depending on whether the stopping time is smaller or greater than . Recall that there are two sequences in and therefore we have four cases. We start by computing the stopping time of any sequence in such that :
(21)
where and depending on whether the sequence in ends with or . Simplifying the equation, we obtain that the stopping time is subject to input distributions that satisfy (or equivalently, ). Moreover, the competitive ratio is .
We next consider the stopping time of sequences in with :
(22)
which yields the stopping time , subject to input distributions that satisfy .
To determine the competitive ratio for , we analyze four different cases.
1. Case A: for both sequences in .
2. Case B: For , we have , while for we have .
3. Case C: for both sequences in .
4. Case D: For , we have , while for we have .
One can derive the conditions on that specify each of the above cases. For example, Case A is valid only if . We omit the derivation of the other cases and proceed to directly compute an upper bound on the competitive ratio conditioned on each case. The upper bound on the overall competitive ratio is the largest upper bound among the different cases and will be shown to be upper bounded by .
For Case A, we have that the optimized competitive ratio is
(23)
where the first equality follows from the fact that the second term does not depend on ; thus, optimizing over the first expression yields . The second equality is obtained by comparing the two terms (which are equal when ). Indeed, the optimal point here lies in the feasible region so we have equalities in both steps. We now turn to Cases B-D and show that the induced upper bound is likewise less than or equal to .
For Case B, we have for but for . Combining these competitive ratios with yields
(24)
where follows by comparing the two terms that depend on . These two terms are equal when or , and it is clear that is not feasible so we choose . Note that is an upper bound (and not equality) since may not be feasible depending on the value of . Step follows from (C) by ignoring the constraint on .
For Case C, we have
(25)
The upper bound is simple since that equates both terms depending on lies in the feasible region. We then maximize over that gives as in Case A.
For the last case, Case D, the optimized competitive ratio is
(26)
and comparing the terms depending on provides . The only feasible point is which simplifies the optimization to be as in Case A. We note that the optimal unconstrained solution does not lie in the feasible region and therefore is a strict upper bound.
Combining the different cases, we conclude that the competitive ratio is upper bounded by the maximum among these bounds and is thus equal to as asserted.