-
The (r, δ)-Locality of Repeated-Root Cyclic Codes with Prime Power Lengths
Authors:
Wei Zhao,
Weixian Li,
Shenghao Yang,
Kenneth W. Shum
Abstract:
Locally repairable codes (LRCs) are designed for distributed storage systems to reduce the repair bandwidth and disk I/O complexity during the storage node repair process. A code with $(r,δ)$-locality (also called an $(r,δ)$-LRC) can simultaneously repair up to $δ-1$ symbols in a codeword by accessing at most $r$ other symbols in the codeword. In this paper, we propose a new method to calculate the $(r,δ)$-locality of cyclic codes. First, we describe the algebraic structure of repeated-root cyclic codes of prime power lengths. Using this result, we derive a formula for the $(r,δ)$-locality of these cyclic codes for a wide range of $δ$ values. Furthermore, we calculate the parameters of repeated-root cyclic codes of prime power lengths and obtain several infinite families of optimal cyclic $(r,δ)$-LRCs, which exhibit new parameters compared with existing research on optimal $(r,δ)$-LRCs with a cyclic structure. For the specific case of $δ=2$, we comprehensively identify all potential optimal cyclic $(r,2)$-LRCs of prime power lengths.
Submitted 5 May, 2024; v1 submitted 3 April, 2023;
originally announced April 2023.
-
Optimal Quaternary (r,delta)-Locally Repairable Codes Achieving the Singleton-type Bound
Authors:
Kenneth W. Shum,
Jie Hao
Abstract:
Locally repairable codes enable fast repair of node failures in a distributed storage system. The code symbols in a codeword are stored in different storage nodes, such that a disk failure can be recovered by accessing a small fraction of the storage nodes. The number of storage nodes that are contacted during the repair of a failed node is a parameter called locality. We consider locally repairable codes that can be locally recovered in the presence of multiple node failures. The punctured code obtained by removing the code symbols in the complement of a repair group is called a local code. We aim at designing a code such that all local codes have a prescribed minimum distance, so that any node failure can be repaired locally, provided that the total number of node failures is less than the tolerance parameter. We consider linear locally repairable codes defined over a finite field of size four. This alphabet has characteristic 2, and hence is amenable to practical implementation. We classify all quaternary locally repairable codes that attain the Singleton-type upper bound for minimum distance. For each combination of achievable code parameters, an explicit code construction is given.
Submitted 10 December, 2021;
originally announced December 2021.
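For orientation, the Singleton-type bound referred to in this abstract is commonly stated as follows for an $[n,k,d]$ linear code with $(r,\delta)$-locality. This is the standard form from the LRC literature and is an assumption here, since the abstract does not spell the bound out.

```latex
d \le n - k + 1 - \left( \left\lceil \frac{k}{r} \right\rceil - 1 \right)(\delta - 1)
```

Setting $\delta = 2$ gives $d \le n - k - \lceil k/r \rceil + 2$, and setting $r = k$ recovers the classical Singleton bound $d \le n - k + 1$.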
-
Repeated-root Constacyclic Codes with Optimal Locality
Authors:
Wei Zhao,
Kenneth W. Shum,
Shenghao Yang
Abstract:
A code is called a locally repairable code (LRC) if any code symbol is a function of a small fraction of other code symbols. When a locally repairable code is employed in a distributed storage system, an erased symbol can be recovered by accessing only a small number of other symbols, which reduces the network resources required during the repair process. In this paper we consider repeated-root constacyclic codes, a generalization of cyclic codes, that are optimal with respect to a Singleton-like bound on minimum distance. An LRC with the structure of a constacyclic code can be encoded efficiently using any encoding algorithm for constacyclic codes in general. We obtain optimal LRCs among these repeated-root constacyclic codes. Several infinite classes of optimal LRCs over a fixed alphabet are found. Under a further assumption that the ambient space of the repeated-root constacyclic codes is a chain ring, we show that there is no other optimal LRC.
Submitted 24 November, 2021;
originally announced November 2021.
-
Multichannel Conflict-Avoiding Codes of Weights Three and Four
Authors:
Yuan-Hsun Lo,
Kenneth W. Shum,
Wing Shing Wong,
Yijin Zhang
Abstract:
Conflict-avoiding codes (CACs) were introduced by Levenshtein as a single-channel transmission scheme for a multiple-access collision channel without feedback. When the number of simultaneously active source nodes is less than or equal to the weight of a CAC, it is able to provide a hard guarantee that each active source node transmits at least one packet successfully within a fixed time duration, no matter what the relative time offsets between the source nodes are. In this paper, we extend CACs to multichannel CACs for providing such a hard guarantee over multiple orthogonal channels. Upper bounds on the number of codewords for multichannel CACs of weights three and four are derived, and constructions that are optimal with respect to these bounds are presented.
Submitted 20 April, 2021; v1 submitted 24 September, 2020;
originally announced September 2020.
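The single-channel notion that this abstract builds on can be checked mechanically. The following is a minimal sketch, assuming the usual difference-set definition of a CAC: codewords are subsets of the integers modulo $n$, and the sets of nonzero differences of distinct codewords must be pairwise disjoint. The function names are hypothetical, and the multichannel extension studied in the paper is not modeled.

```python
def difference_set(codeword, n):
    """Return all nonzero differences (a - b) mod n between positions of one codeword."""
    return {(a - b) % n for a in codeword for b in codeword if a != b}


def is_cac(codewords, n):
    """Check the single-channel CAC condition: difference sets are pairwise disjoint."""
    seen = set()
    for codeword in codewords:
        diffs = difference_set(codeword, n)
        if seen & diffs:
            # Two codewords share a difference, so some relative offset
            # makes them collide in more than one slot.
            return False
        seen |= diffs
    return True


# Four weight-3 codewords of length 15 with disjoint difference sets.
good = [{0, 1, 2}, {0, 3, 6}, {0, 4, 8}, {0, 5, 10}]
# These two share the difference 2, so the condition fails.
bad = [{0, 1, 2}, {0, 2, 4}]

print(is_cac(good, 15), is_cac(bad, 15))
```

Disjoint difference sets mean that any two codewords overlap in at most one slot under every cyclic shift, which is what lets a weight-$w$ codeword survive up to $w-1$ interferers.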
-
Schedule Sequence Design for Broadcast in Multi-channel Ad Hoc Networks
Authors:
Fang Liu,
Kenneth W. Shum,
Yijin Zhang,
Wing Shing Wong
Abstract:
We consider a single-hop ad hoc network in which each node aims to broadcast packets to its neighboring nodes by using multiple slotted, TDD collision channels. There is no cooperation among the nodes. To ensure successful broadcast, we propose to pre-assign each node a periodic sequence to schedule transmissions and receptions at each time slot. These sequences are referred to as schedule sequences. Since each node starts its transmission schedule independently, there exist relative time offsets among the schedule sequences they use. Our objective is to design schedule sequences such that each node can transmit at least one packet to each of its neighbors successfully within a common period, no matter what the time offsets are. The sequence period should be designed as short as possible. In this paper, we analyze the lower bound on sequence period, and propose a sequence construction method by which the period can achieve the same order as the lower bound.
We also consider the random scheme in which each node transmits or receives on a channel at each time slot with a pre-determined probability. The frame length and broadcast completion time under different schemes are compared by numerical studies.
Submitted 19 September, 2020;
originally announced September 2020.
-
Network Coding Based on Byte-wise Circular Shift and Integer Addition
Authors:
Kenneth W. Shum,
Hanxu Hou
Abstract:
A novel implementation of a special class of Galois rings, in which the multiplication can be realized by a cyclic convolution, is applied to the construction of network codes. The primitive operations involved are byte-wise shifts and integer additions modulo a power of 2. Both of them can be executed efficiently in microprocessors. An illustration of how to apply this idea to array codes is given at the end of the paper.
Submitted 14 May, 2020;
originally announced May 2020.
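To make the two primitive operations concrete, here is a minimal sketch of a cyclic convolution of byte vectors with additions modulo $2^8$. It only illustrates how a product can be assembled from circular shifts and integer additions; the specific Galois ring, vector length, and byte layout used in the paper are not given in the abstract, so the function name and parameters below are assumptions.

```python
def cyclic_convolve(a, b, modulus=256):
    """Cyclic convolution of two equal-length integer vectors, entries reduced mod `modulus`."""
    n = len(a)
    if len(b) != n:
        raise ValueError("both vectors must have the same length")
    out = [0] * n
    for i, a_i in enumerate(a):
        if a_i == 0:
            continue
        # Add a_i times the copy of b circularly shifted by i positions.
        for j, b_j in enumerate(b):
            idx = (i + j) % n
            out[idx] = (out[idx] + a_i * b_j) % modulus
    return out


b = [3, 0, 255]
# Multiplying by the "monomial" [0, 1, 0] is a pure circular shift of b.
print(cyclic_convolve([0, 1, 0], b))
# A general product mixes shifts and additions modulo 256.
print(cyclic_convolve([1, 2, 0], b))
```

When one operand has a single entry equal to 1, the product reduces to a circular shift, which is why shifts and additions suffice as primitives in this sketch.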
-
On the Optimal Minimum Distance of Fractional Repetition Codes
Authors:
Bing Zhu,
Kenneth W. Shum,
Weiping Wang,
Jianxin Wang
Abstract:
Fractional repetition (FR) codes are a class of repair-efficient erasure codes that can recover a failed storage node with both optimal repair bandwidth and complexity. In this paper, we study the minimum distance of FR codes, which is the smallest number of nodes whose failure leads to the unrecoverable loss of the stored file. We consider upper bounds on the minimum distance and present several families of explicit FR codes attaining these bounds. The optimal constructions are derived from regular graphs and combinatorial designs.
Submitted 14 May, 2020;
originally announced May 2020.
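A minimal sketch of one graph-based FR code may help fix ideas. It assumes the well-known construction in which packets are the edges of a regular graph and each node stores the packets on its incident edges, so every packet has exactly two replicas. The complete graph is used for brevity, the function names are hypothetical, and neither the outer MDS code nor the paper's optimal constructions are modeled.

```python
from itertools import combinations


def fr_code_from_complete_graph(n):
    """Node v stores every packet (edge) of the complete graph K_n incident to v."""
    storage = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        storage[u].add((u, v))
        storage[v].add((u, v))
    return storage


def repair_plan(storage, failed):
    """Map each packet of the failed node to the surviving node holding its other replica."""
    plan = {}
    for u, v in storage[failed]:
        plan[(u, v)] = v if u == failed else u
    return plan


def distinct_packets(storage, nodes):
    """Count distinct packets held collectively by the given nodes."""
    return len(set().union(*(storage[v] for v in nodes)))


storage = fr_code_from_complete_graph(5)
print(repair_plan(storage, 0))              # one uncoded packet copied from each helper
print(distinct_packets(storage, [0, 1, 2])) # packets visible to a collector reading 3 nodes
```

Repair is uncoded: the replacement node simply copies one packet from each helper. The number of distinct packets in a set of nodes is the quantity that determines how large a file those nodes can recover.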
-
Capacity of Distributed Storage Systems with Clusters and Separate Nodes
Authors:
Jingzhao Wang,
Tinghan Wang,
Yuan Luo,
Kenneth W. Shum
Abstract:
In distributed storage systems (DSSs), the optimal tradeoff between node storage and repair bandwidth is an important issue for designing distributed coding strategies to ensure large-scale data reliability. The capacity of DSSs is obtained as a function of node storage and repair bandwidth parameters, characterizing the tradeoff. Many works study DSSs with clusters (racks), where intra-cluster and cross-cluster repair bandwidths are differentiated. However, separate nodes are also prevalent in realistic DSSs, and DSSs with clusters and separate nodes (CSN-DSSs) have not been studied sufficiently. In this paper, we formulate, for the first time, the capacity of CSN-DSSs with one separate node, where the bandwidth used to repair a separate node is cross-cluster bandwidth. Consequently, the optimal tradeoff between node storage and repair bandwidth is derived and compared with that of cluster DSSs. A regenerating code instance is constructed based on the tradeoff. Furthermore, the influence of adding a separate node is analyzed and formulated theoretically. We prove that when each cluster contains $R$ nodes and any $k$ nodes suffice to recover the original file (MDS property), adding an extra separate node keeps the capacity if $R \mid k$, and reduces the capacity otherwise.
Submitted 9 January, 2019;
originally announced January 2019.
-
On the Duality and File Size Hierarchy of Fractional Repetition Codes
Authors:
Bing Zhu,
Kenneth W. Shum,
Hui Li
Abstract:
Distributed storage systems that deploy erasure codes can provide better features such as lower storage overhead and higher data reliability. In this paper, we focus on fractional repetition (FR) codes, which are a class of storage codes characterized by the features of uncoded exact repair and minimum repair bandwidth. We study the duality of FR codes, and investigate the relationship between the supported file size of an FR code and its dual code. Based on the established relationship, we derive an improved dual bound on the supported file size of FR codes. We further show that FR codes constructed from $t$-designs are optimal when the size of the stored file is sufficiently large. Moreover, we present the tensor product technique for combining FR codes, and elaborate on the file size hierarchy of resulting codes.
Submitted 6 August, 2018;
originally announced August 2018.
-
On Secure Exact-repair Regenerating Codes with a Single Pareto Optimal Point
Authors:
Fangwei Ye,
Shiqiu Liu,
Kenneth W. Shum,
Raymond W. Yeung
Abstract:
The problem of exact-repair regenerating codes against eavesdropping attack is studied. The eavesdropping model we consider is that the eavesdropper has the capability to observe the data involved in the repair of a subset of $\ell$ nodes. An $(n,k,d,\ell)$ secure exact-repair regenerating code is an $(n,k,d)$ exact-repair regenerating code that is secure under this eavesdropping model. It has been shown that for some parameters $(n,k,d,\ell)$, the associated optimal storage-bandwidth tradeoff curve, which has one corner point, can be determined. The focus of this paper is on characterizing such parameters. We establish a lower bound $\hat{\ell}$ on the number of wiretap nodes, and show that this bound is tight for the case $k = d = n-1$.
Submitted 8 May, 2018;
originally announced May 2018.
-
A Unified Form of EVENODD and RDP Codes and Their Efficient Decoding
Authors:
Hanxu Hou,
Yunghsiang S. Han,
Kenneth W. Shum,
Hui Li
Abstract:
Array codes have been widely employed in storage systems, such as Redundant Arrays of Inexpensive Disks (RAID). The row-diagonal parity (RDP) codes and EVENODD codes are two popular double-parity array codes. As the capacity of hard disks increases, better fault tolerance is needed, which calls for array codes with three or more parity disks. Although many extensions of RDP codes and EVENODD codes have been proposed, their main drawback is high decoding complexity. In this paper, we present a new construction for all families of EVENODD codes and RDP codes, and propose a unified form of them. Under this unified form, RDP codes can be treated as shortened codes of EVENODD codes. Moreover, an efficient decoding algorithm based on an LU factorization of a Vandermonde matrix is proposed for the case in which the number of continuous surviving parity columns is no less than the number of erased information columns. The new decoding algorithm is faster than the existing algorithms when more than three information columns fail. The proposed decoding algorithm is also applicable to other Vandermonde array codes. Thus, the proposed MDS array code is of practical value for storage systems that need higher reliability.
Submitted 9 March, 2018;
originally announced March 2018.
-
Rack-Aware Regenerating Codes for Data Centers
Authors:
Hanxu Hou,
Patrick P. C. Lee,
Kenneth W. Shum,
Yuchong Hu
Abstract:
Erasure coding is widely used for massive storage in data centers to achieve high fault tolerance and low storage redundancy. Since the cross-rack communication cost is often high, it is critical to design erasure codes that minimize the cross-rack repair bandwidth during failure repair. In this paper, we analyze the optimal trade-off between storage redundancy and cross-rack repair bandwidth specifically for data centers, subject to the condition that the original data can be reconstructed from a sufficient number of any non-failed nodes. We characterize the optimal trade-off curve under functional repair, and propose a general family of erasure codes called rack-aware regenerating codes (RRC), which achieve the optimal trade-off. We further propose exact repair constructions of RRC that have minimum storage redundancy and minimum cross-rack repair bandwidth, respectively. We show that (i) the minimum storage redundancy constructions support a wide range of parameters and have cross-rack repair bandwidth that is strictly less than that of the classical minimum storage regenerating codes in most cases, and (ii) the minimum cross-rack repair bandwidth constructions support all the parameters and have less cross-rack repair bandwidth than that of the minimum bandwidth regenerating codes for almost all of the parameters.
Submitted 25 February, 2019; v1 submitted 12 February, 2018;
originally announced February 2018.
-
On the Duality of Fractional Repetition Codes
Authors:
Bing Zhu,
Kenneth W. Shum,
Hui Li
Abstract:
Erasure codes have emerged as an efficient technology for providing data redundancy in distributed storage systems. However, it is a challenging task to repair the failed storage nodes in erasure-coded storage systems, which requires large quantities of network resources. In this paper, we study fractional repetition (FR) codes, which enable the minimal repair complexity and also minimum repair bandwidth during node repair. We focus on the duality of FR codes, and investigate the relationship between the supported file size of an FR code and its dual code. Furthermore, we present a dual bound on the supported file size of FR codes.
Submitted 22 October, 2017;
originally announced October 2017.
-
New CRT sequence sets for a collision channel without feedback
Authors:
Yijin Zhang,
Yuan-Hsun Lo,
Kenneth W. Shum,
Wing Shing Wong
Abstract:
Protocol sequences are binary and periodic sequences used for deterministic multiple access in a collision channel without feedback. In this paper, we focus on user-irrepressible (UI) protocol sequences that can guarantee a positive individual throughput per sequence period with probability one for a slot-synchronous channel, regardless of the delay offsets among the users. As the sequence period has a fundamental impact on the worst-case channel access delay, a common objective of designing UI sequences is to make the sequence period as short as possible. Consider a communication channel that is shared by $M$ active users, and assume that each protocol sequence has a constant Hamming weight $w$. To attain a better delay performance than previously known UI sequences, this paper presents a CRTm construction of UI sequences with $w=M+1$, which is a variation of the previously known CRT construction. For all non-prime $M\geq 8$, our construction produces the shortest known sequence period and the shortest known worst-case delay of UI sequences. Numerical results show that the new construction enjoys a better average delay performance than the optimal random access scheme and other constructions with the same sequence period, in a variety of traffic conditions. In addition, we derive an asymptotic lower bound on the minimum sequence period for $w=M+1$ if the sequence structure satisfies some technical conditions, called equi-difference, and prove the tightness of this lower bound by using the CRTm construction.
Submitted 4 July, 2017; v1 submitted 9 November, 2016;
originally announced November 2016.
-
Cooperative Repair of Multiple Node Failures in Distributed Storage Systems
Authors:
Kenneth W. Shum,
Junyu Chen
Abstract:
Cooperative regenerating codes are designed for repairing multiple node failures in distributed storage systems. In contrast to the original repair model of regenerating codes, which addresses the repair of a single node failure, data exchange among the new nodes is enabled. It is known that further reduction in repair bandwidth is possible with cooperative repair. The literature currently contains an explicit construction of exact-repair cooperative codes achieving all parameters corresponding to the minimum-bandwidth point. We give a slightly generalized and more flexible version of this cooperative regenerating code in this paper. For minimum-storage regeneration with cooperation, we present an explicit code construction which can jointly repair any number of systematic storage nodes.
Submitted 28 July, 2016;
originally announced July 2016.
-
Concurrent Regenerating Codes and Scalable Application in Network Storage
Authors:
Huayu Zhang,
Hui Li,
Hanxu Hou,
K. W. Shum,
ShuoYen Robert Li
Abstract:
To recover from simultaneous multiple failures in erasure-coded storage systems, Patrick Lee et al. introduced minimal storage regenerating codes based on concurrent repair to reduce repair traffic. The architecture of this approach is simpler and more practical than that of the cooperative mechanism in a non-fully distributed environment. Hence, this paper unifies this class of regenerating codes as concurrent regenerating codes and further studies its characteristics by analyzing the cut-based information flow graph in the multiple-node recovery model. We present a general storage-bandwidth tradeoff and give closed-form expressions for the points on the curve, including a concurrent repair mechanism based on minimal bandwidth regenerating codes. We show that general concurrent regenerating codes can be constructed by reforming the existing single-node regenerating codes or multiple-node cooperative regenerating codes. Moreover, a connection to strong-MDS is also analyzed. On the other hand, the application of RGC is not limited to "repairing". It is also of great significance for "scaling", a scenario where we need to increase (decrease) the number of nodes to upgrade (degrade) redundancy and reliability. Thus, by clarifying the similarities and differences, we integrate them into a unified model that adapts to the dynamic storage network.
Submitted 22 April, 2016;
originally announced April 2016.
-
Bounds and Constructions of Locally Repairable Codes: Parity-check Matrix Approach
Authors:
Jie Hao,
Shu-Tao Xia,
Kenneth W. Shum,
Bin Chen,
Fang-Wei Fu,
Yi-Xian Yang
Abstract:
A $q$-ary $(n,k,r)$ locally repairable code (LRC) is an $[n,k,d]$ linear code over $\mathbb{F}_q$ such that every code symbol can be recovered by accessing at most $r$ other code symbols. The well-known Singleton-like bound says that $d \le n-k-\lceil k/r\rceil +2$, and an LRC is said to be optimal if it attains this bound. In this paper, we study the bounds and constructions of LRCs from the viewpoint of parity-check matrices. Firstly, a simple and unified framework based on the parity-check matrix is proposed to analyze the bounds of LRCs. Several useful structural properties of $q$-ary optimal LRCs are obtained. We derive an upper bound on the minimum distance of $q$-ary optimal $(n,k,r)$-LRCs in terms of the field size $q$. Then, we focus on constructions of optimal LRCs over the binary field. It is proved that there are only 5 classes of possible parameters with which optimal binary $(n,k,r)$-LRCs exist. Moreover, by employing the proposed parity-check matrix approach, we completely enumerate all these 5 classes of possible optimal binary LRCs attaining the Singleton-like bound in the sense of equivalence of linear codes.
Submitted 22 October, 2019; v1 submitted 21 January, 2016;
originally announced January 2016.
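The Singleton-like bound quoted in this abstract is easy to evaluate numerically. The sketch below computes $n-k-\lceil k/r\rceil+2$ and tests whether a given minimum distance attains it; the function names and the range check on the parameters are assumptions made for illustration, and the field-size-dependent bound derived in the paper is not modeled.

```python
def singleton_like_bound(n, k, r):
    """Upper bound on d for an (n, k, r)-LRC: n - k - ceil(k / r) + 2."""
    if not (1 <= r <= k <= n):
        raise ValueError("expected 1 <= r <= k <= n")
    ceil_k_over_r = -(-k // r)  # integer ceiling, avoids floating point
    return n - k - ceil_k_over_r + 2


def is_optimal_lrc(n, k, d, r):
    """True if the minimum distance d meets the Singleton-like bound with equality."""
    return d == singleton_like_bound(n, k, r)


print(singleton_like_bound(9, 4, 2))  # locality 2
print(singleton_like_bound(7, 4, 4))  # r = k gives the classical Singleton bound n - k + 1
print(is_optimal_lrc(9, 4, 5, 2))
```

Smaller locality $r$ lowers the bound, which is the price paid in minimum distance for cheaper repair.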
-
HFR Code: A Flexible Replication Scheme for Cloud Storage Systems
Authors:
Bing Zhu,
Hui Li,
Kenneth W. Shum,
Shuo-Yen Robert Li
Abstract:
Fractional repetition (FR) codes are a family of repair-efficient storage codes that provide exact and uncoded node repair at the minimum bandwidth regenerating point. The advantageous repair properties are achieved by a tailor-made two-layer encoding scheme which concatenates an outer maximum-distance-separable (MDS) code and an inner repetition code. In this paper, we generalize the application of FR codes and propose heterogeneous fractional repetition (HFR) codes, which are adaptable to the scenario where the repetition degrees of coded packets are different. We provide explicit code constructions by utilizing group divisible designs, which allow the design of HFR codes over a large range of parameters. The constructed codes achieve the system storage capacity under random access repair and have multiple repair alternatives for node failures. Further, we take advantage of the systematic feature of MDS codes and present a novel design framework of HFR codes, in which storage nodes can be wisely partitioned into clusters such that data reconstruction time can be reduced when contacting nodes in the same cluster.
Submitted 12 September, 2015;
originally announced September 2015.
-
On the Optimum Cyclic Subcode Chains of $\mathcal{RM}(2,m)^*$ for Increasing Message Length
Authors:
Xiaogang Liu,
Yuan Luo,
Kenneth W. Shum
Abstract:
The distance profiles of linear block codes can be employed to design variational coding schemes that encode messages of variable length and achieve a lower decoding error probability through a large minimum Hamming distance. For convenience of encoding, we focus on the distance profiles with respect to cyclic subcode chains (DPCs) of cyclic codes over $GF(q)$ with length $n$ such that $\gcd(n,q) = 1$. In this paper, the optimum DPCs and the corresponding optimum cyclic subcode chains are investigated on the punctured second-order Reed-Muller code $\mathcal{RM}(2,m)^*$ for increasing message length, where two standards on the optimums are studied according to the rhythm of increase.
Submitted 4 June, 2013;
originally announced June 2013.
-
Symmetry in Distributed Storage Systems
Authors:
Satyajit Thakor,
Terence Chan,
Kenneth W. Shum
Abstract:
The max-flow outer bound is achievable by regenerating codes for functional-repair distributed storage systems. However, the capacity of exact-repair distributed storage systems is an open problem. In this paper, the linear programming bound for exact-repair distributed storage systems is formulated. A notion of symmetrical sets for a set of random variables is given, and equalities of joint entropies for certain subsets of random variables in a symmetrical set are established. A concatenation coding scheme for exact-repair distributed storage systems is proposed, and it is shown that this scheme is sufficient to achieve any admissible rate for any exact-repair distributed storage system. Equalities of certain joint entropies of random variables induced by the concatenation scheme are shown. These equalities of joint entropies are new tools to simplify the linear programming bound and to obtain stronger converse results for exact-repair distributed storage systems.
Submitted 15 May, 2013;
originally announced May 2013.
-
Optimized-Cost Repair in Multi-hop Distributed Storage Systems with Network Coding
Authors:
Majid Gerami,
Ming Xiao,
Mikael Skoglund,
Kenneth W. Shum,
Dengsheng Lin
Abstract:
In distributed storage systems reliability is achieved through redundancy stored at different nodes in the network. Then a data collector can reconstruct source information even though some nodes fail. To maintain reliability, an autonomous and efficient protocol should be used to repair the failed node. The repair process causes traffic and consequently transmission cost in the network. Recent results found the optimal traffic-storage tradeoff, and proposed regenerating codes to achieve the optimality. We aim at minimizing the transmission cost in the repair process. We consider the network topology in the repair, and accordingly modify information flow graphs. Then we analyze the cut requirement and based on the results, we formulate the minimum-cost as a linear programming problem for linear costs. We show that the solution of the linear problem establishes a fundamental lower bound of the repair-cost. We also show that this bound is achievable for minimum storage regenerating, which uses the optimal-cost minimum-storage regenerating (OCMSR) code. We propose surviving node cooperation which can efficiently reduce the repair cost. Further, the field size for the construction of OCMSR codes is discussed. We show the gain of optimal-cost repair in tandem, star, grid and fully connected networks.
Submitted 25 March, 2013;
originally announced March 2013.
-
Repairing Multiple Failures in the Suh-Ramchandran Regenerating Codes
Authors:
Junyu Chen,
Kenneth W. Shum
Abstract:
Using the idea of interference alignment, Suh and Ramchandran constructed a class of minimum-storage regenerating codes which can repair one systematic or one parity-check node with optimal repair bandwidth. With the same code structure, we show that in addition to single node failure, double node failures can be repaired collaboratively with optimal repair bandwidth as well. We give an example of how to repair double failures in the Suh-Ramchandran regenerating code with six nodes, and give the proof for the general case.
Submitted 14 May, 2013; v1 submitted 5 February, 2013;
originally announced February 2013.
-
Analysis and Construction of Functional Regenerating Codes with Uncoded Repair for Distributed Storage Systems
Authors:
Yuchong Hu,
Patrick P. C. Lee,
Kenneth W. Shum
Abstract:
Modern distributed storage systems apply redundancy coding techniques to stored data. One form of redundancy is based on regenerating codes, which can minimize the repair bandwidth, i.e., the amount of data transferred when repairing a failed storage node. Existing regenerating codes mainly require surviving storage nodes encode data during repair. In this paper, we study functional minimum storage regenerating (FMSR) codes, which enable uncoded repair without the encoding requirement in surviving nodes, while preserving the minimum repair bandwidth guarantees and also minimizing disk reads. Under double-fault tolerance settings, we formally prove the existence of FMSR codes, and provide a deterministic FMSR code construction that can significantly speed up the repair process. We further implement and evaluate our deterministic FMSR codes to show the benefits. Our work is built atop a practical cloud storage system that implements FMSR codes, and we provide theoretical validation to justify the practicality of FMSR codes.
Submitted 21 January, 2013; v1 submitted 14 August, 2012;
originally announced August 2012.
-
Cooperative Regenerating Codes
Authors:
Kenneth W. Shum,
Yuchong Hu
Abstract:
One of the design objectives in distributed storage system is the minimization of the data traffic during the repair of failed storage nodes. By repairing multiple failures simultaneously and cooperatively, further reduction of repair traffic is made possible. A closed-form expression of the optimal tradeoff between the repair traffic and the amount of storage in each node for cooperative repair is given. We show that the points on the tradeoff curve can be achieved by linear cooperative regenerating codes, with an explicit bound on the required finite field size. The proof relies on a max-flow-min-cut-type theorem for submodular flow from combinatorial optimization. Two families of explicit constructions are given.
Submitted 18 July, 2013; v1 submitted 29 July, 2012;
originally announced July 2012.
-
Linear Network Code for Erasure Broadcast Channel with Feedback: Complexity and Algorithms
Authors:
Chi Wan Sung,
Linyu Huang,
Ho Yuet Kwan,
Kenneth W. Shum
Abstract:
This paper investigates the construction of linear network codes for broadcasting a set of data packets to a number of users. The links from the source to the users are modeled as independent erasure channels. Users are allowed to inform the source node whether a packet is received correctly via feedback channels. In order to minimize the number of packet transmissions until all users have received all packets successfully, it is necessary that a data packet, if successfully received by a user, can increase the dimension of the vector space spanned by the encoding vectors he or she has received by one. Such an encoding vector is called innovative. We prove that innovative linear network code is uniformly optimal in minimizing user download delay. When the finite field size is strictly smaller than the number of users, the problem of determining the existence of innovative vectors is proven to be NP-complete. When the field size is larger than or equal to the number of users, innovative vectors always exist and random linear network code (RLNC) is able to find an innovative vector with high probability. While RLNC is optimal in terms of completion time, it has high decoding complexity due to the need to solve a system of linear equations. To reduce decoding time, we propose the use of sparse linear network code, since the sparsity property of encoding vectors can be exploited when solving systems of linear equations. Generating a sparsest encoding vector with large finite field size, however, is shown to be NP-hard. An approximation algorithm that guarantees the Hamming weight of a generated encoding vector to be smaller than a certain factor of the optimal value is constructed. Our simulation results show that our proposed methods have excellent performance in completion time and outperform RLNC in terms of decoding time.
Submitted 8 December, 2013; v1 submitted 23 May, 2012;
originally announced May 2012.
-
Imperfect Secrecy in Wiretap Channel II
Authors:
Fan Cheng,
Raymond W. Yeung,
Kenneth W. Shum
Abstract:
In a point-to-point communication system which consists of a sender, a receiver and a set of noiseless channels, the sender wishes to transmit a private message to the receiver through the channels which may be eavesdropped by a wiretapper. The set of wiretap sets is arbitrary. The wiretapper can access any one but not more than one wiretap set. From each wiretap set, the wiretapper can obtain some partial information about the private message which is measured by the equivocation of the message given the symbols obtained by the wiretapper. The security strategy is to encode the message with some random key at the sender. Only the message is required to be recovered at the receiver. Under this setting, we define an achievable rate tuple consisting of the size of the message, the size of the key, and the equivocation for each wiretap set. We first prove a tight rate region when both the message and the key are required to be recovered at the receiver. Then we extend the result to the general case when only the message is required to be recovered at the receiver. Moreover, we show that even if stochastic encoding is employed at the sender, the message rate cannot be increased.
Submitted 12 October, 2014; v1 submitted 3 February, 2012;
originally announced February 2012.
-
Minimization of Storage Cost in Distributed Storage Systems with Repair Consideration
Authors:
Quan Yu,
Kenneth W. Shum,
Chi Wan Sung
Abstract:
In a distributed storage system, the storage costs of different storage nodes, in general, can be different. How to store a file in a given set of storage nodes so as to minimize the total storage cost is investigated. By analyzing the min-cut constraints of the information flow graph, the feasible region of the storage capacities of the nodes can be determined. The storage cost minimization can then be reduced to a linear programming problem, which can be readily solved. Moreover, the tradeoff between storage cost and repair-bandwidth is established.
Submitted 28 July, 2011;
originally announced July 2011.
-
Generation of Innovative and Sparse Encoding Vectors for Broadcast Systems with Feedback
Authors:
Ho Yuet Kwan,
Kenneth W. Shum,
Chi Wan Sung
Abstract:
In the application of linear network coding to wireless broadcasting with feedback, we prove that the problem of determining the existence of an innovative encoding vector is NP-complete when the finite field size is two. When the finite field size is larger than or equal to the number of users, it is shown that we can always find an encoding vector which is both innovative and sparse. The sparsity can be utilized in speeding up the decoding process. An efficient algorithm to generate innovative and sparse encoding vectors is developed. Simulations show that the delay performance of our scheme with binary finite field outperforms a number of existing schemes in terms of average and worst-case delay.
Submitted 24 May, 2011; v1 submitted 17 February, 2011;
originally announced February 2011.
-
Exact Minimum-Repair-Bandwidth Cooperative Regenerating Codes for Distributed Storage Systems
Authors:
Kenneth W. Shum,
Yuchong Hu
Abstract:
In order to provide high data reliability, distributed storage systems disperse data with redundancy to multiple storage nodes. Regenerating codes is a new class of erasure codes to introduce redundancy for the purpose of improving the data repair performance in distributed storage. Most of the studies on regenerating codes focus on the single-failure recovery, but it is not uncommon to see two or more node failures at the same time in large storage networks. To exploit the opportunity of repairing multiple failed nodes simultaneously, a cooperative repair mechanism, in the sense that the nodes to be repaired can exchange data among themselves, is investigated. A lower bound on the repair-bandwidth for cooperative repair is derived and a construction of a family of exact cooperative regenerating codes matching this lower bound is presented.
Submitted 31 May, 2011; v1 submitted 8 February, 2011;
originally announced February 2011.
-
Cooperative Regenerating Codes for Distributed Storage Systems
Authors:
Kenneth W. Shum
Abstract:
When there are multiple node failures in a distributed storage system, regenerating the failed storage nodes individually in a one-by-one manner is suboptimal as far as repair-bandwidth minimization is concerned. If data exchange among the newcomers is enabled, we can get a better tradeoff between repair bandwidth and the storage per node. An explicit and optimal construction of cooperative regenerating code is illustrated.
Submitted 7 February, 2011; v1 submitted 27 January, 2011;
originally announced January 2011.
-
Construction and Applications of CRT Sequences
Authors:
Kenneth W. Shum,
Wing Shing Wong
Abstract:
Protocol sequences are used for channel access in the collision channel without feedback. Each user accesses the channel according to a deterministic zero-one pattern, called the protocol sequence. In order to minimize fluctuation of throughput due to delay offsets, we want to construct protocol sequences whose pairwise Hamming cross-correlation is as close to a constant as possible. In this paper, we present a construction of protocol sequences which is based on the bijective mapping between one-dimensional sequence and two-dimensional array by the Chinese Remainder Theorem (CRT). In the application to the collision channel without feedback, a worst-case lower bound on system throughput is derived.
Submitted 8 November, 2010; v1 submitted 29 June, 2010;
originally announced June 2010.
-
Construction of Short Protocol Sequences with Worst-Case Throughput Guarantee
Authors:
Kenneth W. Shum,
Wing Shing Wong
Abstract:
Protocol sequences are used in channel access for the multiple-access collision channel without feedback. A new construction of protocol sequences with a guarantee of worst-case system throughput is proposed. The construction is based on Chinese remainder theorem. The Hamming crosscorrelation is proved to be concentrated around the mean. The sequence period is much shorter than existing protocol sequences with the same throughput performance. The new construction reduces the complexity in implementation and also shortens the waiting time until a packet can be sent successfully.
Submitted 27 April, 2010;
originally announced April 2010.
-
Information Flow in One-Dimensional Vehicular Ad Hoc Networks
Authors:
Chi Wan Sung,
Kenneth W. Shum,
Wing Ho Yuen
Abstract:
We consider content distribution in vehicular ad hoc networks. We assume that a file is encoded using fountain code, and the encoded message is cached at infostations. Vehicles are allowed to download data packets from infostations, which are placed along a highway. In addition, two vehicles can exchange packets with each other when they are in proximity. As long as a vehicle has received enough packets from infostations or from other vehicles, the original file can be recovered. In this work, we show that system throughput increases linearly with number of users, meaning that the system exhibits linear scalability. Furthermore, we analyze the effect of mobility on system throughput by considering both discrete and continuous velocity distributions for the vehicles. In both cases, system throughput is shown to decrease when the average speed of all vehicles increases. In other words, higher overall mobility reduces system throughput.
Submitted 3 March, 2010;
originally announced March 2010.
-
A General Upper Bound on the Size of Constant-Weight Conflict-Avoiding Codes
Authors:
Kenneth W. Shum,
Wing Shing Wong,
Chung Shue Chen
Abstract:
Conflict-avoiding codes are used in the multiple-access collision channel without feedback. The number of codewords in a conflict-avoiding code is the number of potential users that can be supported in the system. In this paper, a new upper bound on the size of conflict-avoiding codes is proved. This upper bound is general in the sense that it is applicable to all code lengths and all Hamming weights. Several existing constructions for conflict-avoiding codes, which are known to be optimal for Hamming weights equal to four and five, are shown to be optimal for all Hamming weights in general.
Submitted 6 November, 2010; v1 submitted 27 October, 2009;
originally announced October 2009.
-
Achieving Capacity of Bi-Directional Tandem Collision Network by Joint Medium-Access Control and Channel-Network Coding
Authors:
Kenneth W. Shum,
Chi Wan Sung
Abstract:
In an ALOHA-type packetized network, the transmission times of packets follow a stochastic process. In this paper, we advocate a deterministic approach for channel multiple-access. Each user is statically assigned a periodic protocol signal, which takes value either zero or one, and transmits packets whenever the value of the protocol signal is equal to one. On top of this multiple-access protocol, efficient channel coding and network coding schemes are devised. We illustrate the idea by constructing a transmission scheme for the tandem collision network, for both slot-synchronous and slot-asynchronous systems. This cross-layer approach is able to achieve the capacity region when the network is bi-directional.
Submitted 28 September, 2009;
originally announced September 2009.
-
On the Fairness of Rate Allocation in Gaussian Multiple Access Channel and Broadcast Channel
Authors:
Kenneth W. Shum,
Chi Wan Sung
Abstract:
The capacity region of a channel consists of all achievable rate vectors. Picking a particular point in the capacity region is synonymous with rate allocation. The issue of fairness in rate allocation is addressed in this paper. We review several notions of fairness, including max-min fairness, proportional fairness and Nash bargaining solution. Their efficiencies for general multiuser channels are discussed. We apply these ideas to the Gaussian multiple access channel (MAC) and the Gaussian broadcast channel (BC). We show that in the Gaussian MAC, max-min fairness and proportional fairness coincide. For both Gaussian MAC and BC, we devise efficient algorithms that locate the fair point in the capacity region. Some elementary properties of fair rate allocations are proved.
Submitted 3 November, 2006;
originally announced November 2006.