-
5G NR CA-Polar Maximum Likelihood Decoding by GRAND
Authors:
Ken Duffy,
Amit Solomon,
Kishori M. Konwar,
Muriel Medard
Abstract:
CA-Polar codes have been selected for all control channel communications in 5G NR, but accurate, computationally feasible decoders are still subject to development. Here we report the performance of a recently proposed class of optimally precise Maximum Likelihood (ML) decoders, GRAND, that can be used with any block-code. As published theoretical results indicate that GRAND is computationally efficient for short-length, high-rate codes and 5G CA-Polar codes are in that class, here we consider GRAND's utility for decoding them. Simulation results indicate that decoding of 5G CA-Polar codes by GRAND, and a simple soft detection variant, is a practical possibility.
Submitted 18 February, 2021; v1 submitted 1 July, 2019;
originally announced July 2019.
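The guessing idea behind hard-detection GRAND can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a binary symmetric channel (so noise patterns are tried in order of increasing Hamming weight) and uses the (7,4) Hamming code as a toy stand-in for a 5G CA-Polar code. The names `H`, `is_codeword`, and `grand_decode` are hypothetical.

```python
from itertools import combinations

# Parity-check matrix of the (7,4) Hamming code. This toy code is an
# assumption for illustration only; it is not a 5G CA-Polar code.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def is_codeword(word):
    """Return True if every parity check of H is satisfied (mod 2)."""
    return all(sum(h * w for h, w in zip(row, word)) % 2 == 0 for row in H)


def grand_decode(received, max_weight=None):
    """Guess noise patterns from most to least likely and undo each one.

    On a binary symmetric channel with crossover probability below 1/2,
    patterns with fewer bit flips are more likely, so they are tried first.
    The first guess that yields a codeword is returned together with the
    weight of the guessed noise. Returns (None, None) if the query budget
    max_weight is exhausted without finding a codeword.
    """
    n = len(received)
    if max_weight is None:
        max_weight = n
    for weight in range(max_weight + 1):
        for flips in combinations(range(n), weight):
            candidate = list(received)
            for i in flips:
                candidate[i] ^= 1  # remove the guessed noise bit
            if is_codeword(candidate):
                return candidate, weight
    return None, None


if __name__ == "__main__":
    sent = [1, 1, 1, 0, 0, 0, 0]      # a valid codeword of the toy code
    received = [1, 1, 1, 0, 1, 0, 0]  # same word with one bit flipped
    print(grand_decode(received))
```

The decoder only needs a codebook membership test, which is why the approach applies to any block code; the cost lies in the number of guesses, which stays small when the code is short and high-rate.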
-
SNOW Revisited: Understanding When Ideal READ Transactions Are Possible
Authors:
Kishori M Konwar,
Wyatt Lloyd,
Haonan Lu,
Nancy Lynch
Abstract:
READ transactions that read data distributed across servers dominate the workloads of real-world distributed storage systems. The SNOW Theorem stated that ideal READ transactions that have optimal latency and the strongest guarantees, i.e. "SNOW" READ transactions, are impossible in one specific setting that requires three or more clients: at least two readers and one writer. However, it left many open questions. We close all of these open questions with new impossibility results and new algorithms. First, we rigorously prove the result from the SNOW Theorem paper that it is impossible to have a READ transaction system that satisfies the SNOW properties with three or more clients. The insight we gained from this proof led to teasing out the implicit assumptions that are required to state the results, and to resolving the open question regarding the possibility of SNOW with two clients. We show that SNOW is achievable in a multi-writer, single-reader (MWSR) setting when a client can send messages to other clients; on the other hand, we prove that it is impossible to implement SNOW in the MWSR setting, which is more general than the two-client setting, when client-to-client communication is disallowed. We also correct the previous claim in the SNOW Theorem paper that incorrectly identified one existing system, Eiger, as supporting the strongest guarantees (SW) with read-only transactions of bounded latency. Thus, there were no previous algorithms that provided the strongest guarantees and had bounded latency. Finally, we introduce the first two algorithms to provide the strongest guarantees with bounded latency.
Submitted 22 May, 2021; v1 submitted 26 November, 2018;
originally announced November 2018.
-
ARES: Adaptive, Reconfigurable, Erasure coded, atomic Storage
Authors:
Nicolas Nicolaou,
Viveck Cadambe,
N. Prakash,
Andria Trigeorgi,
Kishori M. Konwar,
Nancy Lynch,
Muriel Medard
Abstract:
Atomicity or strong consistency is one of the fundamental, most intuitive, and hardest to provide primitives in distributed shared memory emulations. To ensure survivability, scalability, and availability of a storage service in the presence of failures, traditional approaches for atomic memory emulation in message-passing environments replicate the objects across multiple servers. Compared to replication-based algorithms, erasure code-based atomic memory algorithms have much lower storage and communication costs, but they are usually harder to design. The difficulty of designing atomic memory algorithms grows further when the set of servers may be changed to ensure survivability of the service over software and hardware upgrades, while avoiding service interruptions. Atomic memory algorithms for performing server reconfiguration in replicated systems are very few, complex, and still part of an active area of research; reconfigurations of erasure-code based algorithms are non-existent.
In this work, we present ARES, an algorithmic framework that allows reconfiguration of the underlying servers, and is particularly suitable for erasure-code based algorithms emulating atomic objects. ARES introduces new configurations while keeping the service available. To use with ARES we also propose a new, and to our knowledge the first, two-round erasure code based algorithm, TREAS, for emulating multi-writer, multi-reader (MWMR) atomic objects in asynchronous, message-passing environments, with near-optimal communication and storage costs. Our algorithms can tolerate crash failures of any client and some fraction of servers, and yet guarantee safety and liveness properties. Moreover, by bringing together the advantages of ARES and TREAS, we propose an optimized algorithm where new configurations can be installed without the object values passing through the reconfiguration clients.
Submitted 28 May, 2021; v1 submitted 9 May, 2018;
originally announced May 2018.
-
A Layered Architecture for Erasure-Coded Consistent Distributed Storage
Authors:
Kishori M. Konwar,
N. Prakash,
Nancy Lynch,
Muriel Medard
Abstract:
Motivated by emerging applications to the edge computing paradigm, we introduce a two-layer erasure-coded fault-tolerant distributed storage system offering atomic access for read and write operations. In edge computing, clients interact with an edge-layer of servers that is geographically near; the edge-layer in turn interacts with a back-end layer of servers. The edge-layer provides low latency access and temporary storage for client operations, and uses the back-end layer for persistent storage. Our algorithm, termed Layered Data Storage (LDS) algorithm, offers several features suitable for edge-computing systems, works under asynchronous message-passing environments, supports multiple readers and writers, and can tolerate $f_1 < n_1/2$ and $f_2 < n_2/3$ crash failures in the two layers having $n_1$ and $n_2$ servers, respectively. We use a class of erasure codes known as regenerating codes for storage of data in the back-end layer. The choice of regenerating codes, instead of popular choices like Reed-Solomon codes, not only optimizes the cost of back-end storage, but also helps in optimizing communication cost of read operations, when the value needs to be recreated all the way from the back-end. The two-layer architecture permits a modular implementation of atomicity and erasure-code protocols; the implementation of erasure-codes is mostly limited to interaction between the two layers. We prove liveness and atomicity of LDS, and also compute performance costs associated with read and write operations. Further, in a multi-object system running $N$ independent instances of LDS, where only a small fraction of the objects undergo concurrent accesses at any point during the execution, the overall storage cost is dominated by that of persistent storage in the back-end layer, and is given by $Θ(N)$.
Submitted 30 May, 2017; v1 submitted 3 March, 2017;
originally announced March 2017.
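The crash-tolerance conditions $f_1 < n_1/2$ and $f_2 < n_2/3$ quoted in the abstract can be turned into a small worked calculation. The sketch below only evaluates those two inequalities; the function name `lds_max_crashes` is hypothetical and nothing else about LDS is modeled.

```python
def lds_max_crashes(n1, n2):
    """Largest integers f1, f2 with f1 < n1/2 and f2 < n2/3.

    n1 is the number of edge-layer servers and n2 the number of back-end
    servers, as in the abstract. Integer arithmetic avoids rounding issues.
    """
    if n1 < 1 or n2 < 1:
        raise ValueError("each layer needs at least one server")
    f1 = (n1 - 1) // 2  # strict inequality f1 < n1 / 2
    f2 = (n2 - 1) // 3  # strict inequality f2 < n2 / 3
    return f1, f2


if __name__ == "__main__":
    for n1, n2 in [(5, 9), (4, 10), (3, 3)]:
        print(n1, n2, lds_max_crashes(n1, n2))
```

For instance, with 5 edge servers and 9 back-end servers the bounds allow up to 2 crashes in each layer.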
-
RADON: Repairable Atomic Data Object in Networks
Authors:
Kishori M. Konwar,
N. Prakash,
Nancy Lynch,
Muriel Medard
Abstract:
Erasure codes offer an efficient way to decrease storage and communication costs while implementing atomic memory service in asynchronous distributed storage systems. In this paper, we provide erasure-code-based algorithms having the additional ability to perform background repair of crashed nodes. A repair operation of a node in the crashed state is triggered externally, and is carried out by the concerned node via message exchanges with other active nodes in the system. Upon completion of repair, the node re-enters active state, and resumes participation in ongoing and future read, write, and repair operations. To guarantee liveness and atomicity simultaneously, existing works assume either the presence of nodes with stable storage, or presence of nodes that never crash during the execution. We demand neither of these; instead we consider a natural, yet practical network stability condition $N1$ that only restricts the number of nodes in the crashed/repair state during broadcast of any message.
We present an erasure-code based algorithm $RADON_C$ that is always live, and guarantees atomicity as long as condition $N1$ holds. In situations when the number of concurrent writes is limited, $RADON_C$ has significantly improved storage and communication cost over a replication-based algorithm $RADON_R$, which also works under $N1$. We further show how a slightly stronger network stability condition $N2$ can be used to construct algorithms that never violate atomicity. The guarantee of atomicity comes at the expense of having an additional phase during the read and write operations.
Submitted 21 November, 2016; v1 submitted 18 May, 2016;
originally announced May 2016.
-
Storage-Optimized Data-Atomic Algorithms for Handling Erasures and Errors in Distributed Storage Systems
Authors:
Kishori M. Konwar,
N. Prakash,
Erez Kantor,
Nancy Lynch,
Muriel Medard,
Alexander A. Schwarzmann
Abstract:
Erasure codes are increasingly being studied in the context of implementing atomic memory objects in large-scale asynchronous distributed storage systems. When compared with traditional replication-based schemes, erasure codes have the potential of significantly lowering storage and communication costs while simultaneously guaranteeing the desired resiliency levels. In this work, we propose the Storage-Optimized Data-Atomic (SODA) algorithm for implementing atomic memory objects in the multi-writer multi-reader setting. SODA uses Maximum Distance Separable (MDS) codes, and is specifically designed to optimize the total storage cost for a given fault-tolerance requirement. For tolerating $f$ server crashes in an $n$-server system, SODA uses an $[n, k]$ MDS code with $k=n-f$, and incurs a total storage cost of $\frac{n}{n-f}$. SODA is designed under the assumption of reliable point-to-point communication channels. The communication costs of a write and a read operation are respectively given by $O(f^2)$ and $\frac{n}{n-f}(δ_w+1)$, where $δ_w$ denotes the number of writes that are concurrent with the particular read. In comparison with the recent CASGC algorithm, which also uses MDS codes, SODA offers a lower storage cost at the price of a higher communication cost.
We also present a modification of SODA, called SODA$_{\text{err}}$, to handle the case where some of the servers can return erroneous coded elements during a read operation. Specifically, in order to tolerate $f$ server failures and $e$ error-prone coded elements, the SODA$_{\text{err}}$ algorithm uses an $[n, k]$ MDS code such that $k=n-2e-f$. SODA$_{\text{err}}$ also guarantees liveness and atomicity, while maintaining an optimized total storage cost of $\frac{n}{n-f-2e}$.
Submitted 5 May, 2016;
originally announced May 2016.
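The cost expressions in the abstract lend themselves to a short worked example. The sketch below simply evaluates $\frac{n}{n-f}$, $\frac{n}{n-f-2e}$, and $\frac{n}{n-f}(δ_w+1)$ with exact fractions; it assumes costs are measured relative to the size of one stored value, and the function names are hypothetical.

```python
from fractions import Fraction


def soda_storage_cost(n, f, e=0):
    """Total storage cost n / k for an [n, k] MDS code.

    With e = 0 this is SODA (k = n - f); with e > 0 it is the SODA_err
    variant (k = n - f - 2e), as stated in the abstract.
    """
    k = n - f - 2 * e
    if k < 1:
        raise ValueError("need n - f - 2e >= 1 for a valid code dimension")
    return Fraction(n, k)


def soda_read_cost(n, f, concurrent_writes):
    """Read communication cost n/(n-f) * (delta_w + 1) from the abstract."""
    if concurrent_writes < 0:
        raise ValueError("the number of concurrent writes cannot be negative")
    return soda_storage_cost(n, f) * (concurrent_writes + 1)


if __name__ == "__main__":
    print(soda_storage_cost(10, 2))       # SODA, 10 servers, 2 crashes
    print(soda_storage_cost(10, 2, e=1))  # SODA_err, one erroneous element
    print(soda_read_cost(10, 2, 3))       # read concurrent with 3 writes
```

For comparison, storing a full copy on each of the 10 servers would cost 10 in the same units, which is the gap the MDS code closes.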
-
Technical Report: Estimating Reliability of Workers for Cooperative Distributed Computing
Authors:
Seda Davtyan,
Kishori M. Konwar,
Alexander A. Shvartsman
Abstract:
Internet supercomputing is an approach to solving partitionable, computation-intensive problems by harnessing the power of a vast number of interconnected computers. For the problem of using network supercomputing to perform a large collection of independent tasks, prior work introduced a decentralized approach and provided randomized synchronous algorithms that perform all tasks correctly with high probability, while dealing with misbehaving or crash-prone processors. The main weakness of existing algorithms is that they assume either that the \emph{average} probability of a non-crashed processor returning incorrect results is less than $\frac{1}{2}$, or that the probability of returning incorrect results is known to \emph{each} processor. Here we present a randomized synchronous distributed algorithm that tightly estimates the probability of each processor returning correct results. Starting with the set $P$ of $n$ processors, let $F$ be the set of processors that crash. Our algorithm estimates the probability $p_i$ of returning a correct result for each processor $i \in P-F$, making the estimates available to all these processors. The estimation is based on the $(ε, δ)$-approximation, where each estimated probability $\tilde{p_i}$ of $p_i$ obeys the bound ${\sf Pr}[p_i(1-ε) \leq \tilde{p_i} \leq p_i(1+ε)] > 1 - δ$, for any constants $δ>0$ and $ε>0$ chosen by the user. An important aspect of this algorithm is that each processor terminates without global coordination. We assess the efficiency of the algorithm in three adversarial models as follows. For the model where the number of non-crashed processors $|P-F|$ is linearly bounded, the time complexity $T(n)$ of the algorithm is $Θ(\log{n})$, work complexity $W(n)$ is $Θ(n\log{n})$, and message complexity $M(n)$ is $Θ(n\log^2n)$.
Submitted 1 July, 2014;
originally announced July 2014.
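The event inside the $(ε, δ)$ bound, $p_i(1-ε) \leq \tilde{p_i} \leq p_i(1+ε)$, can be made concrete with a small check. This is only an illustration of the inequality, not the distributed estimation algorithm of the report; the helper names and the sample data are hypothetical.

```python
def empirical_estimate(outcomes):
    """Fraction of correct results in a list of 0/1 outcomes."""
    if not outcomes:
        raise ValueError("at least one outcome is required")
    return sum(outcomes) / len(outcomes)


def within_relative_error(p, p_est, eps):
    """True if p * (1 - eps) <= p_est <= p * (1 + eps)."""
    return p * (1 - eps) <= p_est <= p * (1 + eps)


if __name__ == "__main__":
    # Hypothetical processor whose true success probability is 0.8 and
    # which returned 82 correct results out of 100 observed tasks.
    outcomes = [1] * 82 + [0] * 18
    p_est = empirical_estimate(outcomes)
    print(p_est, within_relative_error(0.8, p_est, eps=0.05))
```

An $(ε, δ)$-approximation requires this check to succeed with probability greater than $1 - δ$ over the randomness of the estimation procedure.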
-
Technical Report: Dealing with Undependable Workers in Decentralized Network Supercomputing
Authors:
Seda Davtyan,
Kishori M. Konwar,
Alexander Russell,
Alexander A. Shvartsman
Abstract:
Internet supercomputing is an approach to solving partitionable, computation-intensive problems by harnessing the power of a vast number of interconnected computers. This paper presents a new algorithm for the problem of using network supercomputing to perform a large collection of independent tasks, while dealing with undependable processors. The adversary may cause the processors to return bogus results for tasks with certain probabilities, and may cause a subset $F$ of the initial set of processors $P$ to crash. The adversary is constrained in two ways. First, for the set of non-crashed processors $P-F$, the \emph{average} probability of a processor returning a bogus result is less than $\frac{1}{2}$. Second, the adversary may crash a subset of processors $F$, provided the size of $P-F$ is bounded from below. We consider two models: the first bounds the size of $P-F$ by a fractional polynomial, the second bounds this size by a poly-logarithm. Both models yield adversaries that are much stronger than previously studied. Our randomized synchronous algorithm is formulated for $n$ processors and $t$ tasks, with $n\le t$, where depending on the number of crashes each live processor is able to terminate dynamically with the knowledge that the problem is solved with high probability. For the adversary constrained by a fractional polynomial, the round complexity of the algorithm is $O(\frac{t}{n^\varepsilon}\log{n}\log{\log{n}})$, its work is $O(t\log{n} \log{\log{n}})$ and message complexity is $O(n\log{n}\log{\log{n}})$. For the poly-log constrained adversary, the round complexity is $O(t)$, work is $O(t n^{\varepsilon})$, and message complexity is $O(n^{1+\varepsilon})$. All bounds are shown to hold with high probability.
Submitted 1 July, 2014;
originally announced July 2014.
-
Highly Scalable Algorithms for Robust String Barcoding
Authors:
Bhaskar DasGupta,
Kishori M. Konwar,
Ion I. Mandoiu,
Alex A. Shvartsman
Abstract:
String barcoding is a recently introduced technique for genomic-based identification of microorganisms. In this paper we describe the engineering of highly scalable algorithms for robust string barcoding. Our methods enable distinguisher selection based on whole genomic sequences of hundreds of microorganisms of up to bacterial size on a well-equipped workstation, and can be easily parallelized to further extend the applicability range to thousands of bacterial size genomes. Experimental results on both randomly generated and NCBI genomic data show that whole-genome based selection results in a number of distinguishers nearly matching the information theoretic lower bounds for the problem.
Submitted 14 February, 2005;
originally announced February 2005.
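As a rough illustration of what a set of distinguishers provides, the sketch below builds presence/absence barcodes for a few toy sequences and checks that they are pairwise distinct. It assumes the basic formulation in which a distinguisher is a substring and a barcode records which distinguishers occur in a genome; it does not implement the paper's selection algorithms or its robustness requirement, and all names and sequences are made up.

```python
def barcodes(genomes, distinguishers):
    """Map each genome name to a tuple of 0/1 flags, one per distinguisher.

    A flag is 1 when the distinguisher occurs as a substring of the genome.
    """
    return {
        name: tuple(int(d in seq) for d in distinguishers)
        for name, seq in genomes.items()
    }


def is_valid_barcoding(genomes, distinguishers):
    """True if every genome receives a unique barcode."""
    codes = barcodes(genomes, distinguishers)
    return len(set(codes.values())) == len(codes)


if __name__ == "__main__":
    # Toy sequences, far shorter than real genomes.
    genomes = {"g1": "ACGTAC", "g2": "ACGGTT", "g3": "TTGACA"}
    print(barcodes(genomes, ["CGT", "GG"]))
    print(is_valid_barcoding(genomes, ["CGT", "GG"]))
    print(is_valid_barcoding(genomes, ["CGT"]))
```

With $m$ genomes, at least $\lceil \log_2 m \rceil$ distinguishers are needed for all barcodes to differ, which is the kind of information-theoretic lower bound the abstract compares against.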