Showing 51–68 of 68 results for author: Shah, N B

  1. arXiv:1411.5977  [pdf, other]

    stat.ML cs.HC cs.LG

    On the Impossibility of Convex Inference in Human Computation

    Authors: Nihar B. Shah, Dengyong Zhou

    Abstract: Human computation or crowdsourcing involves joint inference of the ground-truth-answers and the worker-abilities by optimizing an objective function, for instance, by maximizing the data likelihood based on an assumed underlying model. A variety of methods have been proposed in the literature to address this inference problem. As far as we know, none of the objective functions in existing methods…

    Submitted 21 November, 2014; originally announced November 2014.

    Comments: AAAI 2015

  2. Fundamental Limits on Communication for Oblivious Updates in Storage Networks

    Authors: Preetum Nakkiran, Nihar B. Shah, K. V. Rashmi

    Abstract: In distributed storage systems, storage nodes intermittently go offline for numerous reasons. On coming back online, nodes need to update their contents to reflect any modifications to the data in the interim. In this paper, we consider a setting where no information regarding modified data needs to be logged in the system. In such a setting, a 'stale' node needs to update its contents by download…

    Submitted 5 September, 2014; originally announced September 2014.

    Comments: IEEE Global Communications Conference (GLOBECOM) 2014

  3. arXiv:1408.1387  [pdf, other]

    cs.GT cs.HC cs.LG

    Double or Nothing: Multiplicative Incentive Mechanisms for Crowdsourcing

    Authors: Nihar B. Shah, Dengyong Zhou

    Abstract: Crowdsourcing has gained immense popularity in machine learning applications for obtaining large amounts of labeled data. Crowdsourcing is cheap and fast, but suffers from the problem of low-quality data. To address this fundamental challenge in crowdsourcing, we propose a simple payment mechanism to incentivize workers to answer only the questions that they are sure of and skip the rest. We show…

    Submitted 16 December, 2015; v1 submitted 6 August, 2014; originally announced August 2014.

  4. arXiv:1406.6618  [pdf, other]

    stat.ML cs.LG

    When is it Better to Compare than to Score?

    Authors: Nihar B. Shah, Sivaraman Balakrishnan, Joseph Bradley, Abhay Parekh, Kannan Ramchandran, Martin Wainwright

    Abstract: When eliciting judgements from humans for an unknown quantity, one often has the choice of making direct-scoring (cardinal) or comparative (ordinal) measurements. In this paper we study the relative merits of either choice, providing empirical and theoretical guidelines for the selection of a measurement scheme. We provide empirical evidence based on experiments on Amazon Mechanical Turk that in a…

    Submitted 25 June, 2014; originally announced June 2014.

  5. arXiv:1311.2851  [pdf, other]

    cs.NI cs.DC cs.PF

    When Do Redundant Requests Reduce Latency ?

    Authors: Nihar B. Shah, Kangwook Lee, Kannan Ramchandran

    Abstract: Several systems possess the flexibility to serve requests in more than one way. For instance, a distributed storage system storing multiple replicas of the data can serve a request from any of the multiple servers that store the requested data, or a computational task may be performed in a compute-cluster by any one of multiple processors. In such systems, the latency of serving the requests may p…

    Submitted 6 November, 2013; originally announced November 2013.

    Comments: Extended version of paper presented at Allerton Conference 2013

  6. arXiv:1309.0186  [pdf, other]

    cs.NI cs.DC cs.IT

    A Solution to the Network Challenges of Data Recovery in Erasure-coded Distributed Storage Systems: A Study on the Facebook Warehouse Cluster

    Authors: K. V. Rashmi, Nihar B. Shah, Dikang Gu, Hairong Kuang, Dhruba Borthakur, Kannan Ramchandran

    Abstract: Erasure codes, such as Reed-Solomon (RS) codes, are being increasingly employed in data centers to combat the cost of reliably storing large amounts of data. Although these codes provide optimal storage efficiency, they require significantly high network and disk usage during recovery of missing data. In this paper, we first present a study on the impact of recovery operations of erasure-coded dat…

    Submitted 1 September, 2013; originally announced September 2013.

    Comments: In proceedings of USENIX HotStorage, San Jose, June 2013

  7. arXiv:1302.5872  [pdf, other]

    cs.IT cs.DC cs.NI

    A Piggybacking Design Framework for Read-and Download-efficient Distributed Storage Codes

    Authors: K. V. Rashmi, Nihar B. Shah, Kannan Ramchandran

    Abstract: We present a new 'piggybacking' framework for designing distributed storage codes that are efficient in data-read and download required during node-repair. We illustrate the power of this framework by constructing classes of explicit codes that entail the smallest data-read and download for repair among all existing solutions for three important settings: (a) codes meeting the constraints of being…

    Submitted 24 February, 2013; originally announced February 2013.

    Comments: Extended version of ISIT 2013 submission

  8. On Minimizing Data-read and Download for Storage-Node Recovery

    Authors: Nihar B. Shah

    Abstract: We consider the problem of efficient recovery of the data stored in any individual node of a distributed storage system, from the rest of the nodes. Applications include handling failures and degraded reads. We measure efficiency in terms of the amount of data-read and the download required. To minimize the download, we focus on the minimum bandwidth setting of the 'regenerating codes' model for d…

    Submitted 2 April, 2013; v1 submitted 31 December, 2012; originally announced December 2012.

    Comments: IEEE Communications Letters

  9. arXiv:1211.5405  [pdf, other]

    cs.IT cs.NI math.OC

    The MDS Queue: Analysing the Latency Performance of Erasure Codes

    Authors: Nihar B. Shah, Kangwook Lee, Kannan Ramchandran

    Abstract: In order to scale economically, data centers are increasingly evolving their data storage methods from the use of simple data replication to the use of more powerful erasure codes, which provide the same level of reliability as replication but at a significantly lower storage cost. In particular, it is well known that Maximum-Distance-Separable (MDS) codes, such as Reed-Solomon codes, provide the…

    Submitted 10 November, 2013; v1 submitted 22 November, 2012; originally announced November 2012.

  10. arXiv:1207.0120  [pdf, other]

    cs.CR cs.IT

    Distributed Secret Dissemination Across a Network

    Authors: Nihar B. Shah, K. V. Rashmi, Kannan Ramchandran

    Abstract: Shamir's (n, k) threshold secret sharing is an important component of several cryptographic protocols, such as those for secure multiparty-computation and key management. These protocols typically assume the presence of direct communication links from the dealer to all participants, in which case the dealer can directly pass the shares of the secret to each participant. In this paper, we consider…

    Submitted 22 October, 2014; v1 submitted 30 June, 2012; originally announced July 2012.

    Comments: Extended version of a paper presented at the International Symposium on Information Theory (ISIT) 2013

  11. arXiv:1202.1050  [pdf, other]

    cs.IT cs.DC cs.NI

    Regenerating Codes for Errors and Erasures in Distributed Storage

    Authors: K. V. Rashmi, Nihar B. Shah, Kannan Ramchandran, P. Vijay Kumar

    Abstract: Regenerating codes are a class of codes proposed for providing reliability of data and efficient repair of failed nodes in distributed storage systems. In this paper, we address the fundamental problem of handling errors and erasures during the data-reconstruction and node-repair operations. We provide explicit regenerating codes that are resilient to errors and erasures, and show that these codes…

    Submitted 23 May, 2012; v1 submitted 6 February, 2012; originally announced February 2012.

    Comments: ISIT 2012

  12. arXiv:1107.5279  [pdf, ps, other]

    cs.IT cs.DC cs.NI

    Information-theoretically Secure Regenerating Codes for Distributed Storage

    Authors: Nihar B. Shah, K. V. Rashmi, P. Vijay Kumar

    Abstract: Regenerating codes are a class of codes for distributed storage networks that provide reliability and availability of data, and also perform efficient node repair. Another important aspect of a distributed storage network is its security. In this paper, we consider a threat model where an eavesdropper may gain access to the data stored in a subset of the storage nodes, and possibly also, to the da…

    Submitted 26 July, 2011; originally announced July 2011.

    Comments: Globecom 2011

  13. arXiv:1101.0133  [pdf, other]

    cs.IT cs.DC cs.NI

    Enabling Node Repair in Any Erasure Code for Distributed Storage

    Authors: K. V. Rashmi, Nihar B. Shah, P. Vijay Kumar

    Abstract: Erasure codes are an efficient means of storing data across a network in comparison to data replication, as they tend to reduce the amount of data stored in the network and offer increased resilience in the presence of node failures. The codes perform poorly though, when repair of a failed node is called for, as they typically require the entire file to be downloaded to repair a failed node. A new…

    Submitted 30 June, 2011; v1 submitted 30 December, 2010; originally announced January 2011.

    Comments: IEEE International Symposium on Information Theory (ISIT) 2011 (to be presented)

  14. arXiv:1011.2361  [pdf, other]

    cs.IT cs.DC cs.NI

    Distributed Storage Codes with Repair-by-Transfer and Non-achievability of Interior Points on the Storage-Bandwidth Tradeoff

    Authors: Nihar B. Shah, K. V. Rashmi, P. Vijay Kumar, Kannan Ramchandran

    Abstract: Regenerating codes are a class of recently developed codes for distributed storage that, like Reed-Solomon codes, permit data recovery from any subset of k nodes within the n-node network. However, regenerating codes possess in addition, the ability to repair a failed node by connecting to an arbitrary subset of d nodes. It has been shown that for the case of functional-repair, there is a tradeoff…

    Submitted 16 November, 2010; v1 submitted 10 November, 2010; originally announced November 2010.

    Comments: 30 pages, 6 figures. Submitted to IEEE Transactions on Information Theory

  15. arXiv:1005.4178  [pdf, other]

    cs.IT cs.DC cs.NI

    Optimal Exact-Regenerating Codes for Distributed Storage at the MSR and MBR Points via a Product-Matrix Construction

    Authors: K. V. Rashmi, Nihar B. Shah, P. Vijay Kumar

    Abstract: Regenerating codes are a class of distributed storage codes that optimally trade the bandwidth needed for repair of a failed node with the amount of data stored per node of the network. Minimum Storage Regenerating (MSR) codes minimize first, the amount of data stored per node, and then the repair bandwidth, while Minimum Bandwidth Regenerating (MBR) codes carry out the minimization in the reverse…

    Submitted 20 January, 2011; v1 submitted 23 May, 2010; originally announced May 2010.

    Comments: Submitted to IEEE Transactions on Information Theory. Contains 20 pages, 2 figures

    Journal ref: IEEE Transactions on Information Theory, vol. 57, no. 8, pp. 5227 - 5239, August 2011

  16. arXiv:1005.1634  [pdf, other]

    cs.IT cs.DC cs.NI

    Interference Alignment in Regenerating Codes for Distributed Storage: Necessity and Code Constructions

    Authors: Nihar B. Shah, K. V. Rashmi, P. Vijay Kumar, Kannan Ramchandran

    Abstract: Regenerating codes are a class of recently developed codes for distributed storage that, like Reed-Solomon codes, permit data recovery from any arbitrary k of n nodes. However regenerating codes possess in addition, the ability to repair a failed node by connecting to any arbitrary d nodes and downloading an amount of data that is typically far less than the size of the data file. This amount of d…

    Submitted 13 September, 2010; v1 submitted 10 May, 2010; originally announced May 2010.

    Comments: 38 pages, 12 figures, submitted to the IEEE Transactions on Information Theory;v3 - The title has been modified to better reflect the contributions of the submission. The paper is extensively revised with several carefully constructed figures and examples

  17. arXiv:0908.2984  [pdf, other]

    cs.IT

    Explicit Codes Minimizing Repair Bandwidth for Distributed Storage

    Authors: Nihar B. Shah, K. V. Rashmi, P. Vijay Kumar, Kannan Ramchandran

    Abstract: We consider the setting of data storage across n nodes in a distributed manner. A data collector (DC) should be able to reconstruct the entire data by connecting to any k out of the n nodes and downloading all the data stored in them. When a node fails, it has to be regenerated back using the existing nodes. In a recent paper, Wu et al. have obtained an information theoretic lower bound for the…

    Submitted 5 September, 2009; v1 submitted 20 August, 2009; originally announced August 2009.

    Comments: 11 pages, 4 figures v2: corrected typos

  18. arXiv:0906.4913  [pdf, other]

    cs.IT

    Explicit Construction of Optimal Exact Regenerating Codes for Distributed Storage

    Authors: K. V. Rashmi, Nihar B. Shah, P. Vijay Kumar, Kannan Ramchandran

    Abstract: Erasure coding techniques are used to increase the reliability of distributed storage systems while minimizing storage overhead. Also of interest is minimization of the bandwidth required to repair the system following a node failure. In a recent paper, Wu et al. characterize the tradeoff between the repair bandwidth and the amount of data stored per node. They also prove the existence of regene…

    Submitted 6 October, 2009; v1 submitted 26 June, 2009; originally announced June 2009.

    Comments: 7 pages, 2 figures, in the Proceedings of Allerton Conference on Communication, Control and Computing, September 2009