Showing 1–11 of 11 results for author: Andersen, D G

  1. arXiv:2003.02391  [pdf, other]

    cs.DB

    Order-Preserving Key Compression for In-Memory Search Trees

    Authors: Huanchen Zhang, Xiaoxuan Liu, David G. Andersen, Michael Kaminsky, Kimberly Keeton, Andrew Pavlo

    Abstract: We present the High-speed Order-Preserving Encoder (HOPE) for in-memory search trees. HOPE is a fast dictionary-based compressor that encodes arbitrary keys while preserving their order. HOPE's approach is to identify common key patterns at a fine granularity and exploit the entropy to achieve high compression rates with a small dictionary. We first develop a theoretical model to reason about orde…

    Submitted 4 March, 2020; originally announced March 2020.

    Comments: SIGMOD'20 version + Appendix

  2. arXiv:1910.00762  [pdf, other]

    cs.LG stat.ML

    Accelerating Deep Learning by Focusing on the Biggest Losers

    Authors: Angela H. Jiang, Daniel L.-K. Wong, Giulio Zhou, David G. Andersen, Jeffrey Dean, Gregory R. Ganger, Gauri Joshi, Michael Kaminsky, Michael Kozuch, Zachary C. Lipton, Padmanabhan Pillai

    Abstract: This paper introduces Selective-Backprop, a technique that accelerates the training of deep neural networks (DNNs) by prioritizing examples with high loss at each iteration. Selective-Backprop uses the output of a training example's forward pass to decide whether to use that example to compute gradients and update parameters, or to skip immediately to the next example. By reducing the number of co…

    Submitted 1 October, 2019; originally announced October 2019.

  3. arXiv:1905.13536  [pdf, other]

    cs.CV cs.LG cs.PF eess.IV stat.ML

    Scaling Video Analytics on Constrained Edge Nodes

    Authors: Christopher Canel, Thomas Kim, Giulio Zhou, Conglong Li, Hyeontaek Lim, David G. Andersen, Michael Kaminsky, Subramanya R. Dulloor

    Abstract: As video camera deployments continue to grow, the need to process large volumes of real-time data strains wide area network infrastructure. When per-camera bandwidth is limited, it is infeasible for applications such as traffic monitoring and pedestrian tracking to offload high-quality video streams to a datacenter. This paper presents FilterForward, a new edge-to-cloud system that enables datacen…

    Submitted 24 May, 2019; originally announced May 2019.

    Comments: This paper is an extended version of a paper with the same title published in the 2nd SysML Conference, SysML '19 (Canel et al., 2019)

  4. arXiv:1904.03257  [pdf, ps, other]

    cs.LG cs.DB cs.DC cs.SE stat.ML

    MLSys: The New Frontier of Machine Learning Systems

    Authors: Alexander Ratner, Dan Alistarh, Gustavo Alonso, David G. Andersen, Peter Bailis, Sarah Bird, Nicholas Carlini, Bryan Catanzaro, Jennifer Chayes, Eric Chung, Bill Dally, Jeff Dean, Inderjit S. Dhillon, Alexandros Dimakis, Pradeep Dubey, Charles Elkan, Grigori Fursin, Gregory R. Ganger, Lise Getoor, Phillip B. Gibbons, Garth A. Gibson, Joseph E. Gonzalez, Justin Gottschlich, Song Han, Kim Hazelwood, et al. (44 additional authors not shown)

    Abstract: Machine learning (ML) techniques are enjoying rapidly increasing adoption. However, designing and implementing the systems that support ML models in real-world deployments remains a significant obstacle, in large part due to the radically different development and deployment profile of modern ML methods, and the range of practical concerns that come with broader adoption. We propose to foster a ne…

    Submitted 1 December, 2019; v1 submitted 29 March, 2019; originally announced April 2019.

  5. arXiv:1812.03626  [pdf, other]

    cs.CV

    EDF: Ensemble, Distill, and Fuse for Easy Video Labeling

    Authors: Giulio Zhou, Subramanya Dulloor, David G. Andersen, Michael Kaminsky

    Abstract: We present a way to rapidly bootstrap object detection on unseen videos using minimal human annotations. We accomplish this by combining two complementary sources of knowledge (one generic and the other specific) using bounding box merging and model distillation. The first (generic) knowledge source is obtained from ensembling pre-trained object detectors using a novel bounding box merging and con…

    Submitted 10 December, 2018; originally announced December 2018.

  6. arXiv:1806.00680  [pdf, other]

    cs.OS

    Datacenter RPCs can be General and Fast

    Authors: Anuj Kalia, Michael Kaminsky, David G. Andersen

    Abstract: It is commonly believed that datacenter networking software must sacrifice generality to attain high performance. The popularity of specialized distributed systems designed specifically for niche technologies such as RDMA, lossless networks, FPGAs, and programmable switches testifies to this belief. In this paper, we show that such specialization is not necessary. eRPC is a new general-purpose rem…

    Submitted 14 January, 2019; v1 submitted 2 June, 2018; originally announced June 2018.

    Comments: Updated to NSDI 2019 version

  7. arXiv:1802.07389  [pdf, other]

    cs.LG cs.DC stat.ML

    3LC: Lightweight and Effective Traffic Compression for Distributed Machine Learning

    Authors: Hyeontaek Lim, David G. Andersen, Michael Kaminsky

    Abstract: The performance and efficiency of distributed machine learning (ML) depends significantly on how long it takes for nodes to exchange state changes. Overly-aggressive attempts to reduce communication often sacrifice final model accuracy and necessitate additional ML techniques to compensate for this loss, limiting their generality. Some attempts to reduce communication incur high computation overhe…

    Submitted 20 February, 2018; originally announced February 2018.

  8. arXiv:1610.06918  [pdf, other]

    cs.CR cs.LG

    Learning to Protect Communications with Adversarial Neural Cryptography

    Authors: Martín Abadi, David G. Andersen

    Abstract: We ask whether neural networks can learn to use secret keys to protect information from other neural networks. Specifically, we focus on ensuring confidentiality properties in a multiagent system, and we specify those properties in terms of an adversary. Thus, a system may consist of neural networks named Alice and Bob, and we aim to limit what a third neural network named Eve learns from eavesdro…

    Submitted 21 October, 2016; originally announced October 2016.

    Comments: 15 pages

  9. arXiv:1603.04387  [pdf, other]

    cs.NI

    NetMemex: Providing Full-Fidelity Traffic Archival

    Authors: Hyeontaek Lim, Vyas Sekar, Yoshihisa Abe, David G. Andersen

    Abstract: NetMemex explores efficient network traffic archival without any loss of information. Unlike NetFlow-like aggregation, NetMemex allows retrieving the entire packet data including full payload, which makes it useful in forensic analysis, networked and distributed system research, and network administration. Different from packet trace dumps, NetMemex performs sophisticated data compression for smal…

    Submitted 14 March, 2016; originally announced March 2016.

    Comments: A reformatted version of the ACM SIGCOMM 2013 submission

  10. arXiv:1505.04636  [pdf, other]

    cs.DC cs.AI cs.LG

    Graph Partitioning via Parallel Submodular Approximation to Accelerate Distributed Machine Learning

    Authors: Mu Li, Dave G. Andersen, Alexander J. Smola

    Abstract: Distributed computing excels at processing large scale data, but the communication cost for synchronizing the shared parameters may slow down the overall performance. Fortunately, the interactions between parameter and data in many problems are sparse, which admits efficient partition in order to reduce the communication overhead. In this paper, we formulate data placement as a graph partitionin…

    Submitted 18 May, 2015; originally announced May 2015.

    ACM Class: I.2.11; I.5.1; G.1.6

  11. arXiv:cs/0104012  [pdf, ps, other]

    cs.NI cs.OS

    System Support for Bandwidth Management and Content Adaptation in Internet Applications

    Authors: David G. Andersen, Deepak Bansal, Dorothy Curtis, Srinivasan Seshan, Hari Balakrishnan

    Abstract: This paper describes the implementation and evaluation of an operating system module, the Congestion Manager (CM), which provides integrated network flow management and exports a convenient programming interface that allows applications to be notified of, and adapt to, changing network conditions. We describe the API by which applications interface with the CM, and the architectural consideratio…

    Submitted 7 April, 2001; originally announced April 2001.

    Comments: 14 pages, appeared in OSDI 2000

    ACM Class: D.4.4

    Journal ref: Proc. OSDI 2000