Search | arXiv e-print repository

arXiv:2405.20982 [pdf, other]

Scaling Data Plane Verification with Intent-based Slicing

Authors: Kuan-Yen Chou, Santhosh Prabhu, Giri Subramanian, Wenxuan Zhou, Aanand Nayyar, Brighten Godfrey, Matthew Caesar

Abstract: Data plane verification has grown into a powerful tool to ensure network correctness. However, existing monolithic data plane models have high memory requirements with large networks, and the existing method of scaling out is too limited in expressiveness to capture practical network features. In this paper, we describe Scylla, a general data plane verifier that provides fine-grained scale-out without the need for a monolithic network model. Scylla creates models for what we call intent-based slices, each of which is constructed at a fine (rule-level) granularity with just enough to verify a given set of intents. The sliced models are retained in memory across a cluster and are incrementally updated in a distributed compute cluster in response to network updates. Our experiments show that Scylla makes the scaling problem more granular -- tied to the size of the intent-based slices rather than that of the overall network. This enables Scylla to verify large, complex networks in minimum units of work that are significantly smaller (in both memory and time) than past techniques, enabling fast scale-out verification with minimal resource requirement.

Submitted 31 May, 2024; originally announced May 2024.

arXiv:2405.20444 [pdf, other]

New Limit on Dark Photon Kinetic Mixing in the 0.2-1.2 $\boldsymbol{μ}$eV Mass Range From the Dark E-Field Radio Experiment

Authors: Joseph Levine, Benjamin Godfrey, J. Anthony Tyson, S. Mani Tripathi, Daniel Polin, Amin Aminaei, Brian H. Kolner, Paul Stucky

Abstract: We report new limits on the kinetic mixing strength of the dark photon spanning the mass range 0.21 -- 1.24 $μ$eV corresponding to a frequency span of 50 -- 300 MHz. The Dark E-Field Radio experiment is a wide-band search for dark photon dark matter. In this paper we detail changes in calibration and upgrades since our proof-of-concept pilot run. Our detector employs a wide bandwidth E-field antenna moved to multiple positions in a shielded room, a low noise amplifier, wideband ADC, followed by a $2^{24}$-point FFT. An optimal filter searches for signals with $Q \approx 10^6$. In nine days of integration, this system is capable of detecting dark photon signals corresponding to $ε$ several orders of magnitude lower than previous limits. We find a 95% exclusion limit on $ε$ over this mass range between $6\times 10^{-15}$ and $6\times 10^{-13}$, tracking the complex resonant mode structure in the shielded room.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 15 pages, 12 figures, submitted to PRD
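
The correspondence between the mass range and the frequency span quoted in the abstract above follows from the Planck relation f = E / h. The following is a minimal sketch for checking that arithmetic, not code from the paper; the function name `mass_to_frequency_mhz` is hypothetical, and the only physical input is the Planck constant expressed in eV·s.

```python
# Planck constant in eV*s (exact by definition since the 2019 SI redefinition,
# shown here rounded to 10 significant figures).
PLANCK_EV_S = 4.135667696e-15


def mass_to_frequency_mhz(mass_uev: float) -> float:
    """Convert a rest-mass energy in micro-eV to a frequency in MHz via f = E / h.

    Hypothetical helper for illustration only; it is not part of the paper.
    """
    energy_ev = mass_uev * 1e-6          # micro-eV -> eV
    frequency_hz = energy_ev / PLANCK_EV_S
    return frequency_hz / 1e6            # Hz -> MHz


if __name__ == "__main__":
    # Endpoints of the mass range quoted in the abstract.
    for mass in (0.21, 1.24):
        print(f"{mass} ueV -> {mass_to_frequency_mhz(mass):.1f} MHz")
```

Under these assumptions the two endpoints come out at roughly 51 MHz and 300 MHz, consistent with the 50 -- 300 MHz span stated in the abstract.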

arXiv:2312.07813 [pdf, other]

On a Foundation Model for Operating Systems

Authors: Divyanshu Saxena, Nihal Sharma, Donghyun Kim, Rohit Dwivedula, Jiayi Chen, Chenxi Yang, Sriram Ravula, Zichao Hu, Aditya Akella, Sebastian Angel, Joydeep Biswas, Swarat Chaudhuri, Isil Dillig, Alex Dimakis, P. Brighten Godfrey, Daehyeok Kim, Chris Rossbach, Gang Wang

Abstract: This paper lays down the research agenda for a domain-specific foundation model for operating systems (OSes). Our case for a foundation model revolves around the observations that several OS components such as CPU, memory, and network subsystems are interrelated and that OS traces offer the ideal dataset for a foundation model to grasp the intricacies of diverse OS components and their behavior in varying environments and workloads. We discuss a wide range of possibilities that then arise, from employing foundation models as policy agents to utilizing them as generators and predictors to assist traditional OS control algorithms. Our hope is that this paper spurs further research into OS foundation models and creating the next generation of operating systems for the evolving computing landscape.

Submitted 12 December, 2023; originally announced December 2023.

Comments: Machine Learning for Systems Workshop at 37th NeurIPS Conference, 2023, New Orleans, LA, USA

arXiv:2311.02800 [pdf, other]

Kivi: Verification for Cluster Management

Authors: Bingzhe Liu, Gangmuk Lim, Ryan Beckett, P. Brighten Godfrey

Abstract: Modern cloud infrastructure is powered by cluster management systems such as Kubernetes and Docker Swarm. While these systems seek to minimize users' operational burden, the complex, dynamic, and non-deterministic nature of these systems makes them hard to reason about, potentially leading to failures ranging from performance degradation to outages. We present Kivi, the first system for verifying controllers and their configurations in cluster management systems. Kivi focuses on the popular system Kubernetes, and models its controllers and events into processes whereby their interleavings are exhaustively checked via model checking. Central to handling autoscaling and large-scale deployments is our design that seeks to find violations in a smaller and reduced topology. We also develop several model optimizations in Kivi to scale to large clusters. We show that Kivi is effective and accurate in finding issues in realistic and complex scenarios and showcase two new issues in Kubernetes controller source code.

Submitted 5 November, 2023; originally announced November 2023.

Comments: 18 pages

arXiv:2305.03348 [pdf, other]

Flock: Accurate network fault localization at scale

Authors: Vipul Harsh, Tong Meng, Kapil Agrawal, P. Brighten Godfrey

Abstract: Inferring the root cause of failures among thousands of components in a data center network is challenging, especially for "gray" failures that are not reported directly by switches. Faults can be localized through end-to-end measurements, but past localization schemes are either too slow for large-scale networks or sacrifice accuracy. We describe Flock, a network fault localization algorithm and system that achieves both high accuracy and speed at datacenter scale. Flock uses a probabilistic graphical model (PGM) to achieve high accuracy, coupled with new techniques to dramatically accelerate inference in discrete-valued Bayesian PGMs. Large-scale simulations and experiments in a hardware testbed show Flock speeds up inference by >10000x compared to past PGM methods, and improves accuracy over the best previous datacenter fault localization approaches, reducing inference error by 1.19-11x on the same input telemetry, and by 1.2-55x after incorporating passive telemetry. We also prove Flock's inference is optimal in restricted settings.

Submitted 5 May, 2023; originally announced May 2023.

Comments: To appear in ACM PACMNET, Vol 1, June 2023

arXiv:2303.13850 [pdf, other]

Towards Learning and Explaining Indirect Causal Effects in Neural Networks

Authors: Abbavaram Gowtham Reddy, Saketh Bachu, Harsharaj Pathak, Benin L Godfrey, Vineeth N. Balasubramanian, Varshaneya V, Satya Narayanan Kar

Abstract: Recently, there has been a growing interest in learning and explaining causal effects within Neural Network (NN) models. By virtue of NN architectures, previous approaches consider only direct and total causal effects assuming independence among input variables. We view an NN as a structural causal model (SCM) and extend our focus to include indirect causal effects by introducing feedforward connections among input neurons. We propose an ante-hoc method that captures and maintains direct, indirect, and total causal effects during NN model training. We also propose an algorithm for quantifying learned causal effects in an NN model and efficient approximation strategies for quantifying causal effects in high-dimensional data. Extensive experiments conducted on synthetic and real-world datasets demonstrate that the causal effects learned by our ante-hoc method better approximate the ground truth effects compared to existing methods.

Submitted 8 January, 2024; v1 submitted 24 March, 2023; originally announced March 2023.

Comments: AAAI 2024

arXiv:2209.13979 [pdf, other]

doi 10.1088/1748-0221/18/01/P01020

Studies in Pulse Shape Discrimination for an Optimized ASIC Design

Authors: B. Boxer, B. Godfrey, C. Grace, J. Johnson, R. Khandwala, M. Tripathi

Abstract: The continued advancements of Silicon Photomultipliers (SiPMs) have made them viable photosensors for low recoil energy Pulse Shape Discrimination (PSD) between fast neutron and gamma interactions when coupled to an appropriate scintillator. At the same time, the large number of channels in a typical array calls for the development of low-cost and low-power electronics. A custom integrated circuit (ASIC) is an ideal solution for this purpose. To assess the requirements for such an ASIC, studies were performed using two scintillators, Stilbene and EJ-276, coupled to a 6 x 6 mm SiPM from Onsemi. We demonstrate that both scintillators are viable for performing PSD for interaction energies from 100 keV to several MeV while optimizing the integration periods used in the PSD metric. These measurements inform the design parameters of the ASIC under development.

Submitted 21 December, 2022; v1 submitted 28 September, 2022; originally announced September 2022.

arXiv:2207.13280 [pdf, other]

On-Device CPU Scheduling for Sense-React Systems

Authors: Aditi Partap, Samuel Grayson, Muhammad Huzaifa, Sarita Adve, Brighten Godfrey, Saurabh Gupta, Kris Hauser, Radhika Mittal

Abstract: Sense-react systems (e.g. robotics and AR/VR) have to take highly responsive real-time actions, driven by complex decisions involving a pipeline of sensing, perception, planning, and reaction tasks. These tasks must be scheduled on resource-constrained devices such that the performance goals and the requirements of the application are met. This is a difficult scheduling problem that requires handling multiple scheduling dimensions, and variations in resource usage and availability. In practice, system designers manually tune parameters for their specific hardware and application, which results in poor generalization and increases the development burden. In this work, we highlight the emerging need for scheduling CPU resources at runtime in sense-react systems. We study three canonical applications (face tracking, robot navigation, and VR) to first understand the key scheduling requirements for such systems. Armed with this understanding, we develop a scheduling framework, Catan, that dynamically schedules compute resources across different components of an app so as to meet the specified application requirements. Through experiments with a prototype implemented on a widely-used robotics framework (ROS) and an open-source AR/VR platform, we show the impact of system scheduling on meeting the performance goals for the three applications, how Catan is able to achieve better application performance than hand-tuned configurations, and how it dynamically adapts to runtime variations.

Submitted 14 August, 2022; v1 submitted 27 July, 2022; originally announced July 2022.

Comments: 13 pages, 13 figures. This version of the paper extends a shorter version that has been accepted at IROS'22

arXiv:2101.02805 [pdf, other]

doi 10.1103/PhysRevD.104.012013

Search for Dark Photon Dark Matter: Dark E-Field Radio Pilot Experiment

Authors: Benjamin Godfrey, J. Anthony Tyson, Seth Hillbrand, Jon Balajthy, Daniel Polin, S. Mani Tripathi, Shelby Klomp, Joseph Levine, Nate MacFadden, Brian H. Kolner, Molly R. Smith, Paul Stucky, Arran Phipps, Peter Graham, Kent Irwin

Abstract: We are building an experiment to search for dark matter in the form of dark photons in the nano- to milli-eV mass range. This experiment is the electromagnetic dual of magnetic detector dark radio experiments. It is also a frequency-time dual experiment in two ways: We search for a high-$Q$ signal in wide-band data rather than tuning a high-$Q$ resonator, and we measure electric rather than magnetic fields. In this paper we describe a pilot experiment using room temperature electronics which demonstrates feasibility and sets useful limits to the kinetic coupling $ε\sim 10^{-12}$ over 50--300 MHz. With a factor of 2000 increase in real-time spectral coverage, and lower system noise temperature, it will soon be possible to search a wide range of masses at 100 times this sensitivity. We describe the planned experiment in two phases: Phase-I will implement a wide band, 5-million channel, real-time FFT processor over the 30--300 MHz range with a back-end time-domain optimal filter to search for the predicted $Q\sim 10^6$ line using low-noise amplifiers. We have completed spot frequency calibrations using a biconical dipole antenna in a shielded room that extrapolate to a $5σ$ limit of $ε\sim 10^{-13}$ for the coupling from the dark field, per month of integration. Phase-II will extend the search to 20 GHz using cryogenic preamplifiers and new antennas.

Submitted 17 November, 2021; v1 submitted 7 January, 2021; originally announced January 2021.

Comments: 11 pages, 13 figures. Updated to published version. Corrected minor error in Fig 12 x-axis; results unchanged

Journal ref: Phys. Rev. D 104, 012013 (2021)

arXiv:2004.14020 [pdf, other]

Caramel: Accelerating Decentralized Distributed Deep Learning with Computation Scheduling

Authors: Sayed Hadi Hashemi, Sangeetha Abdu Jyothi, Brighten Godfrey, Roy Campbell

Abstract: The method of choice for parameter aggregation in Deep Neural Network (DNN) training, a network-intensive task, is shifting from the Parameter Server model to decentralized aggregation schemes (AllReduce) inspired by theoretical guarantees of better performance. However, current implementations of AllReduce overlook the interdependence of communication and computation, resulting in significant performance degradation. In this paper, we develop Caramel, a system that accelerates decentralized distributed deep learning through model-aware computation scheduling and communication optimizations for AllReduce. Caramel achieves this goal through (a) computation DAG scheduling that expands the feasible window of transfer for each parameter (transfer boundaries), and (b) network optimizations for smoothening of the load including adaptive batching and pipelining of parameter transfers. Caramel maintains the correctness of the dataflow model, is hardware-independent, and does not require any user-level or framework-level changes. We implement Caramel over TensorFlow and show that the iteration time of DNN training can be improved by up to 3.62x in a cloud environment.

Submitted 29 April, 2020; originally announced April 2020.

arXiv:1911.02128 [pdf, other]

Plankton: Scalable network configuration verification through model checking

Authors: Santhosh Prabhu, Kuan-Yen Chou, Ali Kheradmand, P. Brighten Godfrey, Matthew Caesar

Abstract: Network configuration verification enables operators to ensure that the network will behave as intended, prior to deployment of their configurations. Although techniques ranging from graph algorithms to SMT solvers have been proposed, scalable configuration verification with sufficient protocol support continues to be a challenge. In this paper, we show that by combining equivalence partitioning with explicit-state model checking, network configuration verification can be scaled significantly better than the state of the art, while still supporting a rich set of protocol features. We propose Plankton, which uses symbolic partitioning to manage large header spaces and efficient model checking to exhaustively explore protocol behavior. Thanks to a highly effective suite of optimizations including state hashing, partial order reduction, and policy-based pruning, Plankton successfully verifies policies in industrial-scale networks quickly and compactly, at times reaching a 10000$\times$ speedup compared to the state of the art.

Submitted 5 November, 2019; originally announced November 2019.

Comments: NSDI'20

arXiv:1908.04852 [pdf]

Forecasting U.S. Textile Comparative Advantage Using Autoregressive Integrated Moving Average Models and Time Series Outlier Analysis

Authors: Zahra Saki, Lori Rothenberg, Marguerite Moor, Ivan Kandilov, A. Blanton Godfrey

Abstract: To establish an updated understanding of the U.S. textile and apparel (TAP) industry's competitive position within the global textile environment, trade data from UN-COMTRADE (1996-2016) was used to calculate the Normalized Revealed Comparative Advantage (NRCA) index for 169 TAP categories at the four-digit Harmonized Schedule (HS) code level. Univariate time series using Autoregressive Integrated Moving Average (ARIMA) models forecast short-term future performance of Revealed categories with export advantage. Accompanying outlier analysis examined permanent level shifts that might convey important information about policy changes, influential drivers and random events.

Submitted 13 August, 2019; originally announced August 2019.

Comments: 11 pages, 1 figure and 9 tables

Journal ref: 2018 Joint Statistical Meeting, 1996-2006

arXiv:1811.10737 [pdf, other]

Dissecting Latency in the Internet's Fiber Infrastructure

Authors: Ilker Nadi Bozkurt, Waqar Aqeel, Debopam Bhattacherjee, Balakrishnan Chandrasekaran, Philip Brighten Godfrey, Gregory Laughlin, Bruce M. Maggs, Ankit Singla

Abstract: The recent publication of the 'InterTubes' map of long-haul fiber-optic cables in the contiguous United States invites an exciting question: how much faster would the Internet be if routes were chosen to minimize latency? Previous measurement campaigns suggest the following rule of thumb for estimating Internet latency: multiply line-of-sight distance by 2.1, then divide by the speed of light in fiber. But a simple computation of shortest-path lengths through the conduits in the InterTubes map suggests that the conversion factor for all pairs of the 120 largest population centers in the U.S. could be reduced from 2.1 to 1.3, in the median, even using less than half of the links. To determine whether an overlay network could be used to provide shortest paths, and how well it would perform, we used the diverse server deployment of a CDN to measure latency across individual conduits. We were surprised to find, however, that latencies are sometimes much higher than would be predicted by conduit length alone. To understand why, we report findings from our analysis of network latency data from the backbones of two Tier-1 ISPs, two scientific and research networks, and the recently built fiber backbone of a CDN.

Submitted 26 November, 2018; originally announced November 2018.
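
The rule of thumb quoted in the abstract above (multiply line-of-sight distance by a conversion factor, then divide by the speed of light in fiber) can be sketched as follows. This is an illustrative calculation only, not code from the paper: the function name `estimated_latency_ms` and the 1000 km example distance are hypothetical, and the speed of light in fiber is assumed to be about two thirds of its vacuum value, a figure the abstract itself does not give.

```python
# Speed of light in vacuum, in km/s (exact by definition).
C_VACUUM_KM_PER_S = 299_792.458

# ASSUMPTION: light in optical fiber travels at roughly 2/3 of c.
# The abstract does not state the value it uses.
FIBER_SPEED_FRACTION = 2.0 / 3.0


def estimated_latency_ms(line_of_sight_km: float, factor: float = 2.1) -> float:
    """Apply the rule of thumb: latency = factor * distance / c_fiber.

    Hypothetical helper for illustration; returns milliseconds.
    """
    fiber_speed_km_per_s = C_VACUUM_KM_PER_S * FIBER_SPEED_FRACTION
    return factor * line_of_sight_km / fiber_speed_km_per_s * 1000.0


if __name__ == "__main__":
    distance_km = 1000.0  # hypothetical line-of-sight distance
    current = estimated_latency_ms(distance_km, factor=2.1)
    improved = estimated_latency_ms(distance_km, factor=1.3)
    print(f"factor 2.1: {current:.2f} ms, factor 1.3: {improved:.2f} ms")
    print(f"reduction: {(1 - improved / current) * 100:.0f}%")
```

Because the estimate is linear in the conversion factor, moving from 2.1 to 1.3 reduces it by the same proportion (about 38%) whatever distance or fiber speed is assumed.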

arXiv:1811.00212 [pdf, other]

Expander Datacenters: From Theory to Practice

Authors: Vipul Harsh, Sangeetha Abdu Jyothi, Inderdeep Singh, P. Brighten Godfrey

Abstract: Recent work has shown that expander-based data center topologies are robust and can yield superior performance over Clos topologies. However, to achieve these benefits, previous proposals use routing and transport schemes that impede quick industry adoption. In this paper, we examine if expanders can be effective for the technology and environments practical in today's data centers, including the use of traditional protocols, at both small and large scale while complying with common practices such as over-subscription. We study bandwidth, latency and burst tolerance of topologies, highlighting pitfalls of previous topology comparisons. We consider several other metrics of interest: packet loss during failures, queue occupancy and topology degradation. Our experiments show that expanders can realize 3x more throughput than an equivalent fat tree, and 1.5x more throughput than an equivalent leaf-spine topology, for a wide range of scenarios, with only traditional protocols. We observe that expanders achieve lower flow completion times, are more resilient to bursty load conditions like incast and outcast and degrade more gracefully with increasing load. Our results are based on extensive simulations and experiments on a hardware testbed with realistic topologies and real traffic patterns.

Submitted 31 October, 2018; originally announced November 2018.

Comments: 15 pages, 17 figures

arXiv:1810.08578 [pdf, other]

Leveraging Product as an Activation Function in Deep Networks

Authors: Luke B. Godfrey, Michael S. Gashler

Abstract: Product unit neural networks (PUNNs) are powerful representational models with a strong theoretical basis, but have proven to be difficult to train with gradient-based optimizers. We present windowed product unit neural networks (WPUNNs), a simple method of leveraging product as a nonlinearity in a neural network. Windowing the product tames the complex gradient surface and enables WPUNNs to learn effectively, solving the problems faced by PUNNs. WPUNNs use product layers between traditional sum layers, capturing the representational power of product units and using the product itself as a nonlinearity. We find that this method works as well as traditional nonlinearities like ReLU on the MNIST dataset. We demonstrate that WPUNNs can also generalize gated units in recurrent neural networks, yielding results comparable to LSTM networks.

Submitted 19 October, 2018; originally announced October 2018.

Comments: 6 pages, 3 figures, IEEE SMC 2018

arXiv:1810.03259 [pdf, other]

Internet Congestion Control via Deep Reinforcement Learning

Authors: Nathan Jay, Noga H. Rotman, P. Brighten Godfrey, Michael Schapira, Aviv Tamar

Abstract: We present and investigate a novel and timely application domain for deep reinforcement learning (RL): Internet congestion control. Congestion control is the core networking task of modulating traffic sources' data-transmission rates to efficiently utilize network capacity, and is the subject of extensive attention in light of the advent of Internet services such as live video, virtual reality, Internet-of-Things, and more. We show that casting congestion control as RL enables training deep network policies that capture intricate patterns in data traffic and network conditions, and leverage this to outperform the state-of-the-art. We also highlight significant challenges facing real-world adoption of RL-based congestion control, including fairness, safety, and generalization, which are not trivial to address within conventional RL formalism. To facilitate further research and reproducibility of our results, we present a test suite for RL-guided congestion control based on the OpenAI Gym interface.

Submitted 21 May, 2019; v1 submitted 7 October, 2018; originally announced October 2018.

Comments: 10 pages, accepted to ICML 2019

arXiv:1809.10897 [pdf, other]

cISP: A Speed-of-Light Internet Service Provider

Authors: Debopam Bhattacherjee, Sangeetha Abdu Jyothi, Ilker Nadi Bozkurt, Muhammad Tirmazi, Waqar Aqeel, Anthony Aguirre, Balakrishnan Chandrasekaran, P. Brighten Godfrey, Gregory P. Laughlin, Bruce M. Maggs, Ankit Singla

Abstract: Low latency is a requirement for a variety of interactive network applications. The Internet, however, is not optimized for latency. We thus explore the design of cost-effective wide-area networks that move data over paths very close to great-circle paths, at speeds very close to the speed of light in vacuum. Our cISP design augments the Internet's fiber with free-space wireless connectivity. cISP addresses the fundamental challenge of simultaneously providing low latency and scalable bandwidth, while accounting for numerous practical factors ranging from transmission tower availability to packet queuing. We show that instantiations of cISP across the contiguous United States and Europe would achieve mean latencies within 5% of that achievable using great-circle paths at the speed of light, over medium and long distances. Further, we estimate that the economic value from such networks would substantially exceed their expense.

Submitted 10 October, 2018; v1 submitted 28 September, 2018; originally announced September 2018.

arXiv:1708.08557 [pdf, other]

A parameterized activation function for learning fuzzy logic operations in deep neural networks

Authors: Luke B. Godfrey, Michael S. Gashler

Abstract: We present a deep learning architecture for learning fuzzy logic expressions. Our model uses an innovative, parameterized, differentiable activation function that can learn a number of logical operations by gradient descent. This activation function allows a neural network to determine the relationships between its input variables and provides insight into the logical significance of learned network parameters. We provide a theoretical basis for this parameterization and demonstrate its effectiveness and utility by successfully applying our model to five classification problems from the UCI Machine Learning Repository.

Submitted 11 September, 2017; v1 submitted 28 August, 2017; originally announced August 2017.

Comments: 6 pages, 3 figures, IEEE SMC 2017

arXiv:1706.05371 [pdf, other]

doi 10.1088/1748-0221/13/03/C03041

On the Evaluation of Silicon Photomultipliers for Use as Photosensors in Liquid Xenon Detectors

Authors: Benjamin Godfrey, Tyler Anderson, Earl Breedon, Jacob Cutter, Navneet Dhaliwal, Olivia Dalager, Seth Hillbrand, Michael Irving, Aaron Manalaysay, Juan Montoya, James Morad, Christian Neher, Dustin Stolp, Mani Tripathi, Ryan Wilson

Abstract: Silicon photomultipliers (SiPMs) are potential solid-state alternatives to traditional photomultiplier tubes (PMTs) for single-photon detection. In this paper, we report on evaluating SensL MicroFC-10035-SMT SiPMs for their suitability as PMT replacements. The devices were successfully operated in a liquid-xenon detector, which demonstrates that SiPMs can be used in noble element time projection chambers as photosensors. The devices were also cooled down to 170 K to observe dark count dependence on temperature. No dependencies on the direction of an applied 3.2 kV/cm electric field were observed with respect to dark-count rate, gain, or photon detection efficiency.

Submitted 3 April, 2018; v1 submitted 16 June, 2017; originally announced June 2017.

Comments: This is an author-created, un-copyedited version of an article accepted for publication/published in Journal of Instrumentation. IOP Publishing Ltd is not responsible for any errors or omissions in this version of the manuscript or any version derived from it. The Version of Record is available online at https://doi.org/10.1088/1748-0221/13/03/C03041

Journal ref: B. Godfrey, T. Anderson, E. Breedon, J. Cutter, N. Dhaliwal, O. Dalager et al., 2018 On the Evaluation of Silicon Photomultipliers for Use as Photosensors in Liquid Xenon Detectors JINST 13 C03041

arXiv:1705.09137 [pdf, other]

doi 10.1109/TNNLS.2017.2709324

Neural Decomposition of Time-Series Data for Effective Generalization

Authors: Luke B. Godfrey, Michael S. Gashler

Abstract: We present a neural network technique for the analysis and extrapolation of time-series data called Neural Decomposition (ND). Units with a sinusoidal activation function are used to perform a Fourier-like decomposition of training samples into a sum of sinusoids, augmented by units with nonperiodic activation functions to capture linear trends and other nonperiodic components. We show how careful weight initialization can be combined with regularization to form a simple model that generalizes well. Our method generalizes effectively on the Mackey-Glass series, a dataset of unemployment rates as reported by the U.S. Department of Labor Statistics, a time-series of monthly international airline passengers, the monthly ozone concentration in downtown Los Angeles, and an unevenly sampled time-series of oxygen isotope measurements from a cave in north India. We find that ND outperforms popular time-series forecasting techniques including LSTM, echo state networks, ARIMA, SARIMA, SVR with a radial basis function, and Gashler and Ashmore's model.

Submitted 5 June, 2017; v1 submitted 25 May, 2017; originally announced May 2017.

Comments: 13 pages, 11 figures, IEEE TNNLS Preprint

Journal ref: IEEE Transactions on Neural Networks and Learning Systems 29.7 (2018) 2973-2985

arXiv:1608.00227 [pdf, other]

doi 10.1103/PhysRevE.94.053305

Elimination of Numerical Cherenkov Instability in flowing-plasma Particle-In-Cell simulations by using Galilean coordinates

Authors: Remi Lehe, Manuel Kirchen, Brendan B. Godfrey, Andreas R. Maier, Jean-Luc Vay

Abstract: Particle-In-Cell (PIC) simulations of relativistic flowing plasmas are of key interest to several fields of physics (including e.g. laser-wakefield acceleration, when viewed in a Lorentz-boosted frame), but remain sometimes infeasible due to the well-known numerical Cherenkov instability (NCI). In this article, we show that, for a plasma drifting at a uniform relativistic velocity, the NCI can be eliminated by simply integrating the PIC equations in Galilean coordinates that follow the plasma (also sometimes known as comoving coordinates) within a spectral analytical framework. The elimination of the NCI is verified empirically and confirmed by a theoretical analysis of the instability. Moreover, it is shown that this method is applicable both to Cartesian geometry and to cylindrical geometry with azimuthal Fourier decomposition.

Submitted 31 July, 2016; originally announced August 2016.

Comments: 18 pages, 6 figures

Journal ref: Phys. Rev. E 94, 053305 (2016)

arXiv:1608.00215 [pdf, other]

doi 10.1063/1.4964770

Stable discrete representation of relativistically drifting plasmas

Authors: Manuel Kirchen, Remi Lehe, Brendan B. Godfrey, Irene Dornmair, Soeren Jalas, Kevin Peters, Jean-Luc Vay, Andreas R. Maier

Abstract: Representing the electrodynamics of relativistically drifting particle ensembles in discrete, co-propagating Galilean coordinates enables the derivation of a Particle-in-Cell algorithm that is intrinsically free of the Numerical Cherenkov Instability, for plasmas flowing at a uniform velocity. Application of the method is shown by modeling plasma accelerators in a Lorentz-transformed optimal frame of reference.

Submitted 31 July, 2016; originally announced August 2016.

arXiv:1602.01321 [pdf, ps, other]

A continuum among logarithmic, linear, and exponential functions, and its potential to improve generalization in neural networks

Authors: Luke B. Godfrey, Michael S. Gashler

Abstract: We present the soft exponential activation function for artificial neural networks that continuously interpolates between logarithmic, linear, and exponential functions. This activation function is simple, differentiable, and parameterized so that it can be trained as the rest of the network is trained. We hypothesize that soft exponential has the potential to improve neural network learning, as it can exactly calculate many natural operations that typical neural networks can only approximate, including addition, multiplication, inner product, distance, polynomials, and sinusoids.

Submitted 3 February, 2016; originally announced February 2016.

Comments: 6 pages, 8 figures, conference, In Proceedings of Knowledge Discovery and Information Retrieval (KDIR) 2015, Lisbon, Portugal, December 2015

arXiv:1507.04790 [pdf, other]

doi 10.1016/j.cpc.2016.02.007

A spectral, quasi-cylindrical and dispersion-free Particle-In-Cell algorithm

Authors: Remi Lehe, Manuel Kirchen, Igor A. Andriyash, Brendan B. Godfrey, Jean-Luc Vay

Abstract: We propose a spectral Particle-In-Cell (PIC) algorithm that is based on the combination of a Hankel transform and a Fourier transform. For physical problems that have close-to-cylindrical symmetry, this algorithm can be much faster than full 3D PIC algorithms. In addition, unlike standard finite-difference PIC codes, the proposed algorithm is free of numerical dispersion. This algorithm is benchmarked in several situations that are of interest for laser-plasma interactions. These benchmarks show that it avoids a number of numerical artifacts that would otherwise affect the physics in a standard PIC algorithm, including the zero-order numerical Cherenkov effect.

Submitted 2 March, 2016; v1 submitted 16 July, 2015; originally announced July 2015.

Comments: 23 pages, 8 figures

arXiv:1505.03449 [pdf, other]

Towards a Speed of Light Internet

Authors: Ankit Singla, Balakrishnan Chandrasekaran, P. Brighten Godfrey, Bruce Maggs

Abstract: In principle, a network can transfer data at nearly the speed of light. Today's Internet, however, is much slower: our measurements show that latencies are typically more than one, and often more than two orders of magnitude larger than the lower bound implied by the speed of light. Closing this gap would not only add value to today's Internet applications, but might also open the door to exciting new applications. Thus, we propose a grand challenge for the networking research community: building a speed-of-light Internet. Towards addressing this goal, we begin by investigating the causes of latency inflation in the Internet across the network stack. Our analysis reveals that while protocol overheads, which have dominated the community's attention, are indeed important, infrastructural inefficiencies are a significant and under-explored problem. Thus, we propose a radical, yet surprisingly low-cost approach to mitigating latency inflation at the lowest layers and building a nearly speed-of-light Internet infrastructure.

Submitted 13 May, 2015; originally announced May 2015.

arXiv:1502.01387 [pdf, other]

doi 10.1016/j.cpc.2015.06.008

Improved Numerical Cherenkov Instability Suppression in the Generalized PSTD PIC Algorithm

Authors: Brendan B. Godfrey, Jean-Luc Vay

Abstract: The family of generalized Pseudo-Spectral Time Domain (including the Pseudo-Spectral Analytical Time Domain) Particle-in-Cell algorithms offers substantial versatility for simulating particle beams and plasmas, and well-written codes using these algorithms run reasonably fast. When simulating relativistic beams and streaming plasmas in multiple dimensions, they are, however, subject to the numerical Cherenkov instability. Previous studies have shown that instability growth rates can be reduced substantially by modifying slightly the transverse fields as seen by the streaming particles. Here, we offer an approach which completely eliminates the fundamental mode of the numerical Cherenkov instability while minimizing the transverse field corrections. The procedure, numerically computed residual growth rates (from weaker, higher order instability aliases), and comparisons with WARP simulations are presented. In some instances, there are no numerical instabilities whatsoever, at least in the linear regime.

Submitted 4 February, 2015; originally announced February 2015.

Comments: 9 pages, 7 figures

arXiv:1409.7092 [pdf, other]

PCC: Re-architecting Congestion Control for Consistent High Performance

Authors: Mo Dong, Qingxi Li, Doron Zarchy, Brighten Godfrey, Michael Schapira

Abstract: TCP and its variants have suffered from surprisingly poor performance for decades. We argue the TCP family has little hope to achieve consistent high performance due to a fundamental architectural deficiency: hardwiring packet-level events to control responses without understanding the real performance result of its actions. We propose Performance-oriented Congestion Control (PCC), a new congestion control architecture in which each sender continuously observes the connection between its actions and empirically experienced performance, enabling it to consistently adopt actions that result in high performance. We prove that PCC converges to a stable and fair equilibrium. Across many real-world and challenging environments, PCC shows consistent and often 10x performance improvement, with better fairness and stability than TCP. PCC requires no router hardware support or new packet format.

Submitted 10 October, 2014; v1 submitted 24 September, 2014; originally announced September 2014.

arXiv:1408.1146 [pdf]

Review and Recent Advances in PIC Modeling of Relativistic Beams and Plasmas

Authors: Brendan B. Godfrey

Abstract: Particle-in-Cell (PIC) simulation codes have wide applicability to first-principles modeling of multidimensional nonlinear plasma phenomena, including wake-field accelerators. This review addresses both finite difference and pseudo-spectral PIC algorithms, including numerical instability suppression and generalizations of the spectral field solver.

Submitted 5 August, 2014; originally announced August 2014.

Comments: Summary of plenary presentation at 2014 Advanced Accelerator Concepts Workshop; 9 pages, 6 figures

arXiv:1402.2531 [pdf, other]

Measuring and Understanding Throughput of Network Topologies

Authors: Sangeetha Abdu Jyothi, Ankit Singla, P. Brighten Godfrey, Alexandra Kolla

Abstract: High throughput is of particular interest in data center and HPC networks. Although myriad network topologies have been proposed, a broad head-to-head comparison across topologies and across traffic patterns is absent, and the right way to compare worst-case throughput performance is a subtle problem. In this paper, we develop a framework to benchmark the throughput of network topologies, using a two-pronged approach. First, we study performance on a variety of synthetic and experimentally-measured traffic matrices (TMs). Second, we show how to measure worst-case throughput by generating a near-worst-case TM for any given topology. We apply the framework to study the performance of these TMs in a wide range of network topologies, revealing insights into the performance of topologies with scaling, robustness of performance across TMs, and the effect of scattered workload placement. Our evaluation code is freely available.

Submitted 14 November, 2016; v1 submitted 11 February, 2014; originally announced February 2014.

arXiv:1401.0838 [pdf, other]

doi 10.1016/j.jcp.2014.02.022

Suppressing the Numerical Cherenkov Instability in FDTD PIC Codes

Authors: Brendan B. Godfrey, Jean-Luc Vay

Abstract: A procedure for largely suppressing the numerical Cherenkov instability in finite difference time-domain (FDTD) particle-in-cell (PIC) simulations of cold, relativistic beams is derived, and residual growth rates computed and compared with WARP code simulation results. Sample laser-plasma acceleration simulation output is provided to further validate the new procedure.

Submitted 4 January, 2014; originally announced January 2014.

Comments: 13 pages, 5 figures

Journal ref: Journal of Computational Physics 261, 1-6 (2014)

arXiv:1309.7066 [pdf, other]

High Throughput Data Center Topology Design

Authors: Ankit Singla, P. Brighten Godfrey, Alexandra Kolla

Abstract: With high throughput networks acquiring a crucial role in supporting data-intensive applications, a variety of data center network topologies have been proposed to achieve high capacity at low cost. While this literature explores a large number of design points, even in the limited case of a network of identical switches, no proposal has been able to claim any notion of optimality. The case of heterogeneous networks, incorporating multiple line-speeds and port-counts as data centers grow over time, introduces even greater complexity. In this paper, we present the first non-trivial upper-bound on network throughput under uniform traffic patterns for any topology with identical switches. We then show that random graphs achieve throughput surprisingly close to this bound, within a few percent at the scale of a few thousand servers. Apart from demonstrating that homogeneous topology design may be reaching its limits, this result also motivates our use of random graphs as building blocks to explore the design of heterogeneous networks. Given a heterogeneous pool of network switches, through experiments and analysis, we explore how the distribution of servers across switches and the interconnection of switches affect network throughput. We apply these insights to a real-world heterogeneous data center topology, VL2, demonstrating as much as 43% higher throughput with the same equipment.

Submitted 12 February, 2014; v1 submitted 26 September, 2013; originally announced September 2013.

Comments: 15 pages

arXiv:1309.0874 [pdf]

Shortest Paths in Microseconds

Authors: Rachit Agarwal, Matthew Caesar, P. Brighten Godfrey, Ben Y. Zhao

Abstract: Computing shortest paths is a fundamental primitive for several social network applications including socially-sensitive ranking, location-aware search, social auctions and social network privacy. Since these applications compute paths in response to a user query, the goal is to minimize latency while maintaining feasible memory requirements. We present ASAP, a system that achieves this goal by exploiting the structure of social networks. ASAP preprocesses a given network to compute and store a partial shortest path tree (PSPT) for each node. The PSPTs have the property that for any two nodes, each edge along the shortest path is with high probability contained in the PSPT of at least one of the nodes. We show that the structure of social networks enables the PSPT of each node to be an extremely small fraction of the entire network; hence, PSPTs can be stored efficiently and each shortest path can be computed extremely quickly. For a real network with 5 million nodes and 69 million edges, ASAP computes a shortest path for most node pairs in less than 49 microseconds per pair. ASAP, unlike any previous technique, also computes hundreds of paths (along with corresponding distances) between any node pair in less than 100 microseconds. Finally, ASAP admits efficient implementation on distributed programming frameworks like MapReduce.

Submitted 3 September, 2013; originally announced September 2013.

Comments: Extended version of WOSN'12 paper: new techniques (reduced memory, faster computations), distributed (MapReduce) algorithm, multiple paths between a source-destination pair

arXiv:1309.0116 [pdf, other]

doi 10.1109/TPS.2014.2310654

Numerical Stability Improvements for the Pseudo-Spectral EM PIC Algorithm

Authors: Brendan B. Godfrey, Jean-Luc Vay, Irving Haber

Abstract: The pseudo-spectral analytical time-domain (PSATD) particle-in-cell (PIC) algorithm solves the vacuum Maxwell's equations exactly, has no Courant time-step limit (as conventionally defined), and offers substantial flexibility in plasma and particle beam simulations. It is, however, not free of the usual numerical instabilities, including the numerical Cherenkov instability, when applied to relativistic beam simulations. This paper presents several approaches that, when combined with digital filtering, almost completely eliminate the numerical Cherenkov instability. The paper also investigates the numerical stability of the PSATD algorithm at low beam energies.

Submitted 31 August, 2013; originally announced September 2013.

Comments: Based on presentation at the 2013 IEEE Pulsed Power and Plasma Science Conference; 8 pages, 8 figures

Journal ref: IEEE Transactions on Plasma Science 42, 1339-1344 (2014)

arXiv:1306.3707 [pdf, other]

Low latency via redundancy

Authors: Ashish Vulimiri, P. Brighten Godfrey, Radhika Mittal, Justine Sherry, Sylvia Ratnasamy, Scott Shenker

Abstract: Low latency is critical for interactive networked applications. But while we know how to scale systems to increase capacity, reducing latency --- especially the tail of the latency distribution --- can be much more difficult. In this paper, we argue that the use of redundancy is an effective way to convert extra capacity into reduced latency. By initiating redundant operations across diverse resources and using the first result which completes, redundancy improves a system's latency even under exceptional conditions. We study the tradeoff with added system utilization, characterizing the situations in which replicating all tasks reduces mean latency. We then demonstrate empirically that replicating all operations can result in significant mean and tail latency reduction in real-world systems including DNS queries, database servers, and packet forwarding within networks.

Submitted 16 June, 2013; originally announced June 2013.

arXiv:1306.3534 [pdf, other]

A cost-benefit analysis of low latency via added utilization

Authors: Ashish Vulimiri, P. Brighten Godfrey, Sri Varsha Gorge, Zitian Liu, Scott Shenker

Abstract: Several recently proposed techniques achieve latency reduction by trading it off for some amount of additional bandwidth usage. But how would one quantify whether the tradeoff is actually beneficial in a given system? We develop an economic cost vs. benefit analysis for answering this question. We use the analysis to derive a benchmark for wide-area client-server applications, and demonstrate how it can be applied to reason about a particular latency saving technique --- redundant DNS requests.

Submitted 4 December, 2014; v1 submitted 14 June, 2013; originally announced June 2013.

arXiv:1305.7375 [pdf, other]

doi 10.1016/j.jcp.2013.10.053

Numerical stability analysis of the Pseudo-Spectral Analytical Time-Domain PIC algorithm

Authors: Brendan B. Godfrey, Jean-Luc Vay, Irving Haber

Abstract: The pseudo-spectral analytical time-domain (PSATD) particle-in-cell (PIC) algorithm solves the vacuum Maxwell's equations exactly, has no Courant time-step limit (as conventionally defined), and offers substantial flexibility in plasma and particle beam simulations. It is, however, not free of the usual numerical instabilities, including the numerical Cherenkov instability, when applied to relativistic beam simulations. This paper derives and solves the numerical dispersion relation for the PSATD algorithm and compares the results with corresponding behavior of the more conventional pseudo-spectral time-domain (PSTD) and finite difference time-domain (FDTD) algorithms. In general, PSATD offers superior stability properties over a reasonable range of time steps. More importantly, one version of the PSATD algorithm, when combined with digital filtering, is almost completely free of the numerical Cherenkov instability for time steps (scaled to the speed of light) comparable to or smaller than the axial cell size.

Submitted 5 June, 2013; v1 submitted 31 May, 2013; originally announced May 2013.

Comments: 38 pages, 16 figures, 2 tables; Fig 15 revised, reference added, link to supplementary material corrected, option (c) discussion expanded, use of smaller gamma in LPA time-step sweeps emphasized, minor typos corrected

arXiv:1302.6156 [pdf, other]

Scalable Routing on Flat Names

Authors: Ankit Singla, P. Brighten Godfrey, Kevin Fall, Gianluca Iannaccone, Sylvia Ratnasamy

Abstract: We introduce a protocol which routes on flat, location-independent identifiers with guaranteed scalability and low stretch. Our design builds on theoretical advances in the area of compact routing, and is the first to realize these guarantees in a dynamic distributed setting.

Submitted 25 February, 2013; originally announced February 2013.

Comments: 13 pages

Journal ref: Extends our ACM CoNEXT 2010 paper with proofs for the theoretical results

arXiv:1211.0232 [pdf, other]

doi 10.1016/j.jcp.2013.04.006

Numerical stability of relativistic beam multidimensional PIC simulations employing the Esirkepov algorithm

Authors: Brendan B. Godfrey, Jean-Luc Vay

Abstract: Rapidly growing numerical instabilities routinely occur in multidimensional particle-in-cell computer simulations of plasma-based particle accelerators, astrophysical phenomena, and relativistic charged particle beams. Reducing instability growth to acceptable levels has necessitated higher resolution grids, high-order field solvers, current filtering, etc. except for certain ratios of the time step to the axial cell size, for which numerical growth rates and saturation levels are reduced substantially. This paper derives and solves the cold beam dispersion relation for numerical instabilities in multidimensional, relativistic, electromagnetic particle-in-cell programs employing either the standard or the Cole-Karkkainnen finite difference field solver on a staggered mesh and the common Esirkepov current-gathering algorithm. Good overall agreement is achieved with previously reported results of the WARP code. In particular, the existence of select time steps for which instabilities are minimized is explained. Additionally, an alternative field interpolation algorithm is proposed for which instabilities are almost completely eliminated for a particular time step in ultra-relativistic simulations.

Submitted 1 November, 2012; originally announced November 2012.

Journal ref: Journal of Computational Physics 248, 33-46 (2013)

arXiv:1207.3732 [pdf, other]

doi 10.1145/2332432.2332478

On the Resilience of Routing Tables

Authors: Joan Feigenbaum, Brighten Godfrey, Aurojit Panda, Michael Schapira, Scott Shenker, Ankit Singla

Abstract: Many modern network designs incorporate "failover" paths into routers' forwarding tables. We initiate the theoretical study of the conditions under which such resilient routing tables can guarantee delivery of packets.

Submitted 3 August, 2012; v1 submitted 16 July, 2012; originally announced July 2012.

Comments: Brief announcement, PODC 2012

arXiv:1206.2057 [pdf, other]

Finishing Flows Quickly with Preemptive Scheduling

Authors: Chi-Yao Hong, Matthew Caesar, P. Brighten Godfrey

Abstract: Today's data centers face extreme challenges in providing low latency. However, fair sharing, a principle commonly adopted in current congestion control protocols, is far from optimal for satisfying latency requirements. We propose Preemptive Distributed Quick (PDQ) flow scheduling, a protocol designed to complete flows quickly and meet flow deadlines. PDQ enables flow preemption to approximate a range of scheduling disciplines. For example, PDQ can emulate a shortest job first algorithm to give priority to the short flows by pausing the contending flows. PDQ borrows ideas from centralized scheduling disciplines and implements them in a fully distributed manner, making it scalable to today's data centers. Further, we develop a multipath version of PDQ to exploit path diversity. Through extensive packet-level and flow-level simulation, we demonstrate that PDQ significantly outperforms TCP, RCP and D3 in data center environments. We further show that PDQ is stable, resilient to packet loss, and preserves nearly all its performance gains even given inaccurate flow information.

Submitted 12 June, 2012; v1 submitted 10 June, 2012; originally announced June 2012.

Comments: The conference version was published in SIGCOMM 2012

arXiv:1206.1134 [pdf]

Shortest Paths in Less Than a Millisecond

Authors: Rachit Agarwal, Matthew Caesar, P. Brighten Godfrey, Ben Y. Zhao

Abstract: We consider the problem of answering point-to-point shortest path queries on massive social networks. The goal is to answer queries within tens of milliseconds while minimizing the memory requirements. We present a technique that achieves this goal for an extremely large fraction of path queries by exploiting the structure of the social networks. Using evaluations on real-world datasets, we argue that our technique offers a unique trade-off between latency, memory and accuracy. For instance, for the LiveJournal social network (roughly 5 million nodes and 69 million edges), our technique can answer 99.9% of the queries in less than a millisecond. In comparison to storing all pair shortest paths, our technique requires at least 550x less memory; the average query time is roughly 365 microseconds --- 430x faster than the state-of-the-art shortest path algorithm. Furthermore, the relative performance of our technique improves with the size (and density) of the network. For the Orkut social network (3 million nodes and 220 million edges), for instance, our technique is roughly 2588x faster than the state-of-the-art algorithm for computing shortest paths.

Submitted 6 June, 2012; originally announced June 2012.

Comments: 6 pages; to appear in SIGCOMM WOSN 2012

arXiv:1201.2703 [pdf]

Faster Approximate Distance Queries and Compact Routing in Sparse Graphs

Authors: Rachit Agarwal, P. Brighten Godfrey, Sariel Har-Peled

Abstract: A distance oracle is a compact representation of the shortest distance matrix of a graph. It can be queried to approximate shortest paths between any pair of vertices. Any distance oracle that returns paths of worst-case stretch $(2k-1)$ must require space $\Omega(n^{1+1/k})$ for graphs of $n$ nodes. The hard cases that enforce this lower bound are, however, rather dense graphs with average degree $\Omega(n^{1/k})$. We present distance oracles that, for sparse graphs, substantially break the lower bound barrier at the expense of higher query time. For any $1 \leq \alpha \leq n$, our distance oracles can return stretch 2 paths using $O(m + n^2/\alpha)$ space and stretch 3 paths using $O(m + n^2/\alpha^2)$ space, at the expense of $O(\alpha m/n)$ query time. By setting appropriate values of $\alpha$, we get the first distance oracles that have size linear in the size of the graph, and return constant stretch paths in non-trivial query time. The query time can be further reduced to $O(\alpha)$, by using an additional $O(m\alpha)$ space for all our distance oracles, or at the cost of a small constant additive stretch. We use our stretch 2 distance oracle to present the first compact routing scheme with worst-case stretch 2. Any compact routing scheme with stretch less than 2 must require linear memory at some nodes even for sparse graphs; our scheme, hence, achieves the optimal stretch with non-trivial memory requirements. Moreover, supported by large-scale simulations on graphs including the AS-level Internet graph, we argue that our stretch-2 scheme would be simple and efficient to implement as a distributed compact routing protocol.

Submitted 12 January, 2012; originally announced January 2012.

Comments: 20 pages, an earlier version appeared in INFOCOM 2011, this version presents data structures with improved space/query-time trade-off

arXiv:1201.1661 [pdf, ps, other]

Slick Packets

Authors: Giang T. K. Nguyen, Rachit Agarwal, Junda Liu, Matthew Caesar, P. Brighten Godfrey, Scott Shenker

Abstract: Source-controlled routing has been proposed as a way to improve flexibility of future network architectures, as well as simplifying the data plane. However, if a packet specifies its path, this precludes fast local re-routing within the network. We propose SlickPackets, a novel solution that allows packets to slip around failures by specifying alternate paths in their headers, in the form of compactly-encoded directed acyclic graphs. We show that this can be accomplished with reasonably small packet headers for real network topologies, and results in responsiveness to failures that is competitive with past approaches that require much more state within the network. Our approach thus enables fast failure response while preserving the benefits of source-controlled routing.

Submitted 8 January, 2012; originally announced January 2012.

Comments: This is the full version of a paper with the same title that appeared in ACM SIGMETRICS 2011, with the inclusion of the appendix. 16 pages

ACM Class: C.2.1; C.2.2; C.2.6

arXiv:1110.1687 [pdf, other]

Jellyfish: Networking Data Centers Randomly

Authors: Ankit Singla, Chi-Yao Hong, Lucian Popa, P. Brighten Godfrey

Abstract: Industry experience indicates that the ability to incrementally expand data centers is essential. However, existing high-bandwidth network designs have rigid structure that interferes with incremental expansion. We present Jellyfish, a high-capacity network interconnect, which, by adopting a random graph topology, yields itself naturally to incremental expansion. Somewhat surprisingly, Jellyfish is more cost-efficient than a fat-tree: A Jellyfish interconnect built using the same equipment as a fat-tree, supports as many as 25% more servers at full capacity at the scale of a few thousand nodes, and this advantage improves with scale. Jellyfish also allows great flexibility in building networks with different degrees of oversubscription. However, Jellyfish's unstructured design brings new challenges in routing, physical layout, and wiring. We describe and evaluate approaches that resolve these challenges effectively, indicating that Jellyfish could be deployed in today's data centers.

Submitted 20 April, 2012; v1 submitted 7 October, 2011; originally announced October 2011.

Comments: 14 pages, 12 figures

arXiv:1108.0192 [pdf, other]

BGP Stability is Precarious

Authors: P. Brighten Godfrey

Abstract: We note a fact which is simple, but may be useful for the networking research community: essentially any change to BGP's decision process can cause divergence --- or convergence when BGP would otherwise diverge.

Submitted 31 July, 2011; originally announced August 2011.

arXiv:0803.0632 [pdf, ps, other]

Network Coding for Distributed Storage Systems

Authors: Alexandros G. Dimakis, P. Brighten Godfrey, Yunnan Wu, Martin J. Wainwright, Kannan Ramchandran

Abstract: Distributed storage systems provide reliable access to data through redundancy spread over individually unreliable nodes. Application scenarios include data centers, peer-to-peer storage systems, and storage in wireless networks. Storing data using an erasure code, in fragments spread across nodes, requires less redundancy than simple replication for the same level of reliability. However, since fragments must be periodically replaced as nodes fail, a key question is how to generate encoded fragments in a distributed way while transferring as little data as possible across the network. For an erasure coded system, a common practice to repair from a node failure is for a new node to download subsets of data stored at a number of surviving nodes, reconstruct a lost coded block using the downloaded data, and store it at the new node. We show that this procedure is sub-optimal. We introduce the notion of regenerating codes, which allow a new node to download \emph{functions} of the stored data from the surviving nodes. We show that regenerating codes can significantly reduce the repair bandwidth. Further, we show that there is a fundamental tradeoff between storage and repair bandwidth which we theoretically characterize using flow arguments on an appropriately constructed graph. By invoking constructive results in network coding, we introduce regenerating codes that can achieve any point in this optimal tradeoff.

Submitted 5 March, 2008; originally announced March 2008.

Report number: EECS 14546

arXiv:cs/0702015 [pdf, ps, other]

Network Coding for Distributed Storage Systems

Authors: Alexandros G. Dimakis, P. Brighten Godfrey, Martin J. Wainwright, Kannan Ramchandran

Abstract: Peer-to-peer distributed storage systems provide reliable access to data through redundancy spread over nodes across the Internet. A key goal is to minimize the amount of bandwidth used to maintain that redundancy. Storing a file using an erasure code, in fragments spread across nodes, promises to require less redundancy and hence less maintenance bandwidth than simple replication to provide the same level of reliability. However, since fragments must be periodically replaced as nodes fail, a key question is how to generate a new fragment in a distributed way while transferring as little data as possible across the network. In this paper, we introduce a general technique to analyze storage architectures that combine any form of coding and replication, as well as presenting two new schemes for maintaining redundancy using erasure codes. First, we show how to optimally generate MDS fragments directly from existing fragments in the system. Second, we introduce a new scheme called Regenerating Codes which use slightly larger fragments than MDS but have lower overall bandwidth use. We also show through simulation that in realistic environments, Regenerating Codes can reduce maintenance bandwidth use by 25 percent or more compared with the best previous design--a hybrid of replication and erasure codes--while simplifying system architecture.

Submitted 2 February, 2007; originally announced February 2007.

Comments: To appear in INFOCOM 2007

Showing 1–47 of 47 results for author: Godfrey, B