-
Scaling Data Plane Verification with Intent-based Slicing
Authors:
Kuan-Yen Chou,
Santhosh Prabhu,
Giri Subramanian,
Wenxuan Zhou,
Aanand Nayyar,
Brighten Godfrey,
Matthew Caesar
Abstract:
Data plane verification has grown into a powerful tool to ensure network correctness. However, existing monolithic data plane models have high memory requirements with large networks, and the existing method of scaling out is too limited in expressiveness to capture practical network features. In this paper, we describe Scylla, a general data plane verifier that provides fine-grained scale-out without the need for a monolithic network model. Scylla creates models for what we call intent-based slices, each of which is constructed at a fine (rule-level) granularity with just enough to verify a given set of intents. The sliced models are retained in memory across a cluster and are incrementally updated in a distributed compute cluster in response to network updates. Our experiments show that Scylla makes the scaling problem more granular -- tied to the size of the intent-based slices rather than that of the overall network. This enables Scylla to verify large, complex networks in minimum units of work that are significantly smaller (in both memory and time) than past techniques, enabling fast scale-out verification with minimal resource requirement.
Submitted 31 May, 2024;
originally announced May 2024.
-
New Limit on Dark Photon Kinetic Mixing in the 0.2-1.2 $\boldsymbol{μ}$eV Mass Range From the Dark E-Field Radio Experiment
Authors:
Joseph Levine,
Benjamin Godfrey,
J. Anthony Tyson,
S. Mani Tripathi,
Daniel Polin,
Amin Aminaei,
Brian H. Kolner,
Paul Stucky
Abstract:
We report new limits on the kinetic mixing strength of the dark photon spanning the mass range 0.21 -- 1.24 $μ$eV corresponding to a frequency span of 50 -- 300 MHz. The Dark E-Field Radio experiment is a wide-band search for dark photon dark matter. In this paper we detail changes in calibration and upgrades since our proof-of-concept pilot run. Our detector employs a wide bandwidth E-field antenna moved to multiple positions in a shielded room, a low noise amplifier, wideband ADC, followed by a $2^{24}$-point FFT. An optimal filter searches for signals with Q $\approx10^6$. In nine days of integration, this system is capable of detecting dark photon signals corresponding to $ε$ several orders of magnitude lower than previous limits. We find a 95% exclusion limit on $ε$ over this mass range between $6\times 10^{-15}$ and $6\times 10^{-13}$, tracking the complex resonant mode structure in the shielded room.
Submitted 30 May, 2024;
originally announced May 2024.
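Note on the entry above: the stated correspondence between the 0.21 -- 1.24 $μ$eV mass range and the 50 -- 300 MHz frequency span follows from $f = mc^2/h$. The sketch below is only an illustrative check, not code from the paper; the function name `mass_uev_to_frequency_mhz` is hypothetical, and the Planck constant value is the standard SI-derived one rather than anything quoted in the abstract.

```python
# Planck constant in eV*s, derived from the exact SI values of h and e.
# This constant is an assumption of the sketch; the abstract does not quote it.
PLANCK_EV_S = 4.135667696e-15


def mass_uev_to_frequency_mhz(mass_uev: float) -> float:
    """Convert a dark photon rest-mass energy in micro-eV to the signal frequency in MHz."""
    energy_ev = mass_uev * 1e-6          # micro-eV -> eV
    frequency_hz = energy_ev / PLANCK_EV_S  # f = E / h
    return frequency_hz / 1e6            # Hz -> MHz


if __name__ == "__main__":
    for mass in (0.21, 1.24):
        print(f"{mass} ueV -> {mass_uev_to_frequency_mhz(mass):.1f} MHz")
```

Under these assumptions the two endpoints land close to 50 MHz and 300 MHz, consistent with the span quoted in the abstract.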
-
On a Foundation Model for Operating Systems
Authors:
Divyanshu Saxena,
Nihal Sharma,
Donghyun Kim,
Rohit Dwivedula,
Jiayi Chen,
Chenxi Yang,
Sriram Ravula,
Zichao Hu,
Aditya Akella,
Sebastian Angel,
Joydeep Biswas,
Swarat Chaudhuri,
Isil Dillig,
Alex Dimakis,
P. Brighten Godfrey,
Daehyeok Kim,
Chris Rossbach,
Gang Wang
Abstract:
This paper lays down the research agenda for a domain-specific foundation model for operating systems (OSes). Our case for a foundation model revolves around the observations that several OS components such as CPU, memory, and network subsystems are interrelated and that OS traces offer the ideal dataset for a foundation model to grasp the intricacies of diverse OS components and their behavior in varying environments and workloads. We discuss a wide range of possibilities that then arise, from employing foundation models as policy agents to utilizing them as generators and predictors to assist traditional OS control algorithms. Our hope is that this paper spurs further research into OS foundation models and creating the next generation of operating systems for the evolving computing landscape.
Submitted 12 December, 2023;
originally announced December 2023.
-
Kivi: Verification for Cluster Management
Authors:
Bingzhe Liu,
Gangmuk Lim,
Ryan Beckett,
P. Brighten Godfrey
Abstract:
Modern cloud infrastructure is powered by cluster management systems such as Kubernetes and Docker Swarm. While these systems seek to minimize users' operational burden, the complex, dynamic, and non-deterministic nature of these systems makes them hard to reason about, potentially leading to failures ranging from performance degradation to outages. We present Kivi, the first system for verifying controllers and their configurations in cluster management systems. Kivi focuses on the popular system Kubernetes, and models its controllers and events into processes whereby their interleavings are exhaustively checked via model checking. Central to handling autoscaling and large-scale deployments is our design that seeks to find violations in a smaller and reduced topology. We also develop several model optimizations in Kivi to scale to large clusters. We show that Kivi is effective and accurate in finding issues in realistic and complex scenarios and showcase two new issues in Kubernetes controller source code.
Submitted 5 November, 2023;
originally announced November 2023.
-
Flock: Accurate network fault localization at scale
Authors:
Vipul Harsh,
Tong Meng,
Kapil Agrawal,
P. Brighten Godfrey
Abstract:
Inferring the root cause of failures among thousands of components in a data center network is challenging, especially for "gray" failures that are not reported directly by switches. Faults can be localized through end-to-end measurements, but past localization schemes are either too slow for large-scale networks or sacrifice accuracy. We describe Flock, a network fault localization algorithm and system that achieves both high accuracy and speed at datacenter scale. Flock uses a probabilistic graphical model (PGM) to achieve high accuracy, coupled with new techniques to dramatically accelerate inference in discrete-valued Bayesian PGMs. Large-scale simulations and experiments in a hardware testbed show Flock speeds up inference by >10000x compared to past PGM methods, and improves accuracy over the best previous datacenter fault localization approaches, reducing inference error by 1.19-11x on the same input telemetry, and by 1.2-55x after incorporating passive telemetry. We also prove Flock's inference is optimal in restricted settings.
Submitted 5 May, 2023;
originally announced May 2023.
-
Towards Learning and Explaining Indirect Causal Effects in Neural Networks
Authors:
Abbavaram Gowtham Reddy,
Saketh Bachu,
Harsharaj Pathak,
Benin L Godfrey,
Vineeth N. Balasubramanian,
Varshaneya V,
Satya Narayanan Kar
Abstract:
Recently, there has been a growing interest in learning and explaining causal effects within Neural Network (NN) models. By virtue of NN architectures, previous approaches consider only direct and total causal effects assuming independence among input variables. We view an NN as a structural causal model (SCM) and extend our focus to include indirect causal effects by introducing feedforward connections among input neurons. We propose an ante-hoc method that captures and maintains direct, indirect, and total causal effects during NN model training. We also propose an algorithm for quantifying learned causal effects in an NN model and efficient approximation strategies for quantifying causal effects in high-dimensional data. Extensive experiments conducted on synthetic and real-world datasets demonstrate that the causal effects learned by our ante-hoc method better approximate the ground truth effects compared to existing methods.
Submitted 8 January, 2024; v1 submitted 24 March, 2023;
originally announced March 2023.
-
Studies in Pulse Shape Discrimination for an Optimized ASIC Design
Authors:
B. Boxer,
B. Godfrey,
C. Grace,
J. Johnson,
R. Khandwala,
M. Tripathi
Abstract:
The continued advancements of Silicon Photomultipliers (SiPMs) have made them viable photosensors for low recoil energy Pulse Shape Discrimination (PSD) between fast neutron and gamma interactions when coupled to an appropriate scintillator. At the same time, the large number of channels in a typical array calls for the development of low-cost and low-power electronics. A custom integrated circuit (ASIC) is an ideal solution for this purpose. To assess the requirements for such an ASIC, studies were performed using two scintillators, Stilbene and EJ-276, coupled to a 6 x 6 mm SiPM from Onsemi. We demonstrate that both scintillators are viable for performing PSD for interaction energies from 100 keV to several MeV while optimizing the integration periods used in the PSD metric. These measurements inform the design parameters of the ASIC under development.
Submitted 21 December, 2022; v1 submitted 28 September, 2022;
originally announced September 2022.
-
On-Device CPU Scheduling for Sense-React Systems
Authors:
Aditi Partap,
Samuel Grayson,
Muhammad Huzaifa,
Sarita Adve,
Brighten Godfrey,
Saurabh Gupta,
Kris Hauser,
Radhika Mittal
Abstract:
Sense-react systems (e.g. robotics and AR/VR) have to take highly responsive real-time actions, driven by complex decisions involving a pipeline of sensing, perception, planning, and reaction tasks. These tasks must be scheduled on resource-constrained devices such that the performance goals and the requirements of the application are met. This is a difficult scheduling problem that requires handling multiple scheduling dimensions, and variations in resource usage and availability. In practice, system designers manually tune parameters for their specific hardware and application, which results in poor generalization and increases the development burden. In this work, we highlight the emerging need for scheduling CPU resources at runtime in sense-react systems. We study three canonical applications (face tracking, robot navigation, and VR) to first understand the key scheduling requirements for such systems. Armed with this understanding, we develop a scheduling framework, Catan, that dynamically schedules compute resources across different components of an app so as to meet the specified application requirements. Through experiments with a prototype implemented on a widely-used robotics framework (ROS) and an open-source AR/VR platform, we show the impact of system scheduling on meeting the performance goals for the three applications, how Catan is able to achieve better application performance than hand-tuned configurations, and how it dynamically adapts to runtime variations.
Submitted 14 August, 2022; v1 submitted 27 July, 2022;
originally announced July 2022.
-
Search for Dark Photon Dark Matter: Dark E-Field Radio Pilot Experiment
Authors:
Benjamin Godfrey,
J. Anthony Tyson,
Seth Hillbrand,
Jon Balajthy,
Daniel Polin,
S. Mani Tripathi,
Shelby Klomp,
Joseph Levine,
Nate MacFadden,
Brian H. Kolner,
Molly R. Smith,
Paul Stucky,
Arran Phipps,
Peter Graham,
Kent Irwin
Abstract:
We are building an experiment to search for dark matter in the form of dark photons in the nano- to milli-eV mass range. This experiment is the electromagnetic dual of magnetic detector dark radio experiments. It is also a frequency-time dual experiment in two ways: We search for a high-Q signal in wide-band data rather than tuning a high-$Q$ resonator, and we measure electric rather than magnetic fields. In this paper we describe a pilot experiment using room temperature electronics which demonstrates feasibility and sets useful limits to the kinetic coupling $ε\sim 10^{-12}$ over 50--300 MHz. With a factor of 2000 increase in real-time spectral coverage, and lower system noise temperature, it will soon be possible to search a wide range of masses at 100 times this sensitivity. We describe the planned experiment in two phases: Phase-I will implement a wide band, 5-million channel, real-time FFT processor over the 30--300 MHz range with a back-end time-domain optimal filter to search for the predicted $Q\sim 10^6$ line using low-noise amplifiers. We have completed spot frequency calibrations using a biconical dipole antenna in a shielded room that extrapolate to a $5 σ$ limit of $ε\sim 10^{-13}$ for the coupling from the dark field, per month of integration. Phase-II will extend the search to 20 GHz using cryogenic preamplifiers and new antennas.
Submitted 17 November, 2021; v1 submitted 7 January, 2021;
originally announced January 2021.
-
Caramel: Accelerating Decentralized Distributed Deep Learning with Computation Scheduling
Authors:
Sayed Hadi Hashemi,
Sangeetha Abdu Jyothi,
Brighten Godfrey,
Roy Campbell
Abstract:
The method of choice for parameter aggregation in Deep Neural Network (DNN) training, a network-intensive task, is shifting from the Parameter Server model to decentralized aggregation schemes (AllReduce) inspired by theoretical guarantees of better performance. However, current implementations of AllReduce overlook the interdependence of communication and computation, resulting in significant performance degradation. In this paper, we develop Caramel, a system that accelerates decentralized distributed deep learning through model-aware computation scheduling and communication optimizations for AllReduce. Caramel achieves this goal through (a) computation DAG scheduling that expands the feasible window of transfer for each parameter (transfer boundaries), and (b) network optimizations for smoothening of the load including adaptive batching and pipelining of parameter transfers. Caramel maintains the correctness of the dataflow model, is hardware-independent, and does not require any user-level or framework-level changes. We implement Caramel over TensorFlow and show that the iteration time of DNN training can be improved by up to 3.62x in a cloud environment.
Submitted 29 April, 2020;
originally announced April 2020.
-
Plankton: Scalable network configuration verification through model checking
Authors:
Santhosh Prabhu,
Kuan-Yen Chou,
Ali Kheradmand,
P. Brighten Godfrey,
Matthew Caesar
Abstract:
Network configuration verification enables operators to ensure that the network will behave as intended, prior to deployment of their configurations. Although techniques ranging from graph algorithms to SMT solvers have been proposed, scalable configuration verification with sufficient protocol support continues to be a challenge. In this paper, we show that by combining equivalence partitioning with explicit-state model checking, network configuration verification can be scaled significantly better than the state of the art, while still supporting a rich set of protocol features. We propose Plankton, which uses symbolic partitioning to manage large header spaces and efficient model checking to exhaustively explore protocol behavior. Thanks to a highly effective suite of optimizations including state hashing, partial order reduction, and policy-based pruning, Plankton successfully verifies policies in industrial-scale networks quickly and compactly, at times reaching a 10000$\times$ speedup compared to the state of the art.
Submitted 5 November, 2019;
originally announced November 2019.
-
Forecasting U.S. Textile Comparative Advantage Using Autoregressive Integrated Moving Average Models and Time Series Outlier Analysis
Authors:
Zahra Saki,
Lori Rothenberg,
Marguerite Moor,
Ivan Kandilov,
A. Blanton Godfrey
Abstract:
To establish an updated understanding of the U.S. textile and apparel (TAP) industry's competitive position within the global textile environment, trade data from UN-COMTRADE (1996-2016) was used to calculate the Normalized Revealed Comparative Advantage (NRCA) index for 169 TAP categories at the four-digit Harmonized Schedule (HS) code level. Univariate time series using Autoregressive Integrated Moving Average (ARIMA) models forecast short-term future performance of Revealed categories with export advantage. Accompanying outlier analysis examined permanent level shifts that might convey important information about policy changes, influential drivers and random events.
Submitted 13 August, 2019;
originally announced August 2019.
-
Dissecting Latency in the Internet's Fiber Infrastructure
Authors:
Ilker Nadi Bozkurt,
Waqar Aqeel,
Debopam Bhattacherjee,
Balakrishnan Chandrasekaran,
Philip Brighten Godfrey,
Gregory Laughlin,
Bruce M. Maggs,
Ankit Singla
Abstract:
The recent publication of the 'InterTubes' map of long-haul fiber-optic cables in the contiguous United States invites an exciting question: how much faster would the Internet be if routes were chosen to minimize latency? Previous measurement campaigns suggest the following rule of thumb for estimating Internet latency: multiply line-of-sight distance by 2.1, then divide by the speed of light in fiber. But a simple computation of shortest-path lengths through the conduits in the InterTubes map suggests that the conversion factor for all pairs of the 120 largest population centers in the U.S. could be reduced from 2.1 to 1.3, in the median, even using less than half of the links. To determine whether an overlay network could be used to provide shortest paths, and how well it would perform, we used the diverse server deployment of a CDN to measure latency across individual conduits. We were surprised to find, however, that latencies are sometimes much higher than would be predicted by conduit length alone. To understand why, we report findings from our analysis of network latency data from the backbones of two Tier-1 ISPs, two scientific and research networks, and the recently built fiber backbone of a CDN.
Submitted 26 November, 2018;
originally announced November 2018.
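Note on the entry above: the rule of thumb in the abstract (multiply line-of-sight distance by a conversion factor, then divide by the speed of light in fiber) can be written out as a short calculation. This is a minimal sketch, not the authors' code. The function name `estimate_latency_ms` is hypothetical, and the fiber refractive index of 1.468 is an assumed typical value that the abstract does not state.

```python
# Speed of light in vacuum in m/s (exact by SI definition).
C_VACUUM_M_PER_S = 299_792_458.0

# Assumed refractive index of optical fiber; NOT given in the abstract.
FIBER_REFRACTIVE_INDEX = 1.468


def estimate_latency_ms(line_of_sight_km: float,
                        conversion_factor: float = 2.1,
                        refractive_index: float = FIBER_REFRACTIVE_INDEX) -> float:
    """Estimate propagation latency in milliseconds using the rule of thumb.

    latency = conversion_factor * line_of_sight_distance / speed_of_light_in_fiber
    """
    speed_in_fiber = C_VACUUM_M_PER_S / refractive_index   # m/s
    path_length_m = conversion_factor * line_of_sight_km * 1000.0
    return path_length_m / speed_in_fiber * 1000.0         # s -> ms


if __name__ == "__main__":
    distance_km = 1000.0  # illustrative distance, not taken from the paper
    current = estimate_latency_ms(distance_km, conversion_factor=2.1)
    improved = estimate_latency_ms(distance_km, conversion_factor=1.3)
    print(f"factor 2.1: {current:.2f} ms, factor 1.3: {improved:.2f} ms")
```

Because latency scales linearly with the conversion factor, moving from 2.1 to 1.3 would cut the estimate by the same ratio regardless of the assumed refractive index.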
-
Expander Datacenters: From Theory to Practice
Authors:
Vipul Harsh,
Sangeetha Abdu Jyothi,
Inderdeep Singh,
P. Brighten Godfrey
Abstract:
Recent work has shown that expander-based data center topologies are robust and can yield superior performance over Clos topologies. However, to achieve these benefits, previous proposals use routing and transport schemes that impede quick industry adoption. In this paper, we examine if expanders can be effective for the technology and environments practical in today's data centers, including the use of traditional protocols, at both small and large scale while complying with common practices such as over-subscription. We study bandwidth, latency and burst tolerance of topologies, highlighting pitfalls of previous topology comparisons. We consider several other metrics of interest: packet loss during failures, queue occupancy and topology degradation. Our experiments show that expanders can realize 3x more throughput than an equivalent fat tree, and 1.5x more throughput than an equivalent leaf-spine topology, for a wide range of scenarios, with only traditional protocols. We observe that expanders achieve lower flow completion times, are more resilient to bursty load conditions like incast and outcast and degrade more gracefully with increasing load. Our results are based on extensive simulations and experiments on a hardware testbed with realistic topologies and real traffic patterns.
Submitted 31 October, 2018;
originally announced November 2018.
-
Leveraging Product as an Activation Function in Deep Networks
Authors:
Luke B. Godfrey,
Michael S. Gashler
Abstract:
Product unit neural networks (PUNNs) are powerful representational models with a strong theoretical basis, but have proven to be difficult to train with gradient-based optimizers. We present windowed product unit neural networks (WPUNNs), a simple method of leveraging product as a nonlinearity in a neural network. Windowing the product tames the complex gradient surface and enables WPUNNs to learn effectively, solving the problems faced by PUNNs. WPUNNs use product layers between traditional sum layers, capturing the representational power of product units and using the product itself as a nonlinearity. We find that this method works as well as traditional nonlinearities like ReLU on the MNIST dataset. We demonstrate that WPUNNs can also generalize gated units in recurrent neural networks, yielding results comparable to LSTM networks.
Submitted 19 October, 2018;
originally announced October 2018.
-
Internet Congestion Control via Deep Reinforcement Learning
Authors:
Nathan Jay,
Noga H. Rotman,
P. Brighten Godfrey,
Michael Schapira,
Aviv Tamar
Abstract:
We present and investigate a novel and timely application domain for deep reinforcement learning (RL): Internet congestion control. Congestion control is the core networking task of modulating traffic sources' data-transmission rates to efficiently utilize network capacity, and is the subject of extensive attention in light of the advent of Internet services such as live video, virtual reality, Internet-of-Things, and more. We show that casting congestion control as RL enables training deep network policies that capture intricate patterns in data traffic and network conditions, and leverage this to outperform the state-of-the-art. We also highlight significant challenges facing real-world adoption of RL-based congestion control, including fairness, safety, and generalization, which are not trivial to address within conventional RL formalism. To facilitate further research and reproducibility of our results, we present a test suite for RL-guided congestion control based on the OpenAI Gym interface.
Submitted 21 May, 2019; v1 submitted 7 October, 2018;
originally announced October 2018.
-
cISP: A Speed-of-Light Internet Service Provider
Authors:
Debopam Bhattacherjee,
Sangeetha Abdu Jyothi,
Ilker Nadi Bozkurt,
Muhammad Tirmazi,
Waqar Aqeel,
Anthony Aguirre,
Balakrishnan Chandrasekaran,
P. Brighten Godfrey,
Gregory P. Laughlin,
Bruce M. Maggs,
Ankit Singla
Abstract:
Low latency is a requirement for a variety of interactive network applications. The Internet, however, is not optimized for latency. We thus explore the design of cost-effective wide-area networks that move data over paths very close to great-circle paths, at speeds very close to the speed of light in vacuum. Our cISP design augments the Internet's fiber with free-space wireless connectivity. cISP addresses the fundamental challenge of simultaneously providing low latency and scalable bandwidth, while accounting for numerous practical factors ranging from transmission tower availability to packet queuing. We show that instantiations of cISP across the contiguous United States and Europe would achieve mean latencies within 5% of that achievable using great-circle paths at the speed of light, over medium and long distances. Further, we estimate that the economic value from such networks would substantially exceed their expense.
Submitted 10 October, 2018; v1 submitted 28 September, 2018;
originally announced September 2018.
-
A parameterized activation function for learning fuzzy logic operations in deep neural networks
Authors:
Luke B. Godfrey,
Michael S. Gashler
Abstract:
We present a deep learning architecture for learning fuzzy logic expressions. Our model uses an innovative, parameterized, differentiable activation function that can learn a number of logical operations by gradient descent. This activation function allows a neural network to determine the relationships between its input variables and provides insight into the logical significance of learned network parameters. We provide a theoretical basis for this parameterization and demonstrate its effectiveness and utility by successfully applying our model to five classification problems from the UCI Machine Learning Repository.
Submitted 11 September, 2017; v1 submitted 28 August, 2017;
originally announced August 2017.
-
On the Evaluation of Silicon Photomultipliers for Use as Photosensors in Liquid Xenon Detectors
Authors:
Benjamin Godfrey,
Tyler Anderson,
Earl Breedon,
Jacob Cutter,
Navneet Dhaliwal,
Olivia Dalager,
Seth Hillbrand,
Michael Irving,
Aaron Manalaysay,
Juan Montoya,
James Morad,
Christian Neher,
Dustin Stolp,
Mani Tripathi,
Ryan Wilson
Abstract:
Silicon photomultipliers (SiPMs) are potential solid-state alternatives to traditional photomultiplier tubes (PMTs) for single-photon detection. In this paper, we report on evaluating SensL MicroFC-10035-SMT SiPMs for their suitability as PMT replacements. The devices were successfully operated in a liquid-xenon detector, which demonstrates that SiPMs can be used in noble element time projection chambers as photosensors. The devices were also cooled down to 170 K to observe dark count dependence on temperature. No dependencies on the direction of an applied 3.2 kV/cm electric field were observed with respect to dark-count rate, gain, or photon detection efficiency.
Submitted 3 April, 2018; v1 submitted 16 June, 2017;
originally announced June 2017.
-
Neural Decomposition of Time-Series Data for Effective Generalization
Authors:
Luke B. Godfrey,
Michael S. Gashler
Abstract:
We present a neural network technique for the analysis and extrapolation of time-series data called Neural Decomposition (ND). Units with a sinusoidal activation function are used to perform a Fourier-like decomposition of training samples into a sum of sinusoids, augmented by units with nonperiodic activation functions to capture linear trends and other nonperiodic components. We show how careful weight initialization can be combined with regularization to form a simple model that generalizes well. Our method generalizes effectively on the Mackey-Glass series, a dataset of unemployment rates as reported by the U.S. Department of Labor Statistics, a time-series of monthly international airline passengers, the monthly ozone concentration in downtown Los Angeles, and an unevenly sampled time-series of oxygen isotope measurements from a cave in north India. We find that ND outperforms popular time-series forecasting techniques including LSTM, echo state networks, ARIMA, SARIMA, SVR with a radial basis function, and Gashler and Ashmore's model.
Submitted 5 June, 2017; v1 submitted 25 May, 2017;
originally announced May 2017.
-
Elimination of Numerical Cherenkov Instability in flowing-plasma Particle-In-Cell simulations by using Galilean coordinates
Authors:
Remi Lehe,
Manuel Kirchen,
Brendan B. Godfrey,
Andreas R. Maier,
Jean-Luc Vay
Abstract:
Particle-In-Cell (PIC) simulations of relativistic flowing plasmas are of key interest to several fields of physics (including e.g. laser-wakefield acceleration, when viewed in a Lorentz-boosted frame), but remain sometimes infeasible due to the well-known numerical Cherenkov instability (NCI). In this article, we show that, for a plasma drifting at a uniform relativistic velocity, the NCI can be eliminated by simply integrating the PIC equations in Galilean coordinates that follow the plasma (also sometimes known as comoving coordinates) within a spectral analytical framework. The elimination of the NCI is verified empirically and confirmed by a theoretical analysis of the instability. Moreover, it is shown that this method is applicable both to Cartesian geometry and to cylindrical geometry with azimuthal Fourier decomposition.
Submitted 31 July, 2016;
originally announced August 2016.
-
Stable discrete representation of relativistically drifting plasmas
Authors:
Manuel Kirchen,
Remi Lehe,
Brendan B. Godfrey,
Irene Dornmair,
Soeren Jalas,
Kevin Peters,
Jean-Luc Vay,
Andreas R. Maier
Abstract:
Representing the electrodynamics of relativistically drifting particle ensembles in discrete, co-propagating Galilean coordinates enables the derivation of a Particle-in-Cell algorithm that is intrinsically free of the Numerical Cherenkov Instability, for plasmas flowing at a uniform velocity. Application of the method is shown by modeling plasma accelerators in a Lorentz-transformed optimal frame of reference.
Submitted 31 July, 2016;
originally announced August 2016.
-
A continuum among logarithmic, linear, and exponential functions, and its potential to improve generalization in neural networks
Authors:
Luke B. Godfrey,
Michael S. Gashler
Abstract:
We present the soft exponential activation function for artificial neural networks that continuously interpolates between logarithmic, linear, and exponential functions. This activation function is simple, differentiable, and parameterized so that it can be trained as the rest of the network is trained. We hypothesize that soft exponential has the potential to improve neural network learning, as it can exactly calculate many natural operations that typical neural networks can only approximate, including addition, multiplication, inner product, distance, polynomials, and sinusoids.
Submitted 3 February, 2016;
originally announced February 2016.
-
A spectral, quasi-cylindrical and dispersion-free Particle-In-Cell algorithm
Authors:
Remi Lehe,
Manuel Kirchen,
Igor A. Andriyash,
Brendan B. Godfrey,
Jean-Luc Vay
Abstract:
We propose a spectral Particle-In-Cell (PIC) algorithm that is based on the combination of a Hankel transform and a Fourier transform. For physical problems that have close-to-cylindrical symmetry, this algorithm can be much faster than full 3D PIC algorithms. In addition, unlike standard finite-difference PIC codes, the proposed algorithm is free of numerical dispersion. This algorithm is benchmarked in several situations that are of interest for laser-plasma interactions. These benchmarks show that it avoids a number of numerical artifacts that would otherwise affect the physics in a standard PIC algorithm, including the zero-order numerical Cherenkov effect.
Submitted 2 March, 2016; v1 submitted 16 July, 2015;
originally announced July 2015.
-
Towards a Speed of Light Internet
Authors:
Ankit Singla,
Balakrishnan Chandrasekaran,
P. Brighten Godfrey,
Bruce Maggs
Abstract:
In principle, a network can transfer data at nearly the speed of light. Today's Internet, however, is much slower: our measurements show that latencies are typically more than one, and often more than two orders of magnitude larger than the lower bound implied by the speed of light. Closing this gap would not only add value to today's Internet applications, but might also open the door to exciting new applications. Thus, we propose a grand challenge for the networking research community: building a speed-of-light Internet. Towards addressing this goal, we begin by investigating the causes of latency inflation in the Internet across the network stack. Our analysis reveals that while protocol overheads, which have dominated the community's attention, are indeed important, infrastructural inefficiencies are a significant and under-explored problem. Thus, we propose a radical, yet surprisingly low-cost approach to mitigating latency inflation at the lowest layers and building a nearly speed-of-light Internet infrastructure.
Submitted 13 May, 2015;
originally announced May 2015.
-
Improved Numerical Cherenkov Instability Suppression in the Generalized PSTD PIC Algorithm
Authors:
Brendan B. Godfrey,
Jean-Luc Vay
Abstract:
The family of generalized Pseudo-Spectral Time Domain (including the Pseudo-Spectral Analytical Time Domain) Particle-in-Cell algorithms offers substantial versatility for simulating particle beams and plasmas, and well-written codes using these algorithms run reasonably fast. When simulating relativistic beams and streaming plasmas in multiple dimensions, they are, however, subject to the numerical Cherenkov instability. Previous studies have shown that instability growth rates can be reduced substantially by slightly modifying the transverse fields as seen by the streaming particles. Here, we offer an approach that completely eliminates the fundamental mode of the numerical Cherenkov instability while minimizing the transverse field corrections. The procedure, numerically computed residual growth rates (from weaker, higher order instability aliases), and comparisons with WARP simulations are presented. In some instances, there are no numerical instabilities whatsoever, at least in the linear regime.
Submitted 4 February, 2015;
originally announced February 2015.
-
PCC: Re-architecting Congestion Control for Consistent High Performance
Authors:
Mo Dong,
Qingxi Li,
Doron Zarchy,
Brighten Godfrey,
Michael Schapira
Abstract:
TCP and its variants have suffered from surprisingly poor performance for decades. We argue the TCP family has little hope to achieve consistent high performance due to a fundamental architectural deficiency: hardwiring packet-level events to control responses without understanding the real performance result of its actions. We propose Performance-oriented Congestion Control (PCC), a new congestion control architecture in which each sender continuously observes the connection between its actions and empirically experienced performance, enabling it to consistently adopt actions that result in high performance. We prove that PCC converges to a stable and fair equilibrium. Across many real-world and challenging environments, PCC shows consistent and often 10x performance improvement, with better fairness and stability than TCP. PCC requires no router hardware support or new packet format.
Submitted 10 October, 2014; v1 submitted 24 September, 2014;
originally announced September 2014.
-
Review and Recent Advances in PIC Modeling of Relativistic Beams and Plasmas
Authors:
Brendan B. Godfrey
Abstract:
Particle-in-Cell (PIC) simulation codes have wide applicability to first-principles modeling of multidimensional nonlinear plasma phenomena, including wake-field accelerators. This review addresses both finite difference and pseudo-spectral PIC algorithms, including numerical instability suppression and generalizations of the spectral field solver.
Submitted 5 August, 2014;
originally announced August 2014.
-
Measuring and Understanding Throughput of Network Topologies
Authors:
Sangeetha Abdu Jyothi,
Ankit Singla,
P. Brighten Godfrey,
Alexandra Kolla
Abstract:
High throughput is of particular interest in data center and HPC networks. Although myriad network topologies have been proposed, a broad head-to-head comparison across topologies and across traffic patterns is absent, and the right way to compare worst-case throughput performance is a subtle problem.
In this paper, we develop a framework to benchmark the throughput of network topologies, using a two-pronged approach. First, we study performance on a variety of synthetic and experimentally-measured traffic matrices (TMs). Second, we show how to measure worst-case throughput by generating a near-worst-case TM for any given topology. We apply the framework to study the performance of these TMs in a wide range of network topologies, revealing insights into the performance of topologies with scaling, robustness of performance across TMs, and the effect of scattered workload placement. Our evaluation code is freely available.
Submitted 14 November, 2016; v1 submitted 11 February, 2014;
originally announced February 2014.
-
Suppressing the Numerical Cherenkov Instability in FDTD PIC Codes
Authors:
Brendan B. Godfrey,
Jean-Luc Vay
Abstract:
A procedure for largely suppressing the numerical Cherenkov instability in finite difference time-domain (FDTD) particle-in-cell (PIC) simulations of cold, relativistic beams is derived, and residual growth rates computed and compared with WARP code simulation results. Sample laser-plasma acceleration simulation output is provided to further validate the new procedure.
Submitted 4 January, 2014;
originally announced January 2014.
-
High Throughput Data Center Topology Design
Authors:
Ankit Singla,
P. Brighten Godfrey,
Alexandra Kolla
Abstract:
With high throughput networks acquiring a crucial role in supporting data-intensive applications, a variety of data center network topologies have been proposed to achieve high capacity at low cost. While this literature explores a large number of design points, even in the limited case of a network of identical switches, no proposal has been able to claim any notion of optimality. The case of heterogeneous networks, incorporating multiple line-speeds and port-counts as data centers grow over time, introduces even greater complexity.
In this paper, we present the first non-trivial upper-bound on network throughput under uniform traffic patterns for any topology with identical switches. We then show that random graphs achieve throughput surprisingly close to this bound, within a few percent at the scale of a few thousand servers. Apart from demonstrating that homogeneous topology design may be reaching its limits, this result also motivates our use of random graphs as building blocks to explore the design of heterogeneous networks. Given a heterogeneous pool of network switches, through experiments and analysis, we explore how the distribution of servers across switches and the interconnection of switches affect network throughput. We apply these insights to a real-world heterogeneous data center topology, VL2, demonstrating as much as 43% higher throughput with the same equipment.
Submitted 12 February, 2014; v1 submitted 26 September, 2013;
originally announced September 2013.
-
Shortest Paths in Microseconds
Authors:
Rachit Agarwal,
Matthew Caesar,
P. Brighten Godfrey,
Ben Y. Zhao
Abstract:
Computing shortest paths is a fundamental primitive for several social network applications including socially-sensitive ranking, location-aware search, social auctions and social network privacy. Since these applications compute paths in response to a user query, the goal is to minimize latency while maintaining feasible memory requirements. We present ASAP, a system that achieves this goal by exploiting the structure of social networks.
ASAP preprocesses a given network to compute and store a partial shortest path tree (PSPT) for each node. The PSPTs have the property that for any two nodes, each edge along the shortest path is with high probability contained in the PSPT of at least one of the nodes. We show that the structure of social networks enables the PSPT of each node to be an extremely small fraction of the entire network; hence, PSPTs can be stored efficiently and each shortest path can be computed extremely quickly.
For a real network with 5 million nodes and 69 million edges, ASAP computes a shortest path for most node pairs in less than 49 microseconds per pair. ASAP, unlike any previous technique, also computes hundreds of paths (along with corresponding distances) between any node pair in less than 100 microseconds. Finally, ASAP admits efficient implementation on distributed programming frameworks like MapReduce.
Submitted 3 September, 2013;
originally announced September 2013.
-
Numerical Stability Improvements for the Pseudo-Spectral EM PIC Algorithm
Authors:
Brendan B. Godfrey,
Jean-Luc Vay,
Irving Haber
Abstract:
The pseudo-spectral analytical time-domain (PSATD) particle-in-cell (PIC) algorithm solves the vacuum Maxwell's equations exactly, has no Courant time-step limit (as conventionally defined), and offers substantial flexibility in plasma and particle beam simulations. It is, however, not free of the usual numerical instabilities, including the numerical Cherenkov instability, when applied to relativistic beam simulations. This paper presents several approaches that, when combined with digital filtering, almost completely eliminate the numerical Cherenkov instability. The paper also investigates the numerical stability of the PSATD algorithm at low beam energies.
Submitted 31 August, 2013;
originally announced September 2013.
-
Low latency via redundancy
Authors:
Ashish Vulimiri,
P. Brighten Godfrey,
Radhika Mittal,
Justine Sherry,
Sylvia Ratnasamy,
Scott Shenker
Abstract:
Low latency is critical for interactive networked applications. But while we know how to scale systems to increase capacity, reducing latency --- especially the tail of the latency distribution --- can be much more difficult. In this paper, we argue that the use of redundancy is an effective way to convert extra capacity into reduced latency. By initiating redundant operations across diverse resources and using the first result which completes, redundancy improves a system's latency even under exceptional conditions. We study the tradeoff with added system utilization, characterizing the situations in which replicating all tasks reduces mean latency. We then demonstrate empirically that replicating all operations can result in significant mean and tail latency reduction in real-world systems including DNS queries, database servers, and packet forwarding within networks.
Submitted 16 June, 2013;
originally announced June 2013.
-
A cost-benefit analysis of low latency via added utilization
Authors:
Ashish Vulimiri,
P. Brighten Godfrey,
Sri Varsha Gorge,
Zitian Liu,
Scott Shenker
Abstract:
Several recently proposed techniques achieve latency reduction by trading it off for some amount of additional bandwidth usage. But how would one quantify whether the tradeoff is actually beneficial in a given system? We develop an economic cost vs. benefit analysis for answering this question. We use the analysis to derive a benchmark for wide-area client-server applications, and demonstrate how it can be applied to reason about a particular latency saving technique --- redundant DNS requests.
Submitted 4 December, 2014; v1 submitted 14 June, 2013;
originally announced June 2013.
-
Numerical stability analysis of the Pseudo-Spectral Analytical Time-Domain PIC algorithm
Authors:
Brendan B. Godfrey,
Jean-Luc Vay,
Irving Haber
Abstract:
The pseudo-spectral analytical time-domain (PSATD) particle-in-cell (PIC) algorithm solves the vacuum Maxwell's equations exactly, has no Courant time-step limit (as conventionally defined), and offers substantial flexibility in plasma and particle beam simulations. It is, however, not free of the usual numerical instabilities, including the numerical Cherenkov instability, when applied to relativistic beam simulations. This paper derives and solves the numerical dispersion relation for the PSATD algorithm and compares the results with corresponding behavior of the more conventional pseudo-spectral time-domain (PSTD) and finite difference time-domain (FDTD) algorithms. In general, PSATD offers superior stability properties over a reasonable range of time steps. More importantly, one version of the PSATD algorithm, when combined with digital filtering, is almost completely free of the numerical Cherenkov instability for time steps (scaled to the speed of light) comparable to or smaller than the axial cell size.
Submitted 5 June, 2013; v1 submitted 31 May, 2013;
originally announced May 2013.
-
Scalable Routing on Flat Names
Authors:
Ankit Singla,
P. Brighten Godfrey,
Kevin Fall,
Gianluca Iannaccone,
Sylvia Ratnasamy
Abstract:
We introduce a protocol which routes on flat, location-independent identifiers with guaranteed scalability and low stretch. Our design builds on theoretical advances in the area of compact routing, and is the first to realize these guarantees in a dynamic distributed setting.
Submitted 25 February, 2013;
originally announced February 2013.
-
Numerical stability of relativistic beam multidimensional PIC simulations employing the Esirkepov algorithm
Authors:
Brendan B. Godfrey,
Jean-Luc Vay
Abstract:
Rapidly growing numerical instabilities routinely occur in multidimensional particle-in-cell computer simulations of plasma-based particle accelerators, astrophysical phenomena, and relativistic charged particle beams. Reducing instability growth to acceptable levels has necessitated higher resolution grids, high-order field solvers, current filtering, etc., except for certain ratios of the time step to the axial cell size, for which numerical growth rates and saturation levels are reduced substantially. This paper derives and solves the cold beam dispersion relation for numerical instabilities in multidimensional, relativistic, electromagnetic particle-in-cell programs employing either the standard or the Cole-Karkkainen finite difference field solver on a staggered mesh and the common Esirkepov current-gathering algorithm. Good overall agreement is achieved with previously reported results of the WARP code. In particular, the existence of select time steps for which instabilities are minimized is explained. Additionally, an alternative field interpolation algorithm is proposed for which instabilities are almost completely eliminated for a particular time step in ultra-relativistic simulations.
Submitted 1 November, 2012;
originally announced November 2012.
-
On the Resilience of Routing Tables
Authors:
Joan Feigenbaum,
Brighten Godfrey,
Aurojit Panda,
Michael Schapira,
Scott Shenker,
Ankit Singla
Abstract:
Many modern network designs incorporate "failover" paths into routers' forwarding tables. We initiate the theoretical study of the conditions under which such resilient routing tables can guarantee delivery of packets.
Submitted 3 August, 2012; v1 submitted 16 July, 2012;
originally announced July 2012.
-
Finishing Flows Quickly with Preemptive Scheduling
Authors:
Chi-Yao Hong,
Matthew Caesar,
P. Brighten Godfrey
Abstract:
Today's data centers face extreme challenges in providing low latency. However, fair sharing, a principle commonly adopted in current congestion control protocols, is far from optimal for satisfying latency requirements.
We propose Preemptive Distributed Quick (PDQ) flow scheduling, a protocol designed to complete flows quickly and meet flow deadlines. PDQ enables flow preemption to approximate a range of scheduling disciplines. For example, PDQ can emulate a shortest job first algorithm to give priority to the short flows by pausing the contending flows. PDQ borrows ideas from centralized scheduling disciplines and implements them in a fully distributed manner, making it scalable to today's data centers. Further, we develop a multipath version of PDQ to exploit path diversity.
Through extensive packet-level and flow-level simulation, we demonstrate that PDQ significantly outperforms TCP, RCP and D3 in data center environments. We further show that PDQ is stable, resilient to packet loss, and preserves nearly all its performance gains even given inaccurate flow information.
Submitted 12 June, 2012; v1 submitted 10 June, 2012;
originally announced June 2012.
-
Shortest Paths in Less Than a Millisecond
Authors:
Rachit Agarwal,
Matthew Caesar,
P. Brighten Godfrey,
Ben Y. Zhao
Abstract:
We consider the problem of answering point-to-point shortest path queries on massive social networks. The goal is to answer queries within tens of milliseconds while minimizing the memory requirements. We present a technique that achieves this goal for an extremely large fraction of path queries by exploiting the structure of the social networks.
Using evaluations on real-world datasets, we argue that our technique offers a unique trade-off between latency, memory and accuracy. For instance, for the LiveJournal social network (roughly 5 million nodes and 69 million edges), our technique can answer 99.9% of the queries in less than a millisecond. In comparison to storing all pair shortest paths, our technique requires at least 550x less memory; the average query time is roughly 365 microseconds --- 430x faster than the state-of-the-art shortest path algorithm. Furthermore, the relative performance of our technique improves with the size (and density) of the network. For the Orkut social network (3 million nodes and 220 million edges), for instance, our technique is roughly 2588x faster than the state-of-the-art algorithm for computing shortest paths.
Submitted 6 June, 2012;
originally announced June 2012.
-
Faster Approximate Distance Queries and Compact Routing in Sparse Graphs
Authors:
Rachit Agarwal,
P. Brighten Godfrey,
Sariel Har-Peled
Abstract:
A distance oracle is a compact representation of the shortest distance matrix of a graph. It can be queried to approximate shortest paths between any pair of vertices. Any distance oracle that returns paths of worst-case stretch $(2k-1)$ must require space $Ω(n^{1 + 1/k})$ for graphs of $n$ nodes. The hard cases that enforce this lower bound are, however, rather dense graphs with average degree $Ω(n^{1/k})$.
We present distance oracles that, for sparse graphs, substantially break the lower bound barrier at the expense of higher query time. For any $1 \leq α \leq n$, our distance oracles can return stretch 2 paths using $O(m + n^2/α)$ space and stretch 3 paths using $O(m + n^2/α^2)$ space, at the expense of $O(αm/n)$ query time. By setting appropriate values of $α$, we get the first distance oracles that have size linear in the size of the graph, and return constant stretch paths in non-trivial query time. The query time can be further reduced to $O(α)$, by using an additional $O(mα)$ space for all our distance oracles, or at the cost of a small constant additive stretch.
We use our stretch 2 distance oracle to present the first compact routing scheme with worst-case stretch 2. Any compact routing scheme with stretch less than 2 must require linear memory at some nodes even for sparse graphs; our scheme, hence, achieves the optimal stretch with non-trivial memory requirements. Moreover, supported by large-scale simulations on graphs including the AS-level Internet graph, we argue that our stretch-2 scheme would be simple and efficient to implement as a distributed compact routing protocol.
Submitted 12 January, 2012;
originally announced January 2012.
-
Slick Packets
Authors:
Giang T. K. Nguyen,
Rachit Agarwal,
Junda Liu,
Matthew Caesar,
P. Brighten Godfrey,
Scott Shenker
Abstract:
Source-controlled routing has been proposed as a way to improve the flexibility of future network architectures and to simplify the data plane. However, if a packet specifies its path, this precludes fast local re-routing within the network. We propose SlickPackets, a novel solution that allows packets to slip around failures by specifying alternate paths in their headers, in the form of compactly-encoded directed acyclic graphs. We show that this can be accomplished with reasonably small packet headers for real network topologies, and results in responsiveness to failures that is competitive with past approaches that require much more state within the network. Our approach thus enables fast failure response while preserving the benefits of source-controlled routing.
Submitted 8 January, 2012;
originally announced January 2012.
-
Jellyfish: Networking Data Centers Randomly
Authors:
Ankit Singla,
Chi-Yao Hong,
Lucian Popa,
P. Brighten Godfrey
Abstract:
Industry experience indicates that the ability to incrementally expand data centers is essential. However, existing high-bandwidth network designs have rigid structure that interferes with incremental expansion. We present Jellyfish, a high-capacity network interconnect, which, by adopting a random graph topology, lends itself naturally to incremental expansion. Somewhat surprisingly, Jellyfish is more cost-efficient than a fat-tree: a Jellyfish interconnect built using the same equipment as a fat-tree supports as many as 25% more servers at full capacity at the scale of a few thousand nodes, and this advantage improves with scale. Jellyfish also allows great flexibility in building networks with different degrees of oversubscription. However, Jellyfish's unstructured design brings new challenges in routing, physical layout, and wiring. We describe and evaluate approaches that resolve these challenges effectively, indicating that Jellyfish could be deployed in today's data centers.
Submitted 20 April, 2012; v1 submitted 7 October, 2011;
originally announced October 2011.
-
BGP Stability is Precarious
Authors:
P. Brighten Godfrey
Abstract:
We note a fact which is simple, but may be useful for the networking research community: essentially any change to BGP's decision process can cause divergence --- or convergence when BGP would otherwise diverge.
Submitted 31 July, 2011;
originally announced August 2011.
-
Network Coding for Distributed Storage Systems
Authors:
Alexandros G. Dimakis,
P. Brighten Godfrey,
Yunnan Wu,
Martin J. Wainwright,
Kannan Ramchandran
Abstract:
Distributed storage systems provide reliable access to data through redundancy spread over individually unreliable nodes. Application scenarios include data centers, peer-to-peer storage systems, and storage in wireless networks. Storing data using an erasure code, in fragments spread across nodes, requires less redundancy than simple replication for the same level of reliability. However, since fragments must be periodically replaced as nodes fail, a key question is how to generate encoded fragments in a distributed way while transferring as little data as possible across the network.
For an erasure coded system, a common practice to repair from a node failure is for a new node to download subsets of data stored at a number of surviving nodes, reconstruct a lost coded block using the downloaded data, and store it at the new node. We show that this procedure is sub-optimal. We introduce the notion of regenerating codes, which allow a new node to download functions of the stored data from the surviving nodes. We show that regenerating codes can significantly reduce the repair bandwidth. Further, we show that there is a fundamental tradeoff between storage and repair bandwidth which we theoretically characterize using flow arguments on an appropriately constructed graph. By invoking constructive results in network coding, we introduce regenerating codes that can achieve any point in this optimal tradeoff.
Submitted 5 March, 2008;
originally announced March 2008.
-
Network Coding for Distributed Storage Systems
Authors:
Alexandros G. Dimakis,
P. Brighten Godfrey,
Martin J. Wainwright,
Kannan Ramchandran
Abstract:
Peer-to-peer distributed storage systems provide reliable access to data through redundancy spread over nodes across the Internet. A key goal is to minimize the amount of bandwidth used to maintain that redundancy. Storing a file using an erasure code, in fragments spread across nodes, promises to require less redundancy and hence less maintenance bandwidth than simple replication to provide the same level of reliability. However, since fragments must be periodically replaced as nodes fail, a key question is how to generate a new fragment in a distributed way while transferring as little data as possible across the network.
In this paper, we introduce a general technique to analyze storage architectures that combine any form of coding and replication, and we present two new schemes for maintaining redundancy using erasure codes. First, we show how to optimally generate MDS fragments directly from existing fragments in the system. Second, we introduce a new scheme called Regenerating Codes, which use slightly larger fragments than MDS but have lower overall bandwidth use. We also show through simulation that in realistic environments, Regenerating Codes can reduce maintenance bandwidth use by 25 percent or more compared with the best previous design (a hybrid of replication and erasure codes) while simplifying system architecture.
Submitted 2 February, 2007;
originally announced February 2007.