-
SMaRTT-REPS: Sender-based Marked Rapidly-adapting Trimmed & Timed Transport with Recycled Entropies
Authors:
Tommaso Bonato,
Abdul Kabbani,
Daniele De Sensi,
Rong Pan,
Yanfang Le,
Costin Raiciu,
Mark Handley,
Timo Schneider,
Nils Blach,
Ahmad Ghalayini,
Daniel Alves,
Michael Papamichael,
Adrian Caulfield,
Torsten Hoefler
Abstract:
With the rapid growth of machine learning (ML) workloads in datacenters, existing congestion control (CC) algorithms fail to deliver the required performance at scale. ML traffic is bursty and bulk-synchronous and thus requires quick reaction and strong fairness. We show that existing CC algorithms that use delay as a main signal react too slowly and are not always fair. We design SMaRTT, a simple sender-based CC algorithm that combines delay, ECN, and optional packet trimming for fast and precise window adjustments. At the core of SMaRTT lies the novel QuickAdapt algorithm that accurately estimates the bandwidth at the receiver. We show how to combine SMaRTT with a new per-packet traffic load-balancing algorithm called REPS to effectively reroute packets around congested hotspots as well as flaky or failing links. Our evaluation shows that SMaRTT alone outperforms EQDS, Swift, BBR, and MPRDMA by up to 50% on modern datacenter networks.
Submitted 27 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.
-
Datacenter Ethernet and RDMA: Issues at Hyperscale
Authors:
Torsten Hoefler,
Duncan Roweth,
Keith Underwood,
Bob Alverson,
Mark Griswold,
Vahid Tabatabaee,
Mohan Kalkunte,
Surendra Anubolu,
Siyuan Shen,
Abdul Kabbani,
Moray McLaren,
Steve Scott
Abstract:
We observe that emerging artificial intelligence, high-performance computing, and storage workloads pose new challenges for large-scale datacenter networking. RDMA over Converged Ethernet (RoCE) was an attempt to adopt modern Remote Direct Memory Access (RDMA) features into existing Ethernet installations. Now, a decade later, we revisit RoCE's design points and conclude that several of its shortcomings must be addressed to fulfill the demands of hyperscale datacenters. We predict that both the datacenter and high-performance computing markets will converge and adopt modernized Ethernet-based high-performance networking solutions that will replace TCP and RoCE within a decade.
Submitted 15 April, 2023; v1 submitted 7 February, 2023; originally announced February 2023.
-
Flexible Network Bandwidth and Latency Provisioning in the Datacenter
Authors:
Vimalkumar Jeyakumar,
Abdul Kabbani,
Jeffrey C. Mogul,
Amin Vahdat
Abstract:
Predictably sharing the network is critical to achieving high utilization in the datacenter. Past work has focussed on providing bandwidth to endpoints, but often we want to allocate resources among multi-node services. In this paper, we present Parley, which provides service-centric minimum bandwidth guarantees, which can be composed hierarchically. Parley also supports service-centric weighted sharing of bandwidth in excess of these guarantees. Further, we show how to configure these policies so services can get low latencies even at high network load. We evaluate Parley on a multi-tiered oversubscribed network connecting 90 machines, each with a 10Gb/s network interface, and demonstrate that Parley is able to meet its goals.
Submitted 5 May, 2014; v1 submitted 3 May, 2014; originally announced May 2014.