-
Towards Fast, Adaptive, and Hardware-Assisted User-Space Scheduling
Authors:
Lisa Li,
Nikita Lazarev,
David Koufaty,
Yijun Yin,
Andy Anderson,
Zhiru Zhang,
Edward Suh,
Kostis Kaffes,
Christina Delimitrou
Abstract:
Modern datacenter applications are prone to high tail latencies since their requests typically follow highly dispersive distributions. Delivering fast interrupts is essential to reducing tail latency. Prior work has proposed both OS- and system-level solutions to reduce tail latencies for microsecond-scale workloads through better scheduling. Unfortunately, existing approaches, like customized dataplane OSes, require significant OS changes, experience scalability limitations, or do not reach the full performance capabilities hardware offers.
The emergence of new hardware features like UINTR has exposed new opportunities to rethink the design paradigms and abstractions of traditional scheduling systems. We propose LibPreemptible, a preemptive user-level threading library that is flexible, lightweight, and adaptive. LibPreemptible was built with a set of optimizations: LibUtimer for scalability, a deadline-oriented API for flexible policies, and a time-quantum controller for adaptiveness. Compared to the prior state-of-the-art scheduling system Shinjuku, our system achieves significant tail latency and throughput improvements for various workloads without modifying the kernel. We also demonstrate the flexibility of LibPreemptible across scheduling policies for real applications experiencing varying load levels and characteristics.
Submitted 11 November, 2023; v1 submitted 5 August, 2023;
originally announced August 2023.
-
Small Spacecraft for Global Greenhouse Gas Monitoring
Authors:
Victoria Mayorova,
Andrey Morozov,
Iliya Golyak,
Nikita Lazarev,
Valeriia Melnikova,
Dmitry Rachkin,
Victor Svirin,
Stepan Tenenbaum,
Igor Fufurin
Abstract:
This work analyzes the capabilities of constellations and small spacecraft developed using CubeSat technology to address promising problems in Earth remote sensing of greenhouse gas emissions. The paper presents the scientific needs for such tasks, followed by descriptions and discussions of the application of micro-technology in both the small satellite platform design and the payload design. An overview of analogous spacecraft is given. The design of a new spacecraft is introduced for determining the oxygen and carbon dioxide concentration in the air column along the spacecraft's line of sight when it is illuminated by reflected sunlight. A mock-up of the device was built for greenhouse gas remote sensing, with a Fourier Transform Infrared (FTIR) spectroradiometer placed in the small spacecraft design. The results of long-term measurements of greenhouse gas concentrations using the developed Fourier spectrometer mock-up are presented.
Submitted 15 December, 2022;
originally announced December 2022.
-
A Hardware-Software Stack for Serverless Edge Swarms
Authors:
Liam Patterson,
David Pigorovsky,
Brian Dempsey,
Nikita Lazarev,
Aditya Shah,
Clara Steinhoff,
Ariana Bruno,
Justin Hu,
Christina Delimitrou
Abstract:
Swarms of autonomous devices are increasing in ubiquity and size, making the need for rethinking their hardware-software system stack critical.
We present HiveMind, the first swarm coordination platform that enables programmable execution of complex task workflows between cloud and edge resources in a performant and scalable manner. HiveMind is a software-hardware platform that includes a domain-specific language to simplify programmability of cloud-edge applications, a program synthesis tool to automatically explore task placement strategies, a centralized controller that leverages serverless computing to elastically scale cloud resources, and a reconfigurable hardware acceleration fabric for network and remote memory accesses.
We design and build the full end-to-end HiveMind system on two real edge swarms composed of drones and robotic cars. We quantify the opportunities and challenges serverless introduces to edge applications, as well as the trade-offs between centralized and distributed coordination. We show that HiveMind achieves significantly better performance predictability and battery efficiency compared to existing centralized and decentralized platforms, while also incurring lower network traffic. Using both real systems and a validated simulator, we show that HiveMind can scale to thousands of edge devices without sacrificing performance or efficiency, demonstrating that centralized platforms can be both scalable and performant.
Submitted 29 December, 2021;
originally announced December 2021.
-
A Roadmap for Enabling a Future-Proof In-Network Computing Data Plane Ecosystem
Authors:
Daehyeok Kim,
Nikita Lazarev,
Tommy Tracy,
Farzana Siddique,
Hun Namkung,
James C. Hoe,
Vyas Sekar,
Kevin Skadron,
Zhiru Zhang,
Srinivasan Seshan
Abstract:
As the vision of in-network computing becomes more mature, we see two parallel evolutionary trends. First, we see the evolution of richer, more demanding applications that require capabilities beyond programmable switching ASICs. Second, we see the evolution of diverse data plane technologies with many other future capabilities on the horizon. While some point solutions exist to tackle the intersection of these trends, we see several ecosystem-level disconnects today; e.g., the need to refactor applications for new data planes, lack of systematic guidelines to inform the development of future data plane capabilities, and lack of holistic runtime frameworks for network operators. In this paper, we use a simple-yet-instructive emerging application-data plane combination to highlight these disconnects. Drawing on these lessons, we sketch a high-level roadmap and guidelines for the community to tackle these to create a more thriving "future-proof" data plane ecosystem.
Submitted 8 November, 2021;
originally announced November 2021.
-
Dagger: Accelerating RPCs in Cloud Microservices Through Tightly-Coupled Reconfigurable NICs
Authors:
Nikita Lazarev,
Shaojie Xiang,
Neil Adit,
Zhiru Zhang,
Christina Delimitrou
Abstract:
The ongoing shift of cloud services from monolithic designs to microservices creates high demand for efficient and high performance datacenter networking stacks, optimized for fine-grained workloads. Commodity networking systems based on software stacks and peripheral NICs introduce high overheads when it comes to delivering small messages.
We present Dagger, a hardware acceleration fabric for cloud RPCs based on FPGAs, where the accelerator is closely coupled with the host processor over a configurable memory interconnect. The three key design principles of Dagger are: (1) offloading the entire RPC stack to an FPGA-based NIC, (2) leveraging memory interconnects instead of PCIe buses as the interface with the host CPU, and (3) making the acceleration fabric reconfigurable, so it can accommodate the diverse needs of microservices. We show that the combination of these principles significantly improves the efficiency and performance of cloud RPC systems while preserving their generality. Dagger achieves 1.3-3.8x higher per-core RPC throughput compared to both highly optimized software stacks and systems using specialized RDMA adapters. It also scales up to 84 Mrps with 8 threads on 4 CPU cores, while maintaining state-of-the-art μs-scale tail latency. We also demonstrate that large third-party applications, like memcached and MICA KVS, can be easily ported to Dagger with minimal changes to their codebase, bringing their median and tail KVS access latency down to 2.8-3.5 μs and 5.4-7.8 μs, respectively. Finally, we show that Dagger is beneficial for multi-tier end-to-end microservices with different threading models by evaluating it using an 8-tier application implementing a flight check-in service.
Submitted 2 June, 2021;
originally announced June 2021.
-
Dagger: Towards Efficient RPCs in Cloud Microservices with Near-Memory Reconfigurable NICs
Authors:
Nikita Lazarev,
Neil Adit,
Shaojie Xiang,
Zhiru Zhang,
Christina Delimitrou
Abstract:
Cloud applications are increasingly relying on hundreds of loosely coupled microservices to complete user requests that meet an application's end-to-end QoS requirements. Communication time between services accounts for a large fraction of the end-to-end latency and can introduce performance unpredictability and QoS violations. This paper presents our early work on Dagger, a hardware acceleration platform for networking, designed specifically with the unique qualities of microservices in mind. The Dagger architecture relies on an FPGA-based NIC, closely coupled with the processor over a configurable memory interconnect, designed to offload and accelerate RPC stacks. Unlike traditional cloud systems that use PCIe links as the NIC I/O interface, we leverage memory-interconnected FPGAs as networking devices to provide the efficiency, transparency, and programmability needed for fine-grained microservices. We show that this considerably improves CPU utilization and performance for cloud RPCs.
Submitted 11 September, 2020; v1 submitted 16 July, 2020;
originally announced July 2020.
-
Exactly solvable model of sliding in metallic glass
Authors:
Nikolai Lazarev,
Alexander Bakai
Abstract:
At low temperature, T -> 0, the yield stress of a perfect crystal is equal to its so-called theoretical strength. The yield stress of non-perfect crystals is controlled by the stress threshold of dislocation mobility. A non-crystalline solid has neither an ideal structure nor gliding dislocations. Its yield stress, i.e. the stress at which macroscopic inelastic deformation starts, depends on the distribution of the local critical stresses, attributed to each atomic site, at which local inelastic deformation occurs. We describe an exactly solvable model of planar layer strength and sliding with an arbitrary homogeneous distribution of local critical stresses. The macroscopic stress threshold of athermal sliding is found. The kinetics of thermally activated creep of the sliding layer is described. The rate of thermally activated sliding is tightly connected with the parameters of the low-temperature strength. The sliding activation volume scales with the applied external stress as ~ σ^-β, where β<1. The proposed model accounts for the mechanisms and the yield stress of the low-temperature deformation of polycluster metallic glasses, since the intercluster boundaries of a polycluster metallic glass are natural sliding layers of the described type.
Submitted 30 June, 2011;
originally announced June 2011.