Fast General Distributed Transactions with Opacity using Global Time
Authors:
Alex Shamis,
Matthew Renzelmann,
Stanko Novakovic,
Georgios Chatzopoulos,
Anders T. Gjerdrum,
Dan Alistarh,
Aleksandar Dragojevic,
Dushyanth Narayanan,
Miguel Castro
Abstract:
Transactions can simplify distributed applications by hiding data distribution, concurrency, and failures from the application developer. Ideally the developer would see the abstraction of a single large machine that runs transactions sequentially and never fails. This requires the transactional subsystem to provide opacity (strict serializability for both committed and aborted transactions), as w…
▽ More
Transactions can simplify distributed applications by hiding data distribution, concurrency, and failures from the application developer. Ideally the developer would see the abstraction of a single large machine that runs transactions sequentially and never fails. This requires the transactional subsystem to provide opacity (strict serializability for both committed and aborted transactions), as well as transparent fault tolerance with high availability. As even the best abstractions are unlikely to be used if they perform poorly, the system must also provide high performance.
Existing distributed transactional designs either weaken this abstraction or are not designed for the best performance within a data center. This paper extends the design of FaRM - which provides strict serializability only for committed transactions - to provide opacity while maintaining FaRM's high throughput, low latency, and high availability within a modern data center. It uses timestamp ordering based on real time with clocks synchronized to within tens of microseconds across a cluster, and a failover protocol to ensure correctness across clock master failures. FaRM with opacity can commit 5.4 million neworder transactions per second when running the TPC-C transaction mix on 90 machines with 3-way replication.
△ Less
Submitted 25 June, 2020;
originally announced June 2020.
A1: A Distributed In-Memory Graph Database
Authors:
Chiranjeeb Buragohain,
Knut Magne Risvik,
Paul Brett,
Miguel Castro,
Wonhee Cho,
Joshua Cowhig,
Nikolas Gloy,
Karthik Kalyanaraman,
Richendra Khanna,
John Pao,
Matthew Renzelmann,
Alex Shamis,
Timothy Tan,
Shuheng Zheng
Abstract:
A1 is an in-memory distributed database used by the Bing search engine to support complex queries over structured data. The key enablers for A1 are availability of cheap DRAM and high speed RDMA (Remote Direct Memory Access) networking in commodity hardware. A1 uses FaRM as its underlying storage layer and builds the graph abstraction and query engine on top. The combination of in-memory storage a…
▽ More
A1 is an in-memory distributed database used by the Bing search engine to support complex queries over structured data. The key enablers for A1 are availability of cheap DRAM and high speed RDMA (Remote Direct Memory Access) networking in commodity hardware. A1 uses FaRM as its underlying storage layer and builds the graph abstraction and query engine on top. The combination of in-memory storage and RDMA access requires rethinking how data is allocated, organized and queried in a large distributed system. A single A1 cluster can store tens of billions of vertices and edges and support a throughput of 350+ million of vertex reads per second with end to end query latency in single digit milliseconds. In this paper we describe the A1 data model, RDMA optimized data structures and query execution.
△ Less
Submitted 12 April, 2020;
originally announced April 2020.