Limiting Lamport Exposure to Distant Failures in Globally-Managed Distributed Systems
Authors:
Cristina Băsescu,
Georgia Fragkouli,
Enis Ceyhun Alp,
Michael F. Nowlan,
Jose M. Faleiro,
Gaylor Bosson,
Kelong Cong,
Pierluca Borsò-Tan,
Vero Estrada-Galiñanes,
Bryan Ford
Abstract:
Globalized computing infrastructures offer the convenience and elasticity of globally managed objects and services, but lack the resilience to distant failures that localized infrastructures such as private clouds provide. Providing both global management and resilience to distant failures, however, poses a fundamental problem for configuration services: How to discover a possibly migratory, stron…
▽ More
Globalized computing infrastructures offer the convenience and elasticity of globally managed objects and services, but lack the resilience to distant failures that localized infrastructures such as private clouds provide. Providing both global management and resilience to distant failures, however, poses a fundamental problem for configuration services: How to discover a possibly migratory, strongly-consistent service/object in a globalized infrastructure without dependencies on globalized state? Limix is the first metadata configuration service that addresses this problem. With Limix, global strongly-consistent data-plane services and objects are insulated from remote gray failures by ensuring that the definitive, strongly-consistent metadata for any object is always confined to the same region as the object itself. Limix guarantees availability bounds: any user can continue accessing any strongly consistent object that matters to the user located at distance $Δ$ away, insulated from failures outside a small multiple of $Δ$. We built a Limix metadata service based on CockroachDB. Our experiments on Internet-like networks and on AWS, using realistic trace-driven workloads, show that Limix enables global management and significantly improves availability over the state-of-the-art.
△ Less
Submitted 15 July, 2022; v1 submitted 3 May, 2014;
originally announced May 2014.