Creating a content delivery network for general science on the internet backbone using XCaches
Authors:
Edgar Fajardo,
Marian Zvada,
Derek Weitzel,
Mats Rynge,
John Hicks,
Mat Selmeci,
Brian Lin,
Pascal Paschos,
Brian Bockelman,
Igor Sfiligoi,
Andrew Hanushevsky,
Frank Würthwein
Abstract:
A general problem faced by opportunistic users computing on the grid is that delivering cycles is simpler than delivering data to those cycles. In this project we show how we integrated XRootD caches placed on the internet backbone to implement a content delivery network for general science workflows. We will show that for some workflows in science domains such as high energy physics and gravitational waves, among others, the combination of data reuse within the workflows and the use of caches increases CPU efficiency while decreasing network bandwidth use.
Submitted 28 September, 2020; v1 submitted 2 July, 2020;
originally announced July 2020.
StashCache: A Distributed Caching Federation for the Open Science Grid
Authors:
Derek Weitzel,
Marian Zvada,
Ilija Vukotic,
Rob Gardner,
Brian Bockelman,
Mats Rynge,
Edgar Fajardo Hernandez,
Brian Lin,
Matyas Selmeci
Abstract:
Data distribution for opportunistic users is challenging because they own neither the computing resources they are using nor any nearby storage. Users are motivated to use opportunistic computing to expand their data processing capacity, but they require storage and fast networking to distribute data to that processing. Because allowing opportunistic access to storage requires significant management overhead, resource providers rarely offer it. Additionally, to use opportunistic storage at several distributed sites, users must assume responsibility for maintaining their data. In this paper we present StashCache, a distributed caching federation that enables opportunistic users to utilize nearby opportunistic storage. StashCache comprises four components: data origins, redirectors, caches, and clients. StashCache has been deployed in the Open Science Grid for several years and has been used by many projects. Caches are deployed in geographically distributed locations across the U.S. and Europe. We will present the architecture of StashCache, as well as utilization information of the infrastructure. We will also present a performance analysis comparing distributed HTTP proxies with StashCache.
Submitted 16 May, 2019;
originally announced May 2019.