-
Kubernetes Deployment Options for On-Prem Clusters
Authors:
Lincoln Bryant,
Robert W. Gardner,
Feng** Hu,
David Jordan,
Ryan P. Taylor
Abstract:
Over the last decade, the Kubernetes container orchestration platform has become essential to many scientific workflows. Despite its popularity, deploying a production-ready Kubernetes cluster on-premises can be challenging for system administrators. Many of the proprietary integrations that application developers take for granted in commercial cloud environments must be replaced with alternatives…
▽ More
Over the last decade, the Kubernetes container orchestration platform has become essential to many scientific workflows. Despite its popularity, deploying a production-ready Kubernetes cluster on-premises can be challenging for system administrators. Many of the proprietary integrations that application developers take for granted in commercial cloud environments must be replaced with alternatives when deployed locally. This article will compare three popular deployment strategies for sites deploying Kubernetes on-premise: Kubeadm with Kubespray, OpenShift / OKD and Rancher via K3S/RKE2.
△ Less
Submitted 28 June, 2024;
originally announced July 2024.
-
Deep-learning-based clustering of OCT images for biomarker discovery in age-related macular degeneration (Pinnacle study report 4)
Authors:
Robbie Holland,
Rebecca Kaye,
Ahmed M. Hagag,
Oliver Leingang,
Thomas R. P. Taylor,
Hrvoje Bogunović,
Ursula Schmidt-Erfurth,
Hendrik P. N. Scholl,
Daniel Rueckert,
Andrew J. Lotery,
Sobha Sivaprasad,
Martin J. Menten
Abstract:
Diseases are currently managed by grading systems, where patients are stratified by grading systems into stages that indicate patient risk and guide clinical management. However, these broad categories typically lack prognostic value, and proposals for new biomarkers are currently limited to anecdotal observations. In this work, we introduce a deep-learning-based biomarker proposal system for the…
▽ More
Diseases are currently managed by grading systems, where patients are stratified by grading systems into stages that indicate patient risk and guide clinical management. However, these broad categories typically lack prognostic value, and proposals for new biomarkers are currently limited to anecdotal observations. In this work, we introduce a deep-learning-based biomarker proposal system for the purpose of accelerating biomarker discovery in age-related macular degeneration (AMD). It works by first training a neural network using self-supervised contrastive learning to discover, without any clinical annotations, features relating to both known and unknown AMD biomarkers present in 46,496 retinal optical coherence tomography (OCT) images. To interpret the discovered biomarkers, we partition the images into 30 subsets, termed clusters, that contain similar features. We then conduct two parallel 1.5-hour semi-structured interviews with two independent teams of retinal specialists that describe each cluster in clinical language. Overall, both teams independently identified clearly distinct characteristics in 27 of 30 clusters, of which 23 were related to AMD. Seven were recognised as known biomarkers already used in established grading systems and 16 depicted biomarker combinations or subtypes that are either not yet used in grading systems, were only recently proposed, or were unknown. Clusters separated incomplete from complete retinal atrophy, intraretinal from subretinal fluid and thick from thin choroids, and in simulation outperformed clinically-used grading systems in prognostic value. Overall, contrastive learning enabled the automatic proposal of AMD biomarkers that go beyond the set used by clinically established grading systems. Ultimately, we envision that equip** clinicians with discovery-oriented deep-learning tools can accelerate discovery of novel prognostic biomarkers.
△ Less
Submitted 12 March, 2024;
originally announced May 2024.
-
Federating distributed storage for clouds in ATLAS
Authors:
Frank Berghaus,
Kevin Casteels,
Alessandro Di Girolamo,
Colson Driemel,
Marcus Ebert,
Fabrizio Furano,
Fernado Galindo,
Mario Lassnig,
Colin Leavett-Brown,
Michael Paterson,
Cedric Serfon,
Rolf Seuster,
Randall Sobie,
Reda Tafirout,
Ryan Paul Taylor
Abstract:
Input data for applications that run in cloud computing centres can be stored at distant repositories, often with multiple copies of the popular data stored at many sites. Locating and retrieving the remote data can be challenging, and we believe that federating the storage can address this problem. A federation would locate the closest copy of the data on the basis of GeoIP information. Currently…
▽ More
Input data for applications that run in cloud computing centres can be stored at distant repositories, often with multiple copies of the popular data stored at many sites. Locating and retrieving the remote data can be challenging, and we believe that federating the storage can address this problem. A federation would locate the closest copy of the data on the basis of GeoIP information. Currently we are using the dynamic data federation Dynafed, a software solution developed by CERN IT. Dynafed supports several industry standards for connection protocols like Amazon's S3, Microsoft's Azure, as well as WebDAV and HTTP. Dynafed functions as an abstraction layer under which protocol-dependent authentication details are hidden from the user, requiring the user to only provide an X509 certificate. We have setup an instance of Dynafed and integrated it into the ATLAS data distribution management system. We report on the challenges faced during the installation and integration. We have tested ATLAS analysis jobs submitted by the PanDA production system and we report on our first experiences with its operation.
△ Less
Submitted 26 October, 2018;
originally announced October 2018.