-
Using HEP experiment workflows for the benchmarking and accounting of WLCG computing resources
Authors:
Andrea Valassi,
Manfred Alef,
Jean-Michel Barbet,
Olga Datskova,
Riccardo De Maria,
Miguel Fontes Medeiros,
Domenico Giordano,
Costin Grigoras,
Christopher Hollowell,
Martina Javurkova,
Viktor Khristenko,
David Lange,
Michele Michelotto,
Lorenzo Rinaldi,
Andrea SciabĂ ,
Cas Van Der Laan
Abstract:
Benchmarking of CPU resources in WLCG has been based on the HEP-SPEC06 (HS06) suite for over a decade. It has recently become clear that HS06, which is based on real applications from non-HEP domains, no longer describes typical HEP workloads. The aim of the HEP-Benchmarks project is to develop a new benchmark suite for WLCG compute resources, based on real applications from the LHC experiments. B…
▽ More
Benchmarking of CPU resources in WLCG has been based on the HEP-SPEC06 (HS06) suite for over a decade. It has recently become clear that HS06, which is based on real applications from non-HEP domains, no longer describes typical HEP workloads. The aim of the HEP-Benchmarks project is to develop a new benchmark suite for WLCG compute resources, based on real applications from the LHC experiments. By construction, these new benchmarks are thus guaranteed to have a score highly correlated to the throughputs of HEP applications, and a CPU usage pattern similar to theirs. Linux containers and the CernVM-FS filesystem are the two main technologies enabling this approach, which had been considered impossible in the past. In this paper, we review the motivation, implementation and outlook of the new benchmark suite.
△ Less
Submitted 25 June, 2020; v1 submitted 3 April, 2020;
originally announced April 2020.
-
Scalable Global Grid catalogue for LHC Run3 and beyond
Authors:
M Martinez Pedreira,
C Grigoras
Abstract:
The AliEn (ALICE Environment) file catalogue is a global unique namespace providing map** between a UNIX-like logical name structure and the corresponding physical files distributed over 80 storage elements worldwide. Powerful search tools and hierarchical metadata information are integral parts of the system and are used by the Grid jobs as well as local users to store and access all files on t…
▽ More
The AliEn (ALICE Environment) file catalogue is a global unique namespace providing map** between a UNIX-like logical name structure and the corresponding physical files distributed over 80 storage elements worldwide. Powerful search tools and hierarchical metadata information are integral parts of the system and are used by the Grid jobs as well as local users to store and access all files on the Grid storage elements. The catalogue has been in production since 2005 and over the past 11 years has grown to more than 2 billion logical file names. The backend is a set of distributed relational databases, ensuring smooth growth and fast access. Due to the anticipated fast future growth, we are looking for ways to enhance the performance and scalability by simplifying the catalogue schema while kee** the functionality intact. We investigated different backend solutions, such as distributed key value stores, as replacement for the relational database. This contribution covers the architectural changes in the system, together with the technology evaluation, benchmark results and conclusions.
△ Less
Submitted 18 April, 2017;
originally announced April 2017.
-
A Security Monitoring Framework For Virtualization Based HEP Infrastructures
Authors:
A. Gomez Ramirez,
M. Martinez Pedreira,
C. Grigoras,
L. Betev,
C. Lara,
U. Kebschull
Abstract:
High Energy Physics (HEP) distributed computing infrastructures require automatic tools to monitor, analyze and react to potential security incidents. These tools should collect and inspect data such as resource consumption, logs and sequence of system calls for detecting anomalies that indicate the presence of a malicious agent. They should also be able to perform automated reactions to attacks w…
▽ More
High Energy Physics (HEP) distributed computing infrastructures require automatic tools to monitor, analyze and react to potential security incidents. These tools should collect and inspect data such as resource consumption, logs and sequence of system calls for detecting anomalies that indicate the presence of a malicious agent. They should also be able to perform automated reactions to attacks without administrator intervention. We describe a novel framework that accomplishes these requirements, with a proof of concept implementation for the ALICE experiment at CERN. We show how we achieve a fully virtualized environment that improves the security by isolating services and Jobs without a significant performance impact. We also describe a collected dataset for Machine Learning based Intrusion Prevention and Detection Systems on Grid computing. This dataset is composed of resource consumption measurements (such as CPU, RAM and network traffic), logfiles from operating system services, and system call data collected from production Jobs running in an ALICE Grid test site and a big set of malware. This malware was collected from security research sites. Based on this dataset, we will proceed to develop Machine Learning algorithms able to detect malicious Jobs.
△ Less
Submitted 16 April, 2017;
originally announced April 2017.
-
A Mediated Definite Delegation Model allowing for Certified Grid Job Submission
Authors:
Steffen Schreiner,
Latchezar Betev,
Costin Grigoras,
Maarten Litmaath
Abstract:
Grid computing infrastructures need to provide traceability and accounting of their users" activity and protection against misuse and privilege escalation. A central aspect of multi-user Grid job environments is the necessary delegation of privileges in the course of a job submission. With respect to these generic requirements this document describes an improved handling of multi-user Grid jobs in…
▽ More
Grid computing infrastructures need to provide traceability and accounting of their users" activity and protection against misuse and privilege escalation. A central aspect of multi-user Grid job environments is the necessary delegation of privileges in the course of a job submission. With respect to these generic requirements this document describes an improved handling of multi-user Grid jobs in the ALICE ("A Large Ion Collider Experiment") Grid Services. A security analysis of the ALICE Grid job model is presented with derived security objectives, followed by a discussion of existing approaches of unrestricted delegation based on X.509 proxy certificates and the Grid middleware gLExec. Unrestricted delegation has severe security consequences and limitations, most importantly allowing for identity theft and forgery of delegated assignments. These limitations are discussed and formulated, both in general and with respect to an adoption in line with multi-user Grid jobs. Based on the architecture of the ALICE Grid Services, a new general model of mediated definite delegation is developed and formulated, allowing a broker to assign context-sensitive user privileges to agents. The model provides strong accountability and long- term traceability. A prototype implementation allowing for certified Grid jobs is presented including a potential interaction with gLExec. The achieved improvements regarding system security, malicious job exploitation, identity protection, and accountability are emphasized, followed by a discussion of non- repudiation in the face of malicious Grid jobs.
△ Less
Submitted 12 December, 2011;
originally announced December 2011.