-
Mutiny! How does Kubernetes fail, and what can we do about it?
Authors:
Marco Barletta,
Marcello Cinque,
Catello Di Martino,
Zbigniew T. Kalbarczyk,
Ravishankar K. Iyer
Abstract:
In this paper, we i) analyze and classify real-world failures of Kubernetes (the most popular container orchestration system), ii) develop a framework to perform a fault/error injection campaign targeting the data store preserving the cluster state, and iii) compare results of our fault/error injection experiments with real-world failures, showing that our fault/error injections can recreate many real-world failure patterns. The paper aims to address the lack of studies on systematic analyses of Kubernetes failures to date.
Our results show that even a single fault/error (e.g., a bit-flip) in the stored data can propagate, causing cluster-wide failures (3% of injections), service networking issues (4%), and service under/overprovisioning (24%). Errors in the fields tracking dependencies between objects caused 51% of such cluster-wide failures. We argue that controlled fault/error injection-based testing should be employed to proactively assess Kubernetes' resiliency and guide the design of failure mitigation strategies.
Submitted 17 April, 2024;
originally announced April 2024.
-
Orchestrating Mixed-Criticality Cloud Workloads in Reconfigurable Manufacturing Systems
Authors:
Marco Barletta,
Marcello Cinque,
Davide De Vita
Abstract:
The adoption of cloud computing technologies in the industry is paving the way to new manufacturing paradigms. In this paper we propose a model to optimize the orchestration of workloads with differentiated criticality levels on a cloud-enabled factory floor. Preliminary results show that it is possible to optimize the guarantees to deployed jobs without penalizing the number of schedulable jobs. We indicate future research paths to quantitatively evaluate job isolation.
Submitted 27 March, 2024;
originally announced March 2024.
-
RunPHI: Enabling Mixed-criticality Containers via Partitioning Hypervisors in Industry 4.0
Authors:
Marco Barletta,
Marcello Cinque,
Luigi De Simone,
Raffaele Della Corte,
Giorgio Farina,
Daniele Ottaviano
Abstract:
Orchestration systems are becoming a key component to automatically manage distributed computing resources in many fields with criticality requirements like Industry 4.0 (I4.0). However, they are mainly linked to OS-level virtualization, which is known to suffer from reduced isolation. In this paper, we propose RunPHI with the aim of integrating partitioning hypervisors, as a solution for assuring strong isolation, with OS-level orchestration systems. The purpose is to enable container orchestration in mixed-criticality systems with isolation requirements through partitioned containers.
Submitted 5 September, 2022;
originally announced September 2022.
-
Introducing k4.0s: a Model for Mixed-Criticality Container Orchestration in Industry 4.0 (extended)
Authors:
Marco Barletta,
Marcello Cinque,
Luigi De Simone,
Raffaele Della Corte
Abstract:
Time-predictable edge cloud is seen as the answer to many emerging needs in Industry 4.0 environments, since it is able to provide flexible, modular, and reconfigurable services with low latency and reduced costs. Orchestration systems are becoming the core component of clouds since they make decisions on the placement and lifecycle of software components. Current solutions are starting to introduce real-time container support for time predictability; however, these approaches lack determinism as well as support for workloads requiring multiple levels of assurance/criticality.
In this paper, we present k4.0s, an orchestration model for real-time and mixed-criticality environments, which includes timeliness, criticality and network requirements. The model leverages new abstractions for both nodes and jobs, e.g., node assurance, and requires novel monitoring strategies. We sketch an implementation of the proposal based on Kubernetes, and present an experiment motivating the need for node assurance levels and adequate monitoring.
Submitted 4 January, 2023; v1 submitted 27 May, 2022;
originally announced May 2022.
-
Automated Analysis of Scenario-based Specifications of Distributed Access Control Policies with Non-Mechanizable Activities (Extended Version)
Authors:
Michele Barletta,
Silvio Ranise,
Luca Viganò
Abstract:
The advance of web services technologies promises to have far-reaching effects on the Internet and enterprise networks, allowing for greater accessibility of data. The security challenges presented by the web services approach are formidable. In particular, access control solutions should be revised to address new challenges, such as the need to use certificates for the identification of users and their attributes, human intervention in the creation or selection of the certificates, and (chains of) certificates for trust management. With all these features, it is not surprising that analyzing policies to guarantee that a sensitive resource can be accessed only by authorized users becomes very difficult. In this paper, we present an automated technique to analyze scenario-based specifications of access control policies in open and distributed systems. We illustrate our ideas on a case study arising in the e-government area.
Submitted 14 June, 2012;
originally announced June 2012.
-
Verifying the Interplay of Authorization Policies and Workflow in Service-Oriented Architectures (Full version)
Authors:
Michele Barletta,
Silvio Ranise,
Luca Viganò
Abstract:
A widespread design approach in distributed applications based on the service-oriented paradigm, such as web-services, consists of clearly separating the enforcement of authorization policies and the workflow of the applications, so that the interplay between the policy level and the workflow level is abstracted away. While such an approach is attractive because it is quite simple and permits one to reason about crucial properties of the policies under consideration, it does not provide the right level of abstraction to specify and reason about the way the workflow may interfere with the policies, and vice versa. For example, the creation of a certificate as a side effect of a workflow operation may enable a policy rule to fire and grant access to a certain resource; without executing the operation, the policy rule should remain inactive. Similarly, policy queries may be used as guards for workflow transitions.
In this paper, we present a two-level formal verification framework to overcome these problems and formally reason about the interplay of authorization policies and workflow in service-oriented architectures. This allows us to define and investigate some verification problems for SO applications and give sufficient conditions for their decidability.
Submitted 25 June, 2009;
originally announced June 2009.