-
A Global Operational Readiness Review Process: Improving Cloud Availability
Authors:
James J. Cusick
Abstract:
The ORR (Operational Readiness Review) is a long-standing practice to help ensure application or system readiness and improved Availability. In this paper the ORR is defined and recent examples of its use from Cloud Computing environments are compared. An emphasis on ORRs used within DevOps environments is also provided. A detailed presentation of a specific and custom ORR implementation for a large global IT organization is shared. This includes the process development approach, key components of the ORR checklist, automation support provided, and a unique Executive dashboard solution to visualize status on in-flight releases. Challenges and benefits from this ORR implementation are provided as well as a detailed comparison with the Google Launch checklist and its associated PRR/ORR. Finally, suggestions for further improvements, automation, and usage of the ORR in large-scale industrial settings based on this real-world experience are elaborated.
Keywords: Operational Readiness Review, ORR, IT Services, IT Operations, ITIL, Process Engineering, Reliability, Availability, Software Architecture, Cloud Computing, Networking, Site Reliability Engineering, DevOps, Agile Methods, Quality, Defect Prevention, Release Management, Risk Management, Data Visualization, Organizational Change Management.
Submitted 3 May, 2022;
originally announced May 2022.
-
Turning the Tables: The View from Offshore During 60 Days in JST
Authors:
James J. Cusick
Abstract:
A report and examination of a Remote Work experience during the Covid-19 pandemic encompassing a 14-hour time difference from the primary work location. Advantages and disadvantages of a globally distributed work experience as compared to an aligned time zone are explored. Logistical aspects of the arrangement are provided as well as the management support, peer reaction, and relative productivity. Recommendations are also provided on how to improve future geographically diverse team arrangements.
[Keywords: Global Software Development, Offshore Development, Software Engineering, IT Management, Remote Work, Remote Office Design, Distributed Communications, Collaboration Technology, Team Management, Research, Development, Productivity]
Submitted 29 May, 2021; v1 submitted 10 May, 2021;
originally announced May 2021.
-
Exploring System Resiliency and Supporting Design Methods
Authors:
James J. Cusick
Abstract:
This paper provides a survey of the industry perspective on System Resiliency and Resiliency design approaches and briefly touches on Organizational Resiliency topics. Beginning with a composite definition of Resiliency, System Capabilities, Adversities, and the Resiliency Life-cycle the document then covers Operational Response Timelines, Failure Sources and Classifications. Next, Design for Resiliency is discussed with an introduction to Systems Theory and a review of Trade-off Analysis and Resiliency Dependencies. Then more than a dozen Resiliency Design Patterns are included for the reader to consider for their own solutioning. Supporting non-functional design topics including Availability, Performance, Security, Reliability as well as Reliability Allocation using Reliability Block Diagrams are also covered. Additionally, Failure Mode and Effect Analysis is reviewed, and a Resiliency Maturity Model is discussed. Finally, several Resiliency Design Examples are presented along with a set of recommendations on how to apply System Resiliency concepts and methods in an IT environment.
Submitted 10 October, 2020; v1 submitted 12 September, 2020;
originally announced September 2020.
-
Business Value of ITSM. Requirement or Mirage?
Authors:
James J. Cusick
Abstract:
This paper builds on a presentation provided as part of a recent panel session on ITSM (IT Service Management) Business Value at the NYC itSMF (Service Management Forum) Local Interest Group meeting. The panel presentation explored the definition of Business Value and how ITSM itself could be measured to produce business value. While ITSM and ITIL have been in use for years it often remains a challenge to demonstrate the business value of these methods or even to understand business value itself. This paper expands on the panel discussion on what is meant by business value and how it can be found (if at all) in the context of ITSM development and process improvement.
Submitted 7 January, 2020; v1 submitted 1 January, 2020;
originally announced January 2020.
-
A Survey of Maturity Models from Nolan to DevOps and Their Applications in Process Improvement
Authors:
James J. Cusick
Abstract:
This paper traces the history of Maturity Models and their impact on Process Improvement from the early work of Shewhart to their current usage with DevOps. The history of modern process improvement can be traced at least to Shewhart. From his foundational process contributions and those of other innovators, a variety of methods and tools to aid in process quality advancement were developed. This paper begins by reviewing those early steps and then focuses on the emergence of Maturity Models in the 1970s with the initial approach by Nolan. The broad adoption of Maturity Models that followed through the success of the CMM and then the CMMI approaches is detailed. This then leads to a general survey of additional models developed for such areas as IT Service Management, ITIL, Project Management, Agile Development, DevOps, CERT, and MDM among others. Finally, this paper discusses the application of these models in the support of process improvement and their limitations. Readers of this paper can expect to gain an appreciation for the origins of these models and surrounding methods as well as an ability to conduct comparative analysis of such models to aid in their selection and application.
Keywords: Process Improvement, Process Engineering, Maturity Models, Capability Maturity Models, CMM, CMMI, ITSM, ITIL, Agile, DevOps, History of Science, History of Computing, Software Engineering, Quality.
Submitted 10 October, 2020; v1 submitted 1 July, 2019;
originally announced July 2019.
-
The First 50 Years of Software Reliability Engineering: A History of SRE with First Person Accounts
Authors:
James J. Cusick
Abstract:
Software Reliability has just passed the 50-year milestone as a technical discipline along with Software Engineering. This paper traces the roots of Software Reliability Engineering (SRE) from its pre-software history to the beginnings of the field with the first software reliability model in 1967 through its maturation in the 1980s to the current challenges in proving application reliability on smartphones and in other areas. This history began as a thesis proposal for a History of Science research program and includes multiple previously unpublished interviews with founders of the field. The project evolved to also provide a survey of the development of SRE from notable prior histories and from citations of new work in the field including reliability applications to Agile Methods. This history concludes in the present day, providing bookends in the theory, models, literature, and practice of Software Reliability Engineering from 1968 to 2018 and pointing towards new opportunities to deepen and broaden the field.
Submitted 16 February, 2019;
originally announced February 2019.
-
Achieving and Managing Availability SLAs with ITIL Driven Processes, DevOps, and Workflow Tools
Authors:
James J. Cusick
Abstract:
System and application availability continues to be a fundamental characteristic of IT services. In recent years the IT Operations team at Wolters Kluwer CT Corporation has placed special focus on this area. Using a combination of goals, metrics, processes, organizational models, communication methods, corrective maintenance, root cause analysis, preventative engineering, automated alerting, and workflow automation, significant progress has been made in meeting availability SLAs (Service Level Agreements). This paper presents the background of this work, approach, details of its implementation, and results. A special focus is provided on the use of a classical ITIL view as operationalized in an Agile and DevOps environment.
Keywords: System Availability, Software Reliability, ITIL, Workflow Automation, Process Engineering, Production Support, Customer Support, Product Support, Change Management, Release Management, Incident Management, Problem Management, Organizational Design, Scrum, Agile, DevOps, Service Level Agreements, Software Measurement, Microsoft SharePoint.
Submitted 13 May, 2017;
originally announced May 2017.
-
Considerations for Cloud Security Operations
Authors:
James Cusick
Abstract:
Information Security in Cloud Computing environments is explored. Cloud Computing is presented, security needs are discussed, and mitigation approaches are listed. Topics covered include Information Security, Cloud Computing, Private Cloud, Public Cloud, SaaS, PaaS, IaaS, ISO 27001, OWASP, Secure SDLC.
Submitted 23 January, 2016;
originally announced January 2016.
-
Design, Construction, and Use of a Single Board Computer Beowulf Cluster: Application of the Small-Footprint, Low-Cost, InSignal 5420 Octa Board
Authors:
James J. Cusick,
William Miller,
Nicholas Laurita,
Tasha Pitt
Abstract:
In recent years development in the area of Single Board Computing has been advancing rapidly. At Wolters Kluwer's Corporate Legal Services Division a prototyping effort was undertaken to establish the utility of such devices for practical and general computing needs. This paper presents the background of this work, the design and construction of a 64-core, 96 GHz cluster, and the possibility of yielding approximately 400 GFLOPs from a set of small-footprint InSignal boards created for just over $2,300. Additionally, this paper discusses the software environment on the cluster, the use of a standard Beowulf library and its operation, as well as other software application uses including Elastic Search and ownCloud. Finally, consideration will be given to the future use of such technologies in a business setting in order to introduce new Open Source technologies, reduce computing costs, and improve Time to Market.
Index Terms: Single Board Computing, Raspberry Pi, InSignal Exynos 5420, Linaro Ubuntu Linux, High Performance Computing, Beowulf clustering, Open Source, MySQL, MongoDB, ownCloud, Computing Architectures, Parallel Computing, Cluster Computing
Submitted 5 January, 2015; v1 submitted 30 December, 2014;
originally announced January 2015.
-
Architecture and Production Readiness Reviews in Practice
Authors:
James Cusick
Abstract:
Detailed description of procedures around architecture reviews. In order to succeed in building and deploying complex software solutions, an architecture is essential. For many in the industry, structured reviews of these architectures are also de rigueur. Practices for such reviews have been developed and reported on for years. One aspect that does not receive as much attention but is no less important is the relationship between these architectures and the requirements for deploying them into production environments. At Wolters Kluwer's Corporate Legal Services we first established a typical architecture review process and then established a two-phase production preparation review process. This paper describes in detail how these practices work and some of the technical results of these reviews including the frequency and style of the reviews, the process automation around them, and the number and nature of some of the technical flaws eliminated by enforcing these reviews. This paper lays the groundwork for others who would be interested in following similar practices.
Submitted 30 December, 2014; v1 submitted 10 May, 2013;
originally announced May 2013.
-
Biological Computing Fundamentals and Futures
Authors:
Balaji Akula,
James Cusick
Abstract:
The fields of computing and biology have begun to cross paths in new ways. In this paper a review of the current research in biological computing is presented. Fundamental concepts are introduced and these foundational elements are explored to discuss the possibilities of a new computing paradigm. We assume the reader to possess a basic knowledge of Biology and Computer Science.
Submitted 9 November, 2009;
originally announced November 2009.
-
Applying Software Defect Estimations: Using a Risk Matrix for Tuning Test Effort
Authors:
James Cusick
Abstract:
Applying software defect estimation techniques and presenting this information in a compact and impactful decision table can clearly illustrate to collaborative groups how critical this position is in the overall development cycle. The Test Risk Matrix described here has proven to be a valuable addition to the management tools and approaches used in developing large-scale software on several releases. Use of this matrix in development planning meetings can clarify the attendant risks and possible consequences of carrying out or bypassing specific test activities.
Submitted 11 November, 2007;
originally announced November 2007.