Operational Excellence

The ideal state organization has a focus not only on the creation of digital experiences but also on how those experiences are supported, maintained and scaled to meet the needs of its customers. Organizations that adopt an Operational Excellence mindset embrace Agile and DevOps best practices throughout all parts of the organization. They create processes that encourage tight cooperation between the teams developing new solutions and the teams that will ultimately support them after release.

Agile software development is often perceived as trading resilience and stability for speed and flexibility. Agile teams can spur innovation through the rapid delivery of value and their ability to pivot based on fast feedback, but their dynamism can be seen as creating an additional maintenance burden for the rest of the organization. However, the trade-off between speed and stability is a false one.

Operational Excellence Graphic 1

The most successful organizations are the ones that maintain an appropriate balance between speed and stability through practices that encourage risk mitigation, technical excellence, and a sustainable level of quality and availability.1

In recent years, many organizations have tried to avoid the perceived speed/stability trade-off by adopting a “two-speed IT” approach, in which one side of the organization operates on a fast track to deliver customer-facing initiatives while the rest of the organization moves at a slower pace to maintain a stable backbone for business-critical operations. In practice, two-speed IT often delays the adoption of best practices throughout the organization and creates unnecessary barriers between development and operational teams.2

Organizations that adopt an Operational Excellence Mindset embrace Agile and DevOps best practices throughout all parts of the organization. They create processes that encourage tight cooperation between the teams developing new solutions and the teams that will ultimately support them after release, including:

  • Anticipating issues with solution scalability, security, and availability by focusing on non-functional requirements and DevOps practices during development.
  • Building code quality into day-to-day development practices, in order to reduce downstream maintenance and support costs.
  • Providing support and operations teams ahead of launch with the resources they will need to maintain a production solution.
  • Setting aside capacity for production support activities.

Adopting such practices across an organization can result in significant quality improvements and cost savings, including 24 times faster recovery from failures, 22% less unplanned rework, and 50% less time spent addressing security issues.3


Organizations that embrace this mindset share five specific characteristics that make it achievable:

Clear Non-Functional Requirements

Organizations that adopt an Operational Excellence Mindset set clear non-functional requirements (NFRs) for digital experiences. The Scaled Agile Framework® defines NFRs as:

Non-functional requirements (NFRs) define system attributes such as security, reliability, performance, maintainability, scalability, and usability. They serve as constraints or restrictions on the design of the system across the different backlogs.

The ideal state organization does not stop at defining what the NFRs should be; it also builds a way to monitor them in staging and production in near real time.

Mindset Operational Infographic 2

With this focus on NFRs, organizations can pinpoint the code commit that caused an application to stop meeting a specific NFR. Organizations that don’t adopt this mindset rarely reach that level of detail when determining causality.
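
As a rough illustration of the idea above, the following Python sketch compares measured values against NFR limits for a sequence of commits and reports the first commit that breached each one. The NFR names, limits, commit IDs, and measurements are all hypothetical; in practice they would come from whatever monitoring system the organization uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nfr:
    """A non-functional requirement expressed as an upper limit on a metric."""
    name: str
    limit: float  # the measured value must not exceed this


def first_breaches(nfrs, history):
    """Return {NFR name: commit id} for the earliest commit that breached each NFR.

    `history` is a list of (commit_id, {metric name: measured value}) pairs,
    ordered from oldest to newest.
    """
    breaches = {}
    for commit_id, metrics in history:
        for nfr in nfrs:
            value = metrics.get(nfr.name)
            already_found = nfr.name in breaches
            if value is not None and value > nfr.limit and not already_found:
                breaches[nfr.name] = commit_id
    return breaches


# Hypothetical limits and measurements, for illustration only.
NFRS = [Nfr("p95_latency_ms", 300.0), Nfr("error_rate_pct", 1.0)]
HISTORY = [
    ("a1f3", {"p95_latency_ms": 210.0, "error_rate_pct": 0.2}),
    ("b7c9", {"p95_latency_ms": 340.0, "error_rate_pct": 0.3}),
    ("c2d4", {"p95_latency_ms": 355.0, "error_rate_pct": 1.4}),
]

for name, commit in first_breaches(NFRS, HISTORY).items():
    print(f"{name} first breached at commit {commit}")
```

Recording measurements per commit is what makes this kind of attribution possible; without that history, a breach can only be noticed, not traced.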


Healthy DevOps Practice

Organizations that adopt this mindset maintain a healthy DevOps practice. Because DevOps exists to bridge the gap between development and operations, the development side of its responsibility is covered in the Continuous Delivery Mindset. The other side is directly tied to operations.

As part of the operations process, the DevOps team works to ensure that applications are built with long-term maintainability in mind. This includes a focus on adherence to NFRs, full-stack log monitoring, and scalability planning and testing.

Test-Driven Development

Test-driven development is a cycle of testing, coding, and refactoring that is used by developers to reduce defects and increase code readability and maintainability.

Building quality into every phase of the development process through continuous testing, rather than waiting to test in one large batch at the end of a development cycle, can substantially improve quality and reduce costs, especially when NFRs are tested alongside functional requirements.

According to one study, test-driven development can increase initial development time by 15-35%, but that initial cost is offset by reduced maintenance costs and 40-90% fewer defects.4
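
The test, code, refactor cycle described above can be sketched in Python with the standard `unittest` module. The `apply_discount` function and its rules are hypothetical and chosen only to keep the example small.

```python
import unittest


# Step 1 (test): the tests are written first and describe the behaviour wanted.
class ApplyDiscountTest(unittest.TestCase):
    def test_reduces_price_by_percentage(self):
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_rejects_percentage_outside_0_to_100(self):
        with self.assertRaises(ValueError):
            apply_discount(200.0, 120)


# Step 2 (code): the simplest implementation that makes the tests pass.
# Step 3 (refactor): the code is tidied while the tests keep passing.
def apply_discount(price, percent):
    """Return `price` reduced by `percent` percent."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price * (100 - percent) / 100
```

Saving this as a module and running `python -m unittest` against it executes both tests. In a real cycle the tests would fail first, because `apply_discount` would not exist yet.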

Definition of Done

A Definition of Done is a checklist used by Agile teams to verify that a user story or feature is “done” and ready for acceptance. At a team level, it typically requires developers to ensure that automated tests are in place, that code has been peer-reviewed and that code has been checked in.

A more robust Definition of Done can be developed in conjunction with operations teams to ensure that a solution will be supportable once it is public. Such a pre-release Definition of Done may require support documentation to be complete, training and monitoring plans to be in place, or NFRs to be formally tested through performance or availability testing.
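
One way to picture the two levels is as a pair of checklists, with the pre-release list extending the team-level one. The sketch below is a minimal illustration; the `Story` type, the function names, and the exact wording of each item are assumptions rather than a prescribed format.

```python
from dataclasses import dataclass, field

# Team-level items, taken from the description above.
TEAM_CHECKS = (
    "automated tests in place",
    "code peer-reviewed",
    "code checked in",
)

# Pre-release items agreed with operations, added on top of the team-level ones.
RELEASE_CHECKS = TEAM_CHECKS + (
    "support documentation complete",
    "training plan in place",
    "monitoring plan in place",
    "NFRs verified by performance or availability testing",
)


@dataclass
class Story:
    title: str
    completed: set = field(default_factory=set)


def missing_checks(story, checklist):
    """Return the checklist items the story has not yet satisfied, in order."""
    return [item for item in checklist if item not in story.completed]


def is_done(story, checklist):
    """A story is done only when nothing on the checklist is missing."""
    return not missing_checks(story, checklist)


# Hypothetical story that meets the team-level definition only.
story = Story("Password reset", completed=set(TEAM_CHECKS))
print(is_done(story, TEAM_CHECKS))     # True
print(is_done(story, RELEASE_CHECKS))  # False
```

Here `missing_checks(story, RELEASE_CHECKS)` lists the four operations-facing items still outstanding, which is the gap a team-only Definition of Done would hide.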

Maintenance Capacity Allocation

Organizations that adopt the Operational Excellence Mindset take a purposeful approach to planning for maintenance development by setting aside team capacity to deal with unexpected production issues.

This can mean either creating a dedicated team of maintenance developers that is solely responsible for production support, or reserving maintenance capacity on a development team that is otherwise tasked with new feature development. For example, capacity allocation can be used during sprint planning to set aside 25% of a team’s time for unplanned maintenance, compared to 75% for sprint commitments.
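
The arithmetic behind the 25%/75% split can be sketched as follows. The function name and the team size and hours are hypothetical, used only to make the numbers concrete.

```python
def allocate_capacity(team_hours, maintenance_share=0.25):
    """Split a sprint's available hours into maintenance and commitment buckets."""
    if not 0 <= maintenance_share <= 1:
        raise ValueError("maintenance_share must be between 0 and 1")
    maintenance = team_hours * maintenance_share
    return {"maintenance": maintenance, "commitments": team_hours - maintenance}


# Hypothetical team: five developers with 60 available hours each per sprint.
team_hours = 5 * 60
print(allocate_capacity(team_hours))
# {'maintenance': 75.0, 'commitments': 225.0}
```

Under these assumptions, the team would plan sprint commitments against 225 hours and hold 75 hours in reserve for unplanned production work.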

Maintenance development need not be managed using Scrum in order to follow Agile best practices. Kanban is a popular alternative for managing a team’s maintenance work because the flow of production issues is often unpredictable and poorly suited to formal planning cycles.

Pain Points

Organizations that do not adopt an Operational Excellence Mindset can encounter one or more of the following four pain points:

Higher Costs

Quality and stability issues can create major costs if they reach production. An average bug costs three times as much to fix in production as it does during initial coding.5

The annual cost of unplanned application downtime can run into the billions for large companies.6

Operation Excellence Infographic 3

Even when issues don’t reach production, they may be more expensive to fix later in a development cycle if major architectural changes are required. Organizations that do not “shift left” by incorporating continuous testing in their process end up paying far more to fix issues than they would have invested in preventative measures earlier in a development cycle.

Technical Debt

Organizations that favor the development of new functionality at the expense of technical remediation and refactoring often incur significant technical debt that leads to increased rework costs down the road.

Scaling Difficulties

In many organizations without an Operational Excellence Mindset, scalability can be a significant challenge. When scalability is not planned for during the development phase, applications are often built in ways that make them hard to scale later. In addition, in organizations that don’t rely on dynamic infrastructure, hardware can become a limitation if adoption of a digital property grows quickly.

Poor Employee Morale

Organizations that don’t explicitly set aside team capacity to address unexpected production issues often overload their development teams by expecting production issues to be addressed without any impact to those teams’ existing commitments.

Furthermore, organizations that don’t involve operational teams in the development process often leave those teams feeling unequipped to properly support new solutions.


Organizations that adopt an Operational Excellence Mindset prioritize operational considerations during development in order to strike an appropriate balance between speed of delivery and long-term stability. In contrast to “two-speed IT” adherents, these organizations employ Agile and DevOps practices throughout the organization and strive to ensure clear collaboration between development and support teams throughout the development process.