Quantifying Margin in Scaled Agile

July 26, 2023 · 8 min read

CIO, Atlas Revolutions

Chief Engineer, Triple Dot Engineering

CEO, Lint Agility Services

Capacity margins are a fundamental aspect of successful Agile implementations, particularly within the context of the Scaled Agile Framework (SAFe®). In essence, a capacity margin is the intentional allocation of time and resources beyond the estimated requirements of the work. As Agile methodologies gain prominence in modern software development, capacity margins serve as a safety net that allows teams to handle the unforeseen challenges, fluctuations in workload, and uncertainties inherent in complex development endeavors. By maintaining capacity margins, Agile teams can avoid overloading their resources, prevent potential bottlenecks, and retain the agility necessary to adapt to changing requirements or market demands.

What is Capacity Margin?

A common criticism of SAFe® is that capacity margins mean we are under-delivering against our capacity. This criticism reflects a fundamental misunderstanding of capacity margins. The built-in margins in SAFe® are a means of promoting stability and resilience within Agile teams and their development processes. They empower organizations to maintain a consistent pace of delivery without compromising the quality of their software products, which is a key reason organizations implement Agile and SAFe® in the first place: to promote sustainable value delivery over time.

At the team level, SAFe® encourages Agile teams to allocate capacity margins to accommodate unplanned work, disruptions, or dependencies that might impact their ability to complete planned tasks within the iteration. By allowing some "slack" in their capacity, teams can better respond to emerging issues without derailing the entire iteration and maintain a sustainable pace of work. This buffer also prepares teams to handle interdependencies, integration issues, and any unforeseen cross-team challenges that may arise during the PI.

Typically, this capacity margin takes the form of a 20% story point buffer during the development iterations. In addition, SAFe® encourages the creation of an estimating guard band during the Innovation & Planning (IP) iteration to address risks that might impact delivery. The estimating guard band is not planned development time: it consists of the non-PI Planning weeks of the IP iteration, and it gives the teams enough time to close out the last remaining features and stories of the PI. These reserves act as a “guard band” to manage exceptional circumstances that could disrupt the planned work, such as unexpected delays or emergencies.

Criticisms of Using Capacity Margins

Two other common criticisms of capacity margins are:

A Reduced Sense of Urgency

Having capacity margins could lead to a reduced sense of urgency among team members. If there is always a safety net available, there might be less motivation to optimize and push for higher efficiency, potentially leading to complacency in meeting deadlines and achieving objectives.

It’s important not to underestimate the power of intrinsic motivation. People inherently want to do a good job with high quality. Additionally, a safety net allows us to deliver more predictably and with less volatility. The reality is that capacity margins enable us to provide better forecasts and meet delivery milestones and objectives efficiently.

A Lack of Transparency

Critics argue that capacity margins might mask underlying issues in project planning and execution. If problems are consistently resolved by using the buffer, it may be challenging to identify and address the root causes of recurring challenges.

While it is true that a capacity margin allows us to deliver more consistently despite problems in the system, other components of Agile methodologies, such as retrospectives and Inspect & Adapt (I&A) workshops, are designed to surface, address, and eliminate those problems. Agile is a holistic methodology, and we should not expect one practice to solve every problem across the whole framework.

Quantifying Capacity Margin in SAFe®

It is important to recognize that there are actually two major types of margin in SAFe®. We have the capacity margin that development teams use to create an estimating guard band around delivery, and we have the business value margin represented by the program predictability measure. So SAFe® protects us in two ways: we create a buffer around our development capacity, and also around stakeholder and business expectations for the value created in any given PI. It may be tempting to use hours as a proxy for both forms of margin, but we should not do that, because the business value margin is not tied directly to the hours worked. It is a measure of stakeholder and market satisfaction with the work done, and it ensures that a predictably stable team of Agile teams will meet 80-100+% of their value commitment in every PI.

In this paper, we focus on the development capacity margin, which represents anywhere from 30.0% to 42.2% of the total development capacity, depending on the specific PI and iteration constructs your organization uses. A typical 10-week PI with 2-week iterations has a 32.5% capacity margin built in.

Methodology

We began by identifying the capacity margin in each iteration. By committing to only 80% of what they think they can deliver, the team builds in a 20% margin for each iteration. We calculated the hours of margin across the development iterations as follows:

$$
M_{iter} = 0.2 * (40 * L) * (N - 1)
$$

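Where $M_{iter}$ is the iteration margin in hours, $L$ is the length of an iteration in weeks, $N$ is the number of iterations in the PI (including the IP iteration, so $N - 1$ counts the development iterations), and 40 is the number of working hours in a week. We then calculated the hours of margin in the Innovation & Planning (IP) iteration as follows: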

$$
M_{ip} = (40 * L - 40)
$$

Where $M_{ip}$ is the IP iteration margin, $40*L$ is the number of hours in the IP iteration, and the $-40$ accounts for the week of PI planning where no development occurs.

Our total margin can now be calculated as:

$$
\begin{aligned}
M_{total} &= M_{iter} + M_{ip} \\
M_{total} &= 0.2 * (40 * L) * (N - 1) + (40 * L - 40)
\end{aligned}
$$
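As a worked example, take the 10-week PI with 2-week iterations mentioned earlier, that is, $L = 2$ and $N = 5$. Substituting into the formulas above gives the margin in hours:

$$
\begin{aligned}
M_{iter} &= 0.2 * (40 * 2) * (5 - 1) = 64 \\
M_{ip} &= 40 * 2 - 40 = 40 \\
M_{total} &= 64 + 40 = 104
\end{aligned}
$$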

Calculating Capacity

To understand percent margin, we needed to agree on the total capacity, $C$, of the team. We considered two main approaches:

$$
C = (40 * L) * N \tag{1}
$$

Where we consider the total capacity of the team as the number of hours available in the PI. In our second approach, we ignore the capacity of the team during the IP iteration because it is not planned development time.

$$
C = (40 * L) * (N - 1) \tag{2}
$$

We decided that method (2) was more appropriate as it represents the planned development capacity of the team to deliver value rather than the total planned working hours of the team.
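To illustrate how much the choice matters, consider $L = 2$ and $N = 5$ (a configuration picked here only as an example). The two definitions give:

$$
\begin{aligned}
(1)\quad C &= (40 * 2) * 5 = 400 \\
(2)\quad C &= (40 * 2) * (5 - 1) = 320
\end{aligned}
$$

With the 104 margin hours that the total-margin formula yields for this configuration, the margin works out to 26.0% of capacity under method (1) and 32.5% under method (2). Method (1) reports a smaller percentage because the IP iteration's hours are counted in the denominator.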

Percent Margin

With an understanding of margin hours and capacity hours we can finally calculate percent margin as:

$$
M_{pct} = \frac{M_{total}}{C} * 100
$$

The Python code used for the calculation is available here.
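The linked code is not reproduced here. The following is only a minimal sketch of the same calculation, written directly from the formulas above; the function and constant names are our own illustrative choices, and capacity follows method (2).

```python
HOURS_PER_WEEK = 40  # working hours per week, as used in the formulas above
BUFFER = 0.2         # 20% planning buffer in each development iteration


def margin_hours(iteration_weeks: int, iterations_in_pi: int) -> float:
    """Total margin in hours: iteration buffers plus the IP-iteration guard band."""
    if iterations_in_pi < 2:
        raise ValueError("a PI needs at least one development iteration plus the IP iteration")
    dev_iterations = iterations_in_pi - 1  # the remaining iteration is the IP iteration
    iteration_margin = BUFFER * (HOURS_PER_WEEK * iteration_weeks) * dev_iterations
    # IP iteration hours minus the one week spent on PI planning
    ip_margin = HOURS_PER_WEEK * iteration_weeks - HOURS_PER_WEEK
    return iteration_margin + ip_margin


def capacity_hours(iteration_weeks: int, iterations_in_pi: int) -> int:
    """Planned development capacity in hours, method (2): the IP iteration is excluded."""
    return HOURS_PER_WEEK * iteration_weeks * (iterations_in_pi - 1)


def percent_margin(iteration_weeks: int, iterations_in_pi: int) -> float:
    """Margin as a percentage of planned development capacity."""
    total = margin_hours(iteration_weeks, iterations_in_pi)
    return total / capacity_hours(iteration_weeks, iterations_in_pi) * 100


if __name__ == "__main__":
    # 10-week PI with 2-week iterations
    print(f"{percent_margin(2, 5):.2f}%")  # prints 32.50%
```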

Results

Recognizing that in SAFe®, PI lengths should fall between 8 and 13 weeks, we calculated the margin for a broader range of iteration and PI lengths to show how margin changes with each.

[Figure: margin vs PI length]

What this shows is that as the number of iterations in the PI increases, the percent margin decreases. Conversely, increasing the iteration length (in weeks) increases the overall margin.
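Both trends can also be read off the formulas from the Methodology section. Substituting $M_{total}$ and the method (2) capacity into $M_{pct}$ and simplifying gives:

$$
M_{pct} = \frac{0.2 * (40 * L) * (N - 1) + (40 * L - 40)}{(40 * L) * (N - 1)} * 100 = 20 + \frac{100 * (L - 1)}{L * (N - 1)}
$$

The first term is the fixed 20% iteration buffer. The second term is the IP guard band relative to development capacity: it shrinks as $N$ grows and grows as $L$ grows, since $(L - 1) / L$ approaches 1 for longer iterations.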

Considering only the results that might be practical in SAFe®, we get the following results:

Iteration length (weeks)	Iterations in PI	PI length (weeks)	Margin
2	4	8	36.67%
2	5	10	32.50%
2	6	12	30.00%
3	4	12	42.22%
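The rows above can be reproduced with a few lines of Python. This is a sketch under the assumptions stated in the Methodology section (40-hour weeks, a 20% iteration buffer, capacity method (2)); the names `percent_margin`, `CONFIGS`, and `table_rows` are illustrative and not taken from the authors' code.

```python
def percent_margin(weeks, iterations, buffer=0.2, hours_per_week=40):
    """Percent margin for a PI with `iterations` iterations of `weeks` weeks each."""
    dev_hours = hours_per_week * weeks * (iterations - 1)  # capacity, method (2)
    ip_margin = hours_per_week * weeks - hours_per_week    # IP iteration minus PI planning week
    return (buffer * dev_hours + ip_margin) / dev_hours * 100


# (iteration length in weeks, iterations in PI) for the four practical cases
CONFIGS = [(2, 4), (2, 5), (2, 6), (3, 4)]


def table_rows(configs=CONFIGS):
    """Return (iteration weeks, iterations, PI weeks, margin %) for each configuration."""
    return [
        (weeks, iterations, weeks * iterations, round(percent_margin(weeks, iterations), 2))
        for weeks, iterations in configs
    ]


if __name__ == "__main__":
    for weeks, iterations, pi_weeks, margin in table_rows():
        print(f"{weeks}\t{iterations}\t{pi_weeks}\t{margin:.2f}%")
```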

Conclusion

Because a 30.0-42.2% margin ensures that teams continue to deliver predictably and with high quality, it is a small price to pay for the increased stakeholder and team-member satisfaction. However, it is essential to strike the right balance when defining the size of the guard band. An excessively large buffer might lead to underutilization of resources, while an insufficient margin might fail to provide the flexibility needed to accommodate unforeseen challenges effectively.

In conclusion, organizations embarking on Agile and SAFe® implementations should carefully consider the organizational context and trust the SAFe® process to appropriately reflect the nature of the work and the risk appetite of the organization. By judiciously incorporating capacity margins into their planning practices, Agile teams achieve higher adaptability and enhanced predictability, and ultimately deliver consistently, at the right time, to the right people, with the right value.
