When working on projects, Scrum Masters and delivery managers often need to estimate the team's velocity for the next sprint. Even in an agile approach, clients would like to know what they can expect from the next release. On the other hand, it helps if the Scrum Master can assess how much the team can deliver, in order to propose a sprint goal and stretch goals that challenge the team.
According to the Scrum Book (http://scrumbook.org/value-stream/running-average-velocity.html), a running average (also called a moving or rolling average) can be a good starting point when forecasting future velocity. I'll use Python to calculate a 3-sprint running average for the example data (forecastedVelocity). The forecastError column contains the difference between the actual and forecasted values.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.pyplot import figure
closedSprints = pd.read_csv('sprints.csv', delimiter=';')
closedSprints.head()
# forecast each sprint as the rolling average of the 3 preceding velocities
closedSprints['forecastedVelocity'] = np.round(closedSprints['velocity'].shift().rolling(3, min_periods=1).mean(), 2)
closedSprints['forecastError'] = closedSprints['velocity'] - closedSprints['forecastedVelocity']
closedSprints.head()
I'm using the forecast computed for sprint 20 as the forecast for all three future sprints (20, 21 and 22).
lastForecast = closedSprints.loc[closedSprints['sprintNo'] == 20, 'forecastedVelocity'].iat[0]
closedSprints.loc[closedSprints['sprintNo'].isin([21,22]), 'forecastedVelocity'] = lastForecast
figure(num=None, figsize=(8, 6), dpi=80, facecolor='w', edgecolor='k')
_ = plt.xticks( closedSprints.index.values , closedSprints.sprintNo ) # location, labels
_ = plt.plot( closedSprints['velocity'])
_ = plt.plot( closedSprints['forecastedVelocity'])
_ = plt.xlabel('Sprint no')
_ = plt.ylabel('story points sum')
_ = plt.legend(['velocity', 'forecasted velocity'])
_ = plt.show()
Above you can find the historical sprint data together with the calculated forecasted velocity. This point forecast sits in the middle of the range of possible outcomes. It's generally recommended (https://otexts.org/fpp2/perspective.html#perspective) to calculate prediction intervals to understand the range of possible forecasted values. This gives additional insight into the model's confidence and helps evaluate forecast accuracy.
To calculate prediction intervals, we need the standard deviation of the forecast errors (https://otexts.org/fpp2/prediction-intervals.html).
errorStd = round(closedSprints.forecastError.std(), 2)
print('Forecast error standard deviation: {}'.format(errorStd))
For simplicity, I'm assuming that the forecast errors follow a normal distribution. To quantify forecast uncertainty I'm using the multiplier 1.96 for a 95% prediction interval and 1.44 for an 85% prediction interval. Multipliers for other coverage levels can be found here: https://otexts.org/fpp2/prediction-intervals.html
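If scipy is available, these multipliers don't have to be hard-coded: they are quantiles of the standard normal distribution. A small sketch (not part of the original calculation):

```python
from scipy.stats import norm

def interval_multiplier(coverage):
    """Two-sided z multiplier for a given coverage level, e.g. 0.95 -> ~1.96."""
    return norm.ppf((1 + coverage) / 2)

print(round(interval_multiplier(0.95), 2))  # 1.96
print(round(interval_multiplier(0.85), 2))  # 1.44
```

This makes it easy to experiment with other coverage levels, e.g. 80% or 99%, without looking up tables.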
Generally, forecast uncertainty grows with the forecast horizon, but I didn't include this factor, to keep the calculations simple.
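For a sense of what including that factor could look like: for the naïve forecasting method, the fpp2 book gives the h-step-ahead error standard deviation as σ√h. The sketch below applies that rule to an illustrative 1-step error standard deviation of 3.0 (an invented number, not the value computed from the sprint data, and the √h rule is exact only for the naïve method):

```python
import math

def widened_std(error_std, horizon):
    # naive-forecast approximation from fpp2: sigma_h = sigma * sqrt(h)
    return error_std * math.sqrt(horizon)

for h in (1, 2, 3):
    print(h, round(widened_std(3.0, h), 2))  # 3.0, 4.24, 5.2
```

The intervals for sprints 21 and 22 would then be wider than the one for sprint 20, reflecting the extra uncertainty of forecasting further ahead.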
future = closedSprints['sprintNo'].isin([20, 21, 22])
# lower bounds are clipped at zero -- velocity can't be negative
closedSprints.loc[future, 'min95'] = max(lastForecast - 1.96 * errorStd, 0)
closedSprints.loc[future, 'max95'] = lastForecast + 1.96 * errorStd
closedSprints.loc[future, 'min85'] = max(lastForecast - 1.44 * errorStd, 0)
closedSprints.loc[future, 'max85'] = lastForecast + 1.44 * errorStd
closedSprints.tail()
figure(num=None, figsize=(8, 6), dpi=80, facecolor='w', edgecolor='k')
_ = plt.xticks( closedSprints.index.values , closedSprints.sprintNo ) # location, labels
_ = plt.plot( closedSprints['velocity'])
_ = plt.plot( closedSprints['forecastedVelocity'])
_ = plt.fill_between(closedSprints.index.values, closedSprints['min95'], closedSprints['max95'], color='orange', alpha=0.1)
_ = plt.fill_between(closedSprints.index.values, closedSprints['min85'], closedSprints['max85'], color='red', alpha=0.1)
_ = plt.xlabel('Sprint no')
_ = plt.ylabel('story points sum')
_ = plt.legend(['velocity', 'forecasted velocity'])
_ = plt.show()
As you can see above, the 95% prediction interval (orange) is wider than the 85% interval (red). According to the calculation, in 95 out of 100 sprints the delivered velocity should fall between 7.4 and 20.9 story points. Using this information we can plan more realistically when features will be delivered by the team.
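As one hedged example of how the interval bounds could feed into planning: with a hypothetical remaining backlog (the 60-point figure below is invented for illustration; 7.4 and 20.9 are the interval bounds from the calculation above), dividing by the pessimistic and optimistic velocities gives a range of sprint counts:

```python
import math

backlog = 60          # hypothetical remaining story points
min_velocity = 7.4    # lower 95% bound from the calculation above
max_velocity = 20.9   # upper 95% bound from the calculation above

worst_case = math.ceil(backlog / min_velocity)   # slowest plausible pace
best_case = math.ceil(backlog / max_velocity)    # fastest plausible pace
print(f'Delivery expected in {best_case} to {worst_case} sprints')  # 3 to 9 sprints
```

A range like "3 to 9 sprints" is a far more honest commitment to stakeholders than a single number.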
The prediction interval is quite wide. By definition, its width is directly driven by the forecast error. What can you do to minimize that error? First of all, I'd recommend more sophisticated forecasting methods than a rolling average of the last 3 sprints. It may be a good start, but in practice it's usually too simple, because it ignores factors like team capacity, team structure and so on. There are other models that should give better forecast results; in upcoming articles I will demonstrate how they can be applied in an agile context.
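As one example of a slightly more sophisticated method, an exponentially weighted moving average gives recent sprints more influence than older ones. A sketch using pandas' ewm with made-up velocities (this is not part of the original analysis, just an illustration of the idea):

```python
import pandas as pd

velocity = pd.Series([14, 16, 12, 18, 15, 17])  # illustrative sprint velocities

# span=3 roughly mirrors the 3-sprint window; adjust=False gives the simple
# recursive form y_t = alpha * x_t + (1 - alpha) * y_{t-1}
ewma = velocity.ewm(span=3, adjust=False).mean()
next_sprint_forecast = ewma.iloc[-1]
print(round(next_sprint_forecast, 2))  # 16.19
```

Because the weights decay exponentially, a velocity dip three sprints ago affects the forecast far less than last sprint's result, which often tracks a changing team better than an unweighted average.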
Another suggestion is to reduce story size: smaller, more uniformly sized stories tend to make velocity less volatile from sprint to sprint, which directly reduces the forecast error and narrows the prediction interval.
To sum up, prediction intervals can be extremely useful for understanding what an agile team is likely to deliver. If you need to plan with confidence, this is a tool worth adding to your quiver.