“It’s better to be roughly right than precisely wrong.”
-John Maynard Keynes
What do all of these have in common?
- T-Shirt Sizes
- Sprints
- Story Points
- Hours
Answer: They’re all estimation scales used in Streamline’s application development lifecycle.
Streamline Health’s development teams employ each of these different levels of granularity for estimation at different phases of the lifecycle. Appropriately accurate estimates are key in all phases of our lifecycle, from figuring out how long it will take to achieve the software development part of an organizational goal all the way down to sprint planning by a single team.
From the highest-level, earliest phase down to the last days of a sprint, that Keynes quote up at the top holds true. If you strive for precise estimates you’ll quickly encounter the law of diminishing returns. You simply can’t precisely estimate work that you don’t fully understand (yet). Spending time early on to gain the necessary understanding will take nearly as much time as simply (or not so simply, which is the problem) doing the work.
Our increasingly granular scales are Streamline’s approach to avoiding “estimation paralysis.”
It’s important to understand that our agile estimation techniques are not intended to produce quotes for clients. We are not directly selling our software development services. Agile estimation at Streamline is used for planning and capacity management.
Our earliest estimation scale, for when we’re thinking about the next quarter (“Planning Increment”) in broad terms, is simply small, medium, large, and extra large. When we use these, we’re talking at the “feature” level: work for a single product that will take more than one sprint to complete. A “small” feature is one that could be completed in a couple of sprints with capacity left over for other work. A medium will take two to three sprints, and a large will take most of the capacity of one team for the entire planning increment. An extra large feature is one that we don’t think can be completed in four sprints (one planning increment) by one team. T-shirt size estimates are assigned by the product management team with expert advice from architecture, engineering, and other SMEs.
When we start trying to shoehorn features into a planning increment, we move to sprint estimates. How many sprints would this feature take if it were the only thing that one team worked on? Yes, this is pretty similar to the t-shirt sizes we started with. Sprint estimates are not as intuitive for senior management to understand, so the t-shirt sizes are something of a proxy for sprints, designed for stakeholder consumption.
The sprint estimate is not about actual calendar/sprint time. We would never package all of the stories for one feature into one sprint after another with no other work for that team. We can’t — there are always competing priorities.
Sprint estimates are done by the scrum teams and stakeholders during our quarterly planning session for the planning increment. After the teams have had an opportunity to review and refine the proposed features, they’re asked to estimate them in sprints. By this time they have an idea of how the features will be decomposed, so their estimates are more granular than the t-shirts, if still quite rough.
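To make the capacity-management use of sprint estimates concrete, here is a minimal sketch of the arithmetic. The four-sprint increment comes from the description above; the function name, feature names, and numbers are hypothetical and only illustrate the idea, not any tool Streamline actually uses.

```python
SPRINTS_PER_INCREMENT = 4  # from the article: one planning increment is four sprints


def remaining_capacity(feature_sprints: dict) -> float:
    """Sprints of one team's capacity left after the listed features.

    A negative result means the features do not fit in the increment.
    """
    return SPRINTS_PER_INCREMENT - sum(feature_sprints.values())


# Hypothetical features and sprint estimates for a single team.
proposed = {"audit-export": 1.5, "sso-login": 2.0, "report-builder": 1.0}
left = remaining_capacity(proposed)
if left < 0:
    print(f"Over capacity by {-left} sprints; something has to move.")
else:
    print(f"{left} sprints of capacity left for other work.")
```

Because the estimates are rough, a result close to zero is best read as "probably too full" rather than as a precise fit.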
Story Points and Ideal Developer Days
Story points are a very standard way of estimating user stories and other sprint-level work items (including bug fixes and small changes). The intent is to keep the teams from diving into hour-based estimates that would tempt them to try to decompose into tasks. Getting engineers to accept this practice is one of the most difficult aspects of an agile transformation. These people are detail oriented and precise, and story points feel sloppy to them.
The introduction of the Ideal Developer Day Story Point is intended, among other purposes, to help them get used to estimating in the abstract (by making it less abstract). An Ideal Developer Day is however much work a typical team member gets done in a typical day. It’s difficult for developers not to ask “how many hours is that?” which leads to further decomposition. We counter those questions with “just use your gut — how many days do you think you’d spend on your part of this story? How many days would your colleagues in the other disciplines take? Yes, you do have to include their work in your estimate. You’re a team.”
For story point estimates we use the Fibonacci sequence. This ever-growing series of numbers forces larger chunks of work to have proportionally higher estimates. This enforces the notion that you don’t know what you don’t know, and builds in the extra time to learn about the unknown. It also eliminates arguments like,
“It will take four days!”
“No, it will take five!”
“I think it will take six!”
Just pick either five or eight and move on.
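The "pick the nearest value and move on" rule can be sketched in a few lines. The scale below (1 through 21) is an assumption for illustration, since the text only says "Fibonacci", and `allowed_neighbors` is a hypothetical helper name.

```python
# Assumed story point scale; the article does not say where the scale stops.
FIBONACCI_POINTS = [1, 2, 3, 5, 8, 13, 21]


def allowed_neighbors(raw_days: float) -> tuple:
    """Return the scale values just below and just above a raw gut estimate.

    If the guess is already on the scale, both values are the same.
    Guesses outside the scale are clamped to its smallest or largest value.
    """
    lower = max((p for p in FIBONACCI_POINTS if p <= raw_days),
                default=FIBONACCI_POINTS[0])
    upper = min((p for p in FIBONACCI_POINTS if p >= raw_days),
                default=FIBONACCI_POINTS[-1])
    return lower, upper


# The three-way argument from the text: four, five, or six days.
for guess in (4, 5, 6):
    print(guess, "->", allowed_neighbors(guess))
```

A guess of six leaves only five or eight to choose from, and the gap between neighbors widens as the work gets bigger, which is where the built-in allowance for unknowns comes from.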
Finally, once the stories are packaged and the sprint is kicked off, the team plans their tasks using estimation in hours. But we still limit the granularity: the minimum is a half day (three or four hours), with half-day jumps after that. So tasks can be four, eight (a full day), twelve, or sixteen (two days) hours. The next value would be twenty-four, but we discourage teams from creating tasks that large because, frankly, it means they don’t really know how large it is.
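As a rough sketch of that task scale, the snippet below snaps a raw hour guess to the allowed values and signals when a task should be split instead. Rounding up (rather than to the nearest value) is an assumption, as is the function name.

```python
from typing import Optional

# Allowed task sizes in hours, as listed in the text.
ALLOWED_TASK_HOURS = [4, 8, 12, 16]


def task_estimate(raw_hours: float) -> Optional[int]:
    """Return the smallest allowed size that covers the raw guess.

    Returns None when the guess exceeds two days (sixteen hours),
    which is the cue to break the task into smaller pieces.
    """
    for hours in ALLOWED_TASK_HOURS:
        if raw_hours <= hours:
            return hours
    return None


for guess in (1, 5, 16, 20):
    sized = task_estimate(guess)
    print(guess, "->", sized if sized is not None else "split this task")
```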
This is another tough concept for engineers to accept: breaking up what they consider one long task into one- and two-day chunks just so that you can call parts of it done. But it’s an important practice that helps the team see progress on their task board, identify incorrect estimates sooner, and intercede when a team member is having a problem and not admitting it. If a task was planned to take three days, and on the third day the team discovers that the person working on it really hasn’t made two days’ worth of progress, they’ve already lost a full day that they could have used to fix the problem.
The estimate must be provided by the team that is going to do the work. At the feature level, sprint estimates might be done by a gathering of multiple teams. At the story level, if you aren’t sure which team will do the work, get them all together to estimate, or if you change the team assignment, have the new team re-estimate. It’s critical that the team doing the work owns the estimate.
For sprint and story point estimates, we ensure that all those voting agree to the result. It is important to emphasize consensus on estimates done by a team so that nobody on the team can later say, “well, Ted and I voted thirteen but the three people who voted five won. I knew we wouldn’t get the work done.”
To prevent an estimation session from devolving into a design session, or a riot, we follow a three-vote process. After the product owner presents the story or feature, the team can ask some questions and might even do a little more refinement. Then they’re asked to vote. If the vote is not unanimous (if even one person differs), we set a two-minute timer and ask someone with a high or low vote to try to convince the others. If there’s still time in the two minutes for someone else to state an opposing opinion, they can. After two minutes the team must vote again. We repeat this process until three votes are taken, and only then do we take the majority.
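The three-vote process can be sketched as a small function that takes the votes from each round and returns the agreed estimate. The names are hypothetical, and the tie-break (choosing the higher value when no single value has the most votes) is an assumption, since the text does not say what happens in a tie.

```python
from collections import Counter

MAX_ROUNDS = 3  # from the text: the majority is taken only after the third vote


def resolve_estimate(rounds: list) -> tuple:
    """Return (estimate, round_number) for a sequence of voting rounds.

    Each round is a list of the team's votes. A unanimous round ends the
    process immediately; otherwise the third round is decided by majority.
    """
    for round_number, votes in enumerate(rounds[:MAX_ROUNDS], start=1):
        if not votes:
            raise ValueError("a voting round needs at least one vote")
        if len(set(votes)) == 1:
            return votes[0], round_number
        if round_number == MAX_ROUNDS:
            counts = Counter(votes)
            top = max(counts.values())
            # Assumed tie-break: prefer the higher estimate.
            return max(v for v, c in counts.items() if c == top), round_number
    raise ValueError("voting ended before consensus or the third vote")


# Hypothetical session: two split votes, then a majority on the third.
session = [[5, 5, 13, 13, 5], [5, 8, 8, 8, 5], [8, 8, 8, 5, 8]]
print(resolve_estimate(session))
```

The two-minute discussion between rounds is deliberately left out of the sketch; it happens between the lists of votes, not in the code.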
The teams can use this process during planning if they’re having trouble gaining consensus, but we generally find that a mature team can reach agreement over task estimates without the formality.