What does a software development team need in order to deliver a high-quality product in a timely manner, while not killing itself with overwork? For quite a few years, I've advised testers and other development team members to embrace the values, principles, and practices that have been popularized in the Agile development movement. Among those principles:
- Communicate and collaborate constantly with all development and customer team members.
- Use rapid and continual feedback to guide your team.
- Drive development with tests.
- Get the whole team to commit to delivering a high-quality product.
- Start with the simplest solution first.
- Work in tiny increments.
- Use retrospectives to improve continuously.
My teams have done all these things. Yet, more often than we'd like, they manage to miss or misunderstand business requirements, with the result that we don't deliver the business value our customers need. Having to redo a user story or make major changes to it is frustrating for the customer and for the development teams.
Only in the past couple of years have my team and I discovered how critical it is to really understand the business for which we work: not just the parts of the business automated in the particular application at hand, but all aspects of our customers' jobs. I've always prided myself on quickly learning my employer's domain, no matter how difficult. I wish I could go back in time five or ten years and teach myself this lesson sooner. I can't do that, but perhaps you can benefit from my experiences.
The Production Support Dilemma
After a few years of working on software to manage 401(k) plans, my team had a good handle on the various processes, algorithms, and regulatory compliance involved. We released new functionality to production every two weeks, and we had a stable build, backed by thousands of automated regression tests, that we could have released every day if necessary. We implemented most new features in a solid, testable architecture, with very few bugs getting out to production.
However, we still were plagued with a significant production-support burden. Despite our test-first development process and robust code, most issues stemmed from users making mistakes that could only be corrected with offline data updates. Why did this happen? What could we do about human error?
In order to have more time to devote to new development, we had to get to the bottom of these production-support issues. Here's how we decided to proceed:
- We set a team goal of reducing the time spent addressing production-support requests.
- We decided that each of us would spend time learning a different part of the business in detail, looking for sources of problems and ways to mitigate them.
- We agreed to document everything in detail on our team wiki so that everyone could benefit.
- We budgeted time over several months for each of us to sit with a staff member and learn that person's job.