Property 7. Technical Environment with Automated Tests, Configuration Management, and Frequent Integration
The elements I highlight in this property are such well-established core elements that it is embarrassing to have to mention them at all. Let us consider them one at a time and all together.
Automated Testing. Teams do deliver successfully using manual tests, so this can't be considered a critical success factor. However, every programmer I've interviewed who once moved to automated tests swore never to work without them again. I find this nothing short of astonishing.
Their reason has to do with improved quality of life. During the week, they revise sections of code knowing they can quickly check that they haven't inadvertently broken something along the way. When they get code working on Friday, they go home knowing that they will be able on Monday to detect whether anyone has broken it over the weekend: they simply rerun the tests on Monday morning. The tests give them freedom of movement during the day and peace of mind at night.
Configuration Management. The configuration management system allows people to check in their work asynchronously, back changes out, wrap up a particular configuration for release, and roll back to that configuration later on when trouble arises. It lets the developers develop their code both separately and together. It is steadily cited by teams as their most critical noncompiler tool.
Frequent Integration. Many teams integrate the system multiple times a day. If they can't manage that, they do it daily, or, in the worst case, every other day. The more frequently they integrate, the more quickly they detect mistakes, the fewer additional errors that pile up, the fresher their thoughts, and the smaller the region of code that has to be searched for the miscommunication.
The best teams combine all three into continuous integration-with-test. They catch integration-level errors within minutes.
Can you run the system tests to completion without having to be physically present?
Do all your developers check their code into the configuration management system?
Do they put in a useful note about it as they check it in?
Is the system integrated at least twice a week?
How frequent should frequent integration be? There is no fixed answer to this any more than to the question of how long a development iteration should be.
One lead designer reported to me that he was unable to convince anyone on his team to run the build more than three times a week. While he did not find this comfortable, it worked for that project. The team used one-month-long iterations, had osmotic communications, reflective improvement, configuration management, and some automated testing in place. Having those properties in place made the frequency of their frequent integration less critical.
The most advanced teams use a build-and-test machine such as Cruise Control to integrate and test nonstop (note: having this machine running is not yet sufficient . . . the developers have to actually check in their code to the main line code base multiple times a day!). The machine posts the test results to a Web page that team members leave open on their screens at all times. One internationally distributed development team (obviously not using Crystal Clear!) reports that this use of Cruise Control allows the developers to keep abreast of the changing code base, which to some extent mitigates their being in different time zones.
Experiment with different integration frequencies, and find the pace that works for your team. Include this topic as part of your reflective improvement. For more on configuration management, I refer you to Configuration Management Principles and Practice (Hass 2003), Configuration Management Patterns (Berczuk 2003), and Pragmatic Version Control using CVS by the Pragmatic Programmers (Thomas 2003). You may need to hire a consultant to come in for a few days, help set up the configuration management system, and tutor the team on how to use it.
Automated testing means that the person can start the tests running, go away, not having to intervene in or look at the screens, and then come back to find the test results waiting. No human eyes and no fingers are needed in the process. Each person's test suites can be combined into a very large one that can, if needed, be run over the weekend (still needing no human eyes or fingers).
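A minimal sketch of what "no human eyes and no fingers" can look like in practice, assuming Python's standard unittest module. The two suites, the function name run_unattended, and the report path are all hypothetical, invented only for illustration. Each person's suite is combined into one, run without prompting, and the outcome is left in a file for whoever comes back later.

```python
import io
import unittest


class PricingTests(unittest.TestCase):
    """Hypothetical suite owned by one developer."""

    def test_total_is_sum_of_items(self):
        self.assertEqual(sum([5, 7, 8]), 20)


class InventoryTests(unittest.TestCase):
    """Hypothetical suite owned by another developer."""

    def test_removing_item_shrinks_stock(self):
        stock = ["bolt", "nut"]
        stock.remove("nut")
        self.assertEqual(stock, ["bolt"])


def run_unattended(test_cases, report_path):
    """Combine the given TestCase classes, run them, and save the report.

    Nothing is prompted for and nothing needs to be watched: the outcome
    is written to report_path and the result object is returned.
    """
    loader = unittest.TestLoader()
    combined = unittest.TestSuite(
        loader.loadTestsFromTestCase(case) for case in test_cases
    )
    buffer = io.StringIO()
    result = unittest.TextTestRunner(stream=buffer, verbosity=2).run(combined)
    with open(report_path, "w", encoding="utf-8") as report:
        report.write(buffer.getvalue())
    return result
```

A scheduled job could call run_unattended([PricingTests, InventoryTests], "test_report.txt") over the weekend and use result.wasSuccessful() to decide whether to raise an alarm.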
Three questions immediately arise about automated testing:
At what level should they be written?
How automated do they have to be?
How quickly should they run?
Besides usability tests, which are best performed by people outside the project, I find three levels of tests hotly discussed:
Customer-oriented acceptance tests running in front of the GUI and relying on mouse and keyboard movements
Customer-oriented acceptance tests running just behind the GUI, testing the actions of the system without needing a mouse or keyboard simulator
Programmer-oriented function, class, and module tests (commonly called unit tests)
The automated tests that my interviewees are so enthusiastic over are from the latter two of those categories. Automating unit tests allows the programmers to check that their code hasn't accidentally broken out from under them while they are adding new code or improving old code (refactoring). The GUI-less acceptance tests do the same for the integrated system, and are stable over many changes in the system's internal design. Although GUI-less acceptance tests are highly recommended, I rarely find teams using them, for the reason that they require the system architecture to carefully separate the GUI from the function. This is a separation that has been recommended for decades, but few teams manage.
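To make the idea of testing "just behind the GUI" concrete, here is a minimal sketch, assuming a hypothetical OrderService class that holds the function while a GUI (not shown) would do nothing but call it. The class, its methods, and the prices are invented for illustration. The acceptance test drives the same entry points a screen would, with no mouse or keyboard simulator.

```python
import unittest


class OrderService:
    """Hypothetical function layer; a GUI would only call these methods."""

    def __init__(self):
        self._lines = []

    def add_item(self, name, unit_price_cents, quantity):
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        self._lines.append((name, unit_price_cents, quantity))

    def total_cents(self):
        return sum(price * qty for _, price, qty in self._lines)


class OrderAcceptanceTest(unittest.TestCase):
    """Customer-oriented scenarios, driven just behind where the GUI would sit."""

    def test_customer_orders_two_products(self):
        order = OrderService()
        order.add_item("notebook", 250, 4)
        order.add_item("pen", 120, 1)
        self.assertEqual(order.total_cents(), 1120)

    def test_zero_quantity_is_rejected(self):
        order = OrderService()
        with self.assertRaises(ValueError):
            order.add_item("pen", 120, 0)
```

Because the tests name only add_item and total_cents, the screens can be redesigned, or the internals of OrderService rewritten, without touching them.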
Automated GUI-driven system tests are not in the highly recommended short list because they are costly to automate and must be rebuilt with every change of the GUI. This difficulty makes it all the more important that the development team creates an architecture that supports GUI-less acceptance tests.
A programmer's unit tests need to execute in seconds, not minutes. Running that fast, the programmer will not lose her concentration while they run, which means that the tests are actually run as the programmer works. If the tests take several minutes to run, she is unlikely to rerun the tests after typing in just a few lines of new code or moving two lines of code to a new function or class.
Tests may take longer when the code is checked into the configuration management system. At this point, the programmer has completed a sequence of design actions, and can afford to walk away for a few minutes while the tests run.
The acceptance tests can take a long time to run, if needed. I write this sentence advisedly: The reason the tests run a long time should be because there are so many tests or there is a complicated timing sequence involved, not because the test harness is sloppy. Once again, if the tests run quickly, they will get run more often. For some systems, though, the acceptance tests do need to run over the weekend.
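One way to act on these three time budgets is to label each suite with the earliest moment at which it is cheap enough to run. The sketch below assumes Python's unittest module; the tier names ("edit", "checkin", "weekend"), the tier decorator, and run_up_to are all hypothetical, and the test bodies are placeholders.

```python
import time
import unittest

# Hypothetical tiers, ordered from fastest to slowest.
TIERS = ("edit", "checkin", "weekend")


def tier(name):
    """Mark a TestCase class with the earliest moment it is cheap enough to run."""
    if name not in TIERS:
        raise ValueError(f"unknown tier: {name}")

    def mark(case):
        case.tier = name
        return case

    return mark


@tier("edit")
class ParserUnitTests(unittest.TestCase):
    def test_splits_on_commas(self):
        self.assertEqual("a,b".split(","), ["a", "b"])


@tier("checkin")
class StorageTests(unittest.TestCase):
    def test_round_trip(self):
        self.assertEqual(int(str(42)), 42)


@tier("weekend")
class FullAcceptanceTests(unittest.TestCase):
    def test_long_scenario(self):
        self.assertTrue(all(n >= 0 for n in range(1000)))


def run_up_to(moment, cases):
    """Run every case whose tier is at or before `moment`; return (result, seconds)."""
    allowed = TIERS[: TIERS.index(moment) + 1]
    loader = unittest.TestLoader()
    suite = unittest.TestSuite(
        loader.loadTestsFromTestCase(c) for c in cases if c.tier in allowed
    )
    result = unittest.TestResult()
    started = time.perf_counter()
    suite.run(result)
    return result, time.perf_counter() - started
```

The programmer would call run_up_to("edit", ...) after every few lines, the check-in step would call run_up_to("checkin", ...), and the weekend job would run everything. The returned elapsed time makes it easy to notice when the "edit" tier has drifted from seconds toward minutes.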
Crystal Clear does not mandate when the tests get written. Traditionally, programmers and testers write the tests after the code is written. Also traditionally, they don't have much energy to write tests after they write code. Partially for this reason, more and more developers are adopting test-driven development (Beck 2003).
The best way I know to get started with automated testing is to download a language-specific copy of the X-unit test framework (where X is replaced by the language name), invented by Kent Beck. There is JUnit for Java programmers, CppUnit for C++ programmers, and so on for Visual Basic, Scheme, C, and even PHP. Then get one of the books on test-driven development (Beck 2003, Astels 2003) and work through the examples. A Web search will turn up more resources on X-unit.
Both httpUnit and Ward Cunningham's FIT (Framework for Integrated Tests) help with GUI-less acceptance tests. The former is for testing HTML streams of Web-based systems, the latter to allow the business expert to create her own test suites without needing to know about programming. Robert Martin integrated FIT with Ward's wiki technology to create FITnesse. Many teams use spreadsheets to allow the business experts to easily type in scenario data for these system-function tests.
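The spreadsheet approach can be sketched in a few lines. This is not FIT's or FITnesse's API; it is a plain illustration using Python's standard csv module, and the column names, the discount function, and its business rule are all invented for the example. The business expert owns the table, and the programmer owns only the small loop that feeds each row to the system function.

```python
import csv
import io

# Scenario table as a business expert might type it into a spreadsheet and
# export as CSV. The columns and the discount rule are invented for this sketch.
SCENARIOS_CSV = """order_total,is_member,expected_discount
50,no,0
200,no,10
200,yes,30
"""


def discount(order_total, is_member):
    """Hypothetical system function under test (whole currency units)."""
    rate = 5 if order_total >= 100 else 0
    if is_member:
        rate += 10
    return order_total * rate // 100


def check_scenarios(csv_text):
    """Run each row against discount(); return a list of failure messages."""
    failures = []
    rows = csv.DictReader(io.StringIO(csv_text))
    # Start at 2 so messages match spreadsheet row numbers (row 1 is the header).
    for row_number, row in enumerate(rows, start=2):
        actual = discount(int(row["order_total"]), row["is_member"] == "yes")
        expected = int(row["expected_discount"])
        if actual != expected:
            failures.append(f"row {row_number}: expected {expected}, got {actual}")
    return failures
```

An empty list from check_scenarios means every scenario passed. A failure message names the spreadsheet row, so the business expert can find it without reading any code.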
There are, sadly, no good books on designing the system for easy GUI-less acceptance testing. The Mac made the idea of scriptable interfaces mainstream for a short while (Simone url) and scripting is standard with Microsoft Office. In general, however, the practice has submerged and is used by a relatively small number of outstanding developers. The few people I know who could write these books are too busy programming.
I end this section with a small testimonial to test-driven development that I hope will sway one or two readers. Thanks to David Brady for this note:
Yesterday I wrote a function that takes a variable argument, like printf(). That function decomposes the list arguments, and drops the whole mess onto a function pointer. The pointer points to a function on either the console message sink object or a kernel-side memory buffer message sink object. (This is just basic inheritance, but it's all gooky because I'm writing it in C.)
Anyway, in the past I would expect a problem of that complexity to stall me for an indefinite amount of time while I tried to debug all the bizarre and fascinating things that can go wrong with a setup like that.
It took me less than an hour to write the test and the code using test-first.
My test was pretty simple, but coming up with it was probably the hardest part of the whole process. I finally decided that if my function returned the correct number of characters written (same as printf), that I would infer that the function was working.
With the test in place, I had an incredible amount of focus. I knew what I had to make the code do, and there was no need to wander around aimlessly in the code trying to support every possible case. No, it was just "get this test to run." When I had the test running, I was surprised to realize that I was indeed finished. There wasn't anything extra to add; I was actually done!
I usually cut 350 to 400 lines of production-grade code on a good day. Yesterday I didn't feel like I had a particularly good day, but I cut 184 test LOC and 529 production LOC, pLOC that I *know* works, because the tests tell me so, pLOC that includes one of the top-10 trickiest things I've ever done in C (that went from "no idea" to "fully functional" in under 60 minutes).
Wow. I'm sold.
Test infection. Give it a warm, damp place to start, and it'll do the rest. . . .
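The shape of the test described in the note can be sketched as follows. This is not David Brady's code, which was written in C and is not reproduced in the text; it is a loose Python analogue in which every name (ConsoleSink, BufferSink, log_message) is hypothetical. What it keeps from the note is the test-first idea: decide that "returns the number of characters written" is the contract, write that test, then write just enough code to pass it.

```python
import unittest


class ConsoleSink:
    """Stand-in for the console message sink; collects text instead of printing."""

    def __init__(self):
        self.written = []

    def write(self, text):
        self.written.append(text)


class BufferSink:
    """Stand-in for the memory-buffer message sink."""

    def __init__(self):
        self.buffer = bytearray()

    def write(self, text):
        self.buffer.extend(text.encode("utf-8"))


def log_message(sink, fmt, *args):
    """Format like printf, hand the text to the sink, return the character count."""
    text = fmt % args
    sink.write(text)
    return len(text)


class LogMessageTest(unittest.TestCase):
    """Written first: the return value is the main contract being checked."""

    def test_returns_number_of_characters_written(self):
        for sink in (ConsoleSink(), BufferSink()):
            self.assertEqual(log_message(sink, "%s=%d", "x", 42), 4)

    def test_text_reaches_the_sink(self):
        sink = ConsoleSink()
        log_message(sink, "boot ok")
        self.assertEqual(sink.written, ["boot ok"])
```

Passing either sink to the same function plays the role that the function pointer plays in the C version: the caller chooses the destination, and the test does not care which one it is.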