Frame 4: Policy-Driven Waste
Waste is anything that depletes resources of time, effort, space, or money without adding customer value. Most people do not frame their view of a system in terms of waste, so waste tends to fall outside their field of view. You can remove only the waste you see, so it is important to adjust the way you look at work so that the always-present waste becomes clearly visible.
You would be amazed at how much waste in a system is caused by the system itself, that is, by the way the work in the system is done. Too often waste is disguised under the cloak of habit or conventional wisdom—and more often than not, these sources of waste are embedded in the policies and standard procedures of the organization. Unless and until these policies change, the waste is not going to go away.
How Can Policies Cause Waste?
There are many ways that policies can cause waste. For example, leaders in many companies believe that developers should not talk to customers because this is a waste of valuable developer time. In Frame 3: End-to-End Flow, we discussed a critical defect process in which three levels of customer support insulated developers from customers (see Figure 1-3). By simply removing the three levels of customer support and having developers talk directly to customers, 40% of the total time of 800 developers was freed up. This is not an isolated case. We will see in a case study in Chapter 6 that direct developer-customer interaction delivered more of the right content, increased sales, and dramatically reduced support calls.
Of course, simply providing direct customer-developer interaction does not necessarily eliminate the biggest cause of waste; other policies can overwhelm the advantages of good customer interaction.
Another example of policy-driven waste was depicted in the time-to-market value stream map in Frame 3: End-to-End Flow (see Figure 1-4). A quick glance showed that three long queues were the cause of most of the delay in getting products to market, yet the organization was blind to them. This was probably because the queues were not the responsibility of any department manager; their real purpose was to buffer departments from the variation of neighboring departments. Failure to see the impact of queues is often caused by focusing exclusively on department-level performance or by trying to achieve high utilization of scarce talent. These policies leave most companies oblivious to the tremendous drag local optimization has on time-to-market and end-to-end flow and hence on cost, revenue, and yes, even utilization.
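The link between chasing high utilization and growing queues can be made concrete with a small calculation. The sketch below is only an illustration: it assumes the simplest textbook queueing model (one worker, random arrivals, random task sizes, usually called M/M/1), which is our assumption and not something measured in the value stream map of Figure 1-4. The function name and the two-day task size are hypothetical.

```python
def average_wait_in_queue(utilization: float, service_time: float) -> float:
    """Mean time a work item sits in the queue before anyone starts on it.

    Uses the M/M/1 result Wq = rho / (1 - rho) * service_time,
    which only holds for 0 <= rho < 1.
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (1 - utilization) * service_time


if __name__ == "__main__":
    service_days = 2.0  # hypothetical: an average task needs two days of work
    for rho in (0.5, 0.8, 0.9, 0.95):
        wait = average_wait_in_queue(rho, service_days)
        print(f"utilization {rho:.0%}: average wait {wait:.1f} days")
```

Under these assumptions a two-day task waits about 2 days at 50% utilization and about 38 days at 95% utilization. Real development queues will not follow this model exactly, but the shape of the curve is the point: the last few percentage points of utilization are paid for in waiting time.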
The Five Biggest Causes of Policy-Driven Waste
In our experience, the most common causes of policy-driven waste in software development are
- Complexity
- Economies of scale
- Separating decision making from work
- Wishful thinking
- Technical debt
Complexity
In "No Silver Bullet," Fred Brooks wrote, "Software entities are more complex for their size than perhaps any other human construct. . . . Many of the classic problems of developing software products derive from this essential complexity and its nonlinear increases with size." [21] We know this. And yet . . .
- Our software systems contain far more features than are ever going to be used. [22] Those extra features increase the complexity of the code, driving up costs nonlinearly. If even half of our code is unnecessary (a conservative estimate), the cost of the system with that extra code is not just double; it is perhaps ten times what it needs to be. Our best opportunity to improve software development productivity is this: Stop putting features into our systems that aren't absolutely necessary.
- Many of our policies are predicated on the assumption that scope is nonnegotiable. This leaves us no way to stop adding those unnecessary features. We need a process that lets us develop the first 20% of a system, get it in production, get feedback, and add features incrementally as time and money permit. We need policies that say: If something has to be compromised—cost, schedule, or scope—the default choice should routinely be scope.
- Even our measurements give subtle messages that we should squeeze as much code into a system as possible. We measure productivity based on lines of code or function points, as if these things were good. They're not; they're bad. Function points might provide interesting relative data, but they should never be used as performance metrics.
- We need to keep our code bases simple. This means we shouldn't add features until they are needed. Forget just in case; develop just in time. We need architectures that foster incremental development. We need policies that make refactoring—removing complexity introduced when changing the code—a normal and expected part of adding new features.
- Our customers often want to use software to automate their complex processes. This is not a good idea. Business processes should be simplified first and automated later. How often do we help our customers simplify their processes before automating them?
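The arithmetic behind "not just double, perhaps ten times" can be sketched under one explicit assumption: that cost grows as a power of code size. The power-law form and the exponents below are our illustrative assumptions; the text claims only that cost rises nonlinearly. The function names are hypothetical.

```python
import math


def relative_cost(size_ratio: float, exponent: float) -> float:
    """Cost multiplier when size grows by size_ratio, assuming cost ~ size ** exponent."""
    return size_ratio ** exponent


def implied_exponent(size_ratio: float, cost_ratio: float) -> float:
    """Exponent that would turn the given size ratio into the given cost ratio."""
    return math.log(cost_ratio) / math.log(size_ratio)


if __name__ == "__main__":
    # Doubling the code base under a few assumed exponents.
    for exponent in (1.0, 2.0, 3.0):
        print(f"exponent {exponent}: cost x{relative_cost(2.0, exponent):.1f}")
    # What exponent would make twice the code cost ten times as much?
    print(f"implied exponent: {implied_exponent(2.0, 10.0):.2f}")
```

With a linear relationship (exponent 1), twice the code costs twice as much. A tenfold cost for twice the code corresponds to an exponent a little above 3 in this model. Whether any real code base behaves that way is an empirical question; the sketch only shows how quickly a nonlinear relationship punishes extra size.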
The lean frame of reference focuses on simplicity. Lean thinkers know that complexity clogs up the flow of work and inevitably slows things down. The cost of complexity is hidden; it has a second-order effect on cost, so we just don't see it in our financial systems. This makes complexity all the more pernicious—it's hard to cost-justify spending money to keep things simple.
At the end of this chapter we introduce the concept of ideation, the process of coming up with a design so fitting to the problem that it seems inevitable. Solutions that "just fit" are necessarily simple; great designs always make us wonder how something so obvious could have escaped us for so long.
Economies of Scale
Many of our instincts, policies, and procedures are rooted in the economies of scale, which drove huge improvements in productivity as industrial production replaced craft production in the first half of the twentieth century. But during the second half of that century, it became apparent that in any system with high variety, the economies of flow outperform the economies of scale, even in manufacturing. Software groups develop one-of-a-kind systems—the essence of variety. It should be obvious that we should base our policies and processes on the economies of flow. And yet . . .
- It's difficult to abandon batch and queue mentality. We sort work into batches so we can assign each batch to the appropriate specialist, making maximum use of the specialist's time and skills. Full utilization of our most skilled workers is considered essential, and people are conditioned to focus on doing their part of the work without regard to its impact on the next step or the final customers. As a result, neither our workflow nor our workers are capable of absorbing variety.
- It is so difficult to abandon batch and queue mentality that we fail to see queues that are staring us in the face. We can't figure out why it takes so long for things to move through our backlog-laden processes. We are blind to lists of customer requests that would take years to clear. We can't bring ourselves to shorten our queues because it would mean saying no to customers rather than letting their requests die a slow death.
- Instead of designing a system that can absorb urgent requests, we pull workers off their current job to rush a yet-more-important job through our system. We ask people to work on three, five, ten, or more things at once. We are blind to the enormous amount of time wasted in context switching. It never occurs to us that if we did one thing at a time rather than three, everything would get done a lot faster and we would deliver value a lot sooner.
- Sometimes we do let people work on one thing at a time, and then use computer systems to make sure that everyone is busy all of the time. We schedule projects and assign teams with an eye to full utilization. This scheme has little capacity to absorb ever-present variation, so the variation ends up being absorbed in the ramp-up time of newly formed teams. It would be much better to assign work to established teams than to reconstitute teams around projects.
- We create annual budgets or long project plans that justify every person by committing to what they will deliver. Then we dump this big batch of work on our organization all at once. We must deliver everything that was promised, but as time goes on, reality intervenes. Customers want other things but aren't willing to pay more or give up what was promised. We know this system never works, but we don't see a way to escape the policy of making big batch promises.
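The claim that doing one thing at a time gets everything done sooner can be checked with a toy model. The sketch below assumes one worker, three equally sized tasks, and a fixed time penalty for every switch between tasks. All of the numbers and function names are hypothetical; real context-switching costs vary widely.

```python
from typing import List, Optional


def sequential_finish_times(durations: List[int]) -> List[int]:
    """Finish time of each task when tasks are done one at a time, in order."""
    finish_times = []
    clock = 0
    for duration in durations:
        clock += duration
        finish_times.append(clock)
    return finish_times


def round_robin_finish_times(
    durations: List[int], slice_size: int, switch_cost: int = 0
) -> List[int]:
    """Finish time of each task when one worker rotates among all open tasks.

    The worker spends up to slice_size hours on a task, then moves to the
    next open task, losing switch_cost hours each time the task changes.
    """
    if slice_size <= 0:
        raise ValueError("slice_size must be positive")
    remaining = list(durations)
    finish_times = [0] * len(durations)
    clock = 0
    last_task: Optional[int] = None
    while any(hours > 0 for hours in remaining):
        for task, hours in enumerate(remaining):
            if hours <= 0:
                continue  # this task is already finished
            if last_task is not None and last_task != task:
                clock += switch_cost  # time lost to changing context
            worked = min(slice_size, hours)
            clock += worked
            remaining[task] -= worked
            last_task = task
            if remaining[task] == 0:
                finish_times[task] = clock
    return finish_times


if __name__ == "__main__":
    tasks = [40, 40, 40]  # hypothetical: three tasks of 40 hours each
    print("one at a time:      ", sequential_finish_times(tasks))
    print("rotating, no cost:  ", round_robin_finish_times(tasks, slice_size=4))
    print("rotating, 1h switch:", round_robin_finish_times(tasks, 4, switch_cost=1))
```

Even with no switching penalty at all, rotating among the three tasks delays the first two deliveries from hours 40 and 80 to hours 112 and 116, while the last task finishes no earlier. Adding a switching penalty makes every task later still.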
Lean thinking uses economies of flow, rather than economies of scale, to frame the world we look at. Variety is an essential ingredient of software development, and so we need processes that absorb the variety gracefully. We will discuss such flow processes in Chapter 3. The problem is, when the world is framed with economies of scale, these approaches seem counterintuitive.
Separating Decision Making from Work
Any solution to a problem is necessarily simplified and abstracted when removed from its source, and it is for this reason that a design should not be separated from the concrete context in which it is implemented. We know that our deepest insights into our work are based on the tacit knowledge we get from being there: watching, experiencing, and getting our hands dirty. We know that great designs come from designers who are deeply engaged with solving the problem. We know that throwing things over the wall doesn't work. And yet . . .
- There is widespread belief that it's not necessary for managers to understand the work they manage. Yet without a technical background, managers are not in a position to provide guidance to technical workers. Some managers simply establish targets and leave it to workers to figure out how to meet them. Others put a team together and charter the members with figuring out how to do the right thing. From a lean perspective, the fundamental job of managers is to understand how the work they manage works, and then focus on how to make it better. [23] This is not to say that all leaders must know all the answers; the critical thing is that they know what questions to ask.
- We have created organizational cultures where the only available career path is to leave the technical details behind. We do not honor or adequately reward the seasoned architect or brilliant user interaction designer. What future awaits a tech lead who can reinvent the testing process and put together a set of tools that runs every bit of code through every operating system and database every single night? When the only career path for these people is to leave their core expertise behind and become managers, we will never have top-notch technical leadership on the ground, where we need it.
- In Lean Product and Process Development, Allen Ward writes that handoffs (handovers) are the biggest waste in product development. He says that a handoff occurs whenever we separate responsibility (what to do), knowledge (how to do it), action (doing the work), and feedback (learning from the results). [24] Our processes are full of these handovers, and we fail to see what's wrong with them. But in practice, many day-to-day decisions are based on tacit knowledge, which gets left behind in a handover. We must think that tacit knowledge transfers by magic, if we understand tacit knowledge in the first place.
- Our language betrays us when we talk about "The Business." With these words we separate development decisions from the work they are automating (see Figure 1-5).
Figure 1-5 From "the" business to "our" business
Even agile software development methodologies make this mistake. They may recommend a "customer" or a "product owner" who is supposed to decide what all of our customers want and prioritize the order of development. But the most successful development occurs when developers talk directly to customers or are part of business teams. And those things called requirements? They are really candidate solutions; separating requirements from implementation is just another form of handover.
- Once our systems are deployed, there are policies and interpretations of laws (Sarbanes-Oxley, for example) that keep developers away from their code. So we walk away and leave the support team to deal with any problems that occur. The support team members are the ones who get the phone calls in the middle of the night, yet we wonder why they don't like frequent deployments. They are the victims of our risky design practices, but we are not interested in hearing about the causes of system failures. Perhaps our world would be a better place if all developers had to walk in the shoes of the operations and support team for a month every year.
Ever since Adam Smith wrote about division of labor in a pin factory, [25] it has been commonly accepted wisdom that division of labor increases productivity: the more specialization, the better. Fortunately, this "fact" was lost on Toyota's Taiichi Ohno, who devised a system where multiskilled workers and easy-to-reconfigure machines are more productive than specialists. There are two reasons for this: First, lean systems are designed to absorb variety, and second, they are designed to be relentlessly improved as the workers devise ever better ways to do the work. In the context of system development, Adam Smith was dead wrong.
Wishful Thinking
Frederick Winslow Taylor got one thing right. He insisted that work improvement should be based on the scientific method. Taiichi Ohno embraced this idea—but instead of having "experts" measure and improve the work of production workers, he trained production workers to measure and improve their own work. We know that making decisions based on data rather than opinion is the right approach. Being in software, we can create tools to gather any data we want any day of the week. And yet . . .
- We chase the latest ideas in software development without bothering with the scientific method. We think it is a waste of time to understand the theory, create hypotheses, run experiments, gather data, and find out what really works in our environment. We fail to appreciate that "best practices" are somebody else's solutions to their problems, not necessarily the right solutions to our problems. We adopt new development approaches with an unhealthy dose of wishful thinking, rather than determining the most appropriate practices for our environment—and then we are surprised at the disappointing results.
- We manage by looking at single data points instead of a series of data in context. We set targets without understanding our process capability relative to the target. We don't appreciate the fact that trying to remove normal (common-cause) variation will make the situation worse, not better.
- We don't like uncertainty, so we try to make decisions and get them out of the way. Our natural inclination is to look at one alternative for solving a problem, because we think that's cheaper and faster than looking at several alternatives. For tough problems, this is usually wrong; making early decisions when we are the most ignorant is the least likely way to get good results and the most likely way to force us to start over again.
- Despite our prowess in handling information, we have very few techniques for preserving knowledge. One approach has been to collect massive, detailed documents. But who reads them? Even search engines fail us. Another approach has been whiteboards, coupled with a camera if we really need to save the sketches. Still another approach is to video a whiteboard talk on the fundamental architecture of an application. There is no doubt a middle ground, but we're still searching for it. We might learn a lesson from the open-source movement, where all communication is written and the focus is on making the communication system extremely simple, appropriately concise, quickly searchable, and never bypassed by verbal communication.
- We feel a great sense of accomplishment when our code passes its tests and we celebrate because it meets the specification—as if the specification could contain everything we needed to think about. What about that security hole or the memory leak or the ungraceful exit from the database that occasionally causes a lockup? How easy will the system be to install, interface to, populate with data? Thinking we're done when the regression tests pass is wishful thinking.
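The point about common-cause variation can be demonstrated with a short simulation. The sketch below is a simplified version of what is often called the funnel experiment: a stable process produces results that scatter randomly around a target, and a well-meaning manager "corrects" the process after every result by the amount of the last miss. The noise model, the adjustment rule, and all names are our illustrative assumptions.

```python
import random
from typing import List


def run_process(samples: int, sigma: float, tamper: bool, seed: int = 0) -> List[float]:
    """Simulate a stable process whose results scatter around a target of 0.

    If tamper is True, the process setting is shifted after every result by
    the opposite of that result's deviation from the target.
    """
    rng = random.Random(seed)  # fixed seed so both runs see the same noise
    target = 0.0
    setting = 0.0
    outcomes = []
    for _ in range(samples):
        outcome = setting + rng.gauss(0.0, sigma)  # common-cause noise only
        outcomes.append(outcome)
        if tamper:
            setting -= outcome - target  # "fix" the last miss
    return outcomes


def mean_squared_miss(outcomes: List[float], target: float = 0.0) -> float:
    """Average squared distance between the results and the target."""
    return sum((value - target) ** 2 for value in outcomes) / len(outcomes)


if __name__ == "__main__":
    left_alone = mean_squared_miss(run_process(10_000, 1.0, tamper=False))
    adjusted = mean_squared_miss(run_process(10_000, 1.0, tamper=True))
    print(f"left alone: {left_alone:.2f}")
    print(f"adjusted after every result: {adjusted:.2f}")
    print(f"ratio: {adjusted / left_alone:.2f}")
```

In this model, reacting to every data point roughly doubles the spread of results, because each adjustment adds the previous point's noise to the next one. A process that shows only common-cause variation has to be improved by changing the process itself, not by chasing individual data points.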
When you think of learning as uncovering the shortcomings of plans, somehow plan-driven development loses its charm. Instead, creating useful knowledge becomes the essence of developing a new product. But there's more to learn about than the product being developed; we also learn about our process for developing products. Constant learning is the essence of improving both the product itself and the product development process.
Technical Debt
We know that all successful software gets changed. [28] So if we think we're working on code that will be successful, we know we need to keep it easy to change. Anything that makes code difficult to change is technical debt. [29] We know that technical debt drives the total cost of software ownership relentlessly higher, and that eventually we will have to pay it off or the system will go bankrupt. And yet . . .
- We tolerate obscure code, instead of making sure that all code reveals its intentions to the next person who comes along. Developers, especially apprentices, should be taught how to write "clean code" [30]: code that is simple and direct, with straightforward logic. Senior technical people need to ensure that messy code, even if it passes the tests, is never admitted into the code base.
- Far too often we don't take the time for refactoring: consolidating changes into existing code. Refactoring is essential for iterative development. Adding new features to existing code creates complexity, ambiguity, and duplication; refactoring pays down the debt.
- We run regression tests on our systems before deployment. At first they are quick, but with each addition of code, regression tests take longer and longer and longer. As the regression deficit [31] grows, we increase the interval between releases. The only way to break this unending cycle of increasing release overhead is to decrease the regression deficit. If we had started with automated test harnesses back when the code base was small and added to and maintained them, we could make changes to our code almost as quickly today as when the code base was new.
- We know that dependencies are one of the biggest generators of technical debt, and yet we are ambivalent about replacing obsolete systems with massive dependencies. We must develop, and migrate to, architectures that minimize dependencies. We have known for a long time how to do this: Focus on information hiding [32] and separation of concerns [33].
- We branch code for many reasons: to isolate new development, to focus on an individual application, to create parallel feature sets. And we know that the longer two branches of code are apart, the harder they will be to merge. And yet we wait for days to build our code, and worse, we delay system testing until the end of development. We don't realize that this isn't necessary anymore; the big bang is obsolete.
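Information hiding and intention-revealing code are easier to recognize in an example than in a definition. The sketch below is a minimal illustration, not a recommended architecture: the calling code depends only on a small interface, so the storage behind it can be replaced without touching the callers. Every class, function, and address in it is hypothetical.

```python
from typing import Dict, Protocol


class CustomerDirectory(Protocol):
    """The only thing callers are allowed to know about customer storage."""

    def email_for(self, customer_id: str) -> str:
        ...


class InMemoryCustomerDirectory:
    """One possible implementation; a database-backed one could replace it."""

    def __init__(self, emails: Dict[str, str]) -> None:
        self._emails = dict(emails)  # hidden detail: callers never touch this

    def email_for(self, customer_id: str) -> str:
        return self._emails[customer_id]


def renewal_notice(directory: CustomerDirectory, customer_id: str) -> str:
    """Build a renewal notice; knows nothing about how emails are stored."""
    recipient = directory.email_for(customer_id)
    return f"To: {recipient}\nYour subscription is due for renewal."


if __name__ == "__main__":
    directory = InMemoryCustomerDirectory({"c-1": "pat@example.com"})
    print(renewal_notice(directory, "c-1"))
```

Because `renewal_notice` depends on the interface rather than on a dictionary or a database table, a change in how customers are stored creates no new dependency to chase through the code. The same boundary also gives automated regression tests a cheap place to substitute a small in-memory implementation.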
We need to expose technical debt for what it is: a costly burden to be avoided lest it lead us into bankruptcy. Chapter 2 will look at software development through a technical frame and discuss solid techniques for avoiding technical debt.