Quality Software and Plentiful Resources
People like to take potshots at open source—especially threatened ISV executives. Realistically, it’s probably hard to imagine how thousands of developers, from all over the world, without bosses, compensation, deadlines, or fear of retribution can create software solutions that rival the best efforts of proprietary companies. This section takes an in-depth look at the development process, with the objective of illustrating just how open source code gets written. It also shows just how deep the resource pool for open source development and administration talent really is.
Who Are Open Source Developers?
Linux, Apache, and several other mainstream solutions have shown that collaborative open source software projects can be successful. "Why open source works" is the big question—one that reportedly has puzzled even Bill Gates. "Who can afford to do professional work for nothing? What hobbyist can put three man-years into programming, finding all the bugs, documenting his product, and distributing it for free?" Gates once wrote.
Steven Weber, a political science professor at UC Berkeley, has researched the motivations of Linux developers and summarized the reasons why they do what they do in the following categories (Source: The Success of Open Source, Harvard University Press, 2004):
Job as a vocation—In the process of "scratching a personal itch," developers solve a problem that empowers them. With virtually no distribution costs, they can empower others as well.
Ego boosting—The challenge of programming, of solving a problem with visual evidence of the accomplishment, is a source of satisfaction.
Reputation—Open source developers strongly identify with the community. As a result, good work leads to recognition and reputation within the community—and potentially to better work and more reward.
Identity and belief systems—Much of the open source community culture is rooted in the "freedom" movement espoused by Richard Stallman, which includes principles of free software, free access to data, wariness of authority, appreciation for art and fun, and judgment of value based on creation over credentials. Developers strongly identify with this.
The joint enemy—Uniting in benevolent purpose against a common enemy is at least an element of motivation for open source developers.
Art and beauty—Code either works or it doesn’t, but the elegance of a simple solution is art—the difference between "clean" and "ugly" code. With open source, a creation can be shared with others.
From this research, you can see that open source developers are not merely geeks united in a hobby-love for computers, with code as a by-product of their cyber fun. Their motivations are deep and real, and they closely parallel the motivations for almost any other productive endeavor, whether it’s driven by profit, altruism, self-interest, or religious belief. Bottom line, the open source movement is real, with traction provided by a competent, skilled development force.
What do we know about who open source developers really are? Concrete data on demographics is scarce, but some estimates give us an idea. The credits in version 2.3 of the Linux code listed developers from over 30 countries. The community is definitely far-flung and international. More developers have .com addresses than .edu, indicating that many are working at for-profit institutions. The O’Reilly Open Source Convention attendee list included people from aerospace (Boeing, Lockheed Martin, General Dynamics, Raytheon, NASA); computers and semiconductors (Agilent, Apple, Fujitsu, HP, Intel, IBM, Philips, Intuit, Macromedia, SAIC, Sun, Texas Instruments, Veritas); telecom (AT&T Wireless, Nokia, Qualcomm, Verizon Wireless); finance, insurance, and accounting (Barclays Global Investors, Morgan Stanley, Federal Reserve Bank, PricewaterhouseCoopers, Prudential); media (AOL Time Warner, BBC, Disney, LexisNexis, Reuters, USA Today, Yahoo!); and pharmaceuticals (GlaxoSmithKline, McKesson, Merck, Novartis, Pfizer).
A 2002 study conducted by Boston Consulting Group, drawing on SourceForge and Linux kernel email lists, produced quantified demographics on the Free/Open Source Software community, including the following:
Seventy percent are Generation X, between 22 and 37, with an average age of 28.
They typically volunteer between 8 and 14 hours per week to open source projects.
Fifty-five percent are professional programmers, IT managers, or system administrators; 20% are students.
The average person has 11 years of programming experience.
Interesting quotes from survey respondents included the following: "How often in ’real life’ do you find Christians, Jews, Muslims, Atheists, Whites, Blacks, Hispanics, Republicans, Democrats, and Anarchists all working together without killing one another?" (NYC IT consultant) and "People will always want to contribute what they know to a project that scratches an itch. Open software will continue to depend on projects that meet people where they need help" (San Jose IT manager).
How Does the Open Source Process Work?
Each open source project can have its own development process. After all, there is no forced hierarchy, preferred method, or established "right way" with open source. However, an open source project will generally include the elements illustrated in the flowchart that BCG developed after its 2002 research (see Figure 2.1).
Figure 2.1 A general map of the open source development process.
The process can be summed up as follows:
An itch develops—It might be as simple as someone wanting to organize digital photos to share with family members, or as complex as the need to standardize customer information across divisions of a multinational company. The itch is common; that is, the need is shared across a larger audience than just one.
Germination occurs—The inception of a project could take one of several forms. It might be Linux-like in that one person establishes an initial code base. It might be Apache-like, with a committee meeting to clarify need and establish direction. It might be Mozilla-like, with a gifting of free code. It might be the posting of a project to SourceForge. Any activity or event that produces evidence of productive activity toward scratching the itch can be considered germination.
The project takes root—From inception, the project begins to grow. Several activities can be part of this phase. A "home" is provided for the project, which includes an accessible storage location for code. Communication channels are established, which can be as simple as trading emails or as extensive as mailing lists with subgroups. Informal leadership is established based on respect and trust. Norms evolve informally, shaping communication, interaction, and productivity. Word of the project is spread to others who might have the same itch.
Cultivation—The often iterative process of creating, submitting, testing, and evaluating code occurs. Here users, developers, or user/developers work to create, enhance, and refine the product. The code might be housed in a CVS-type environment in which bugs and feature requests are tracked and successive iterations of code are progressively versioned, with the objective of reaching a state of hardened, production-quality code.
Recurring harvest—Open source code is released when it reaches a state that is useful to some population. It does not go dormant, but continues the cycle of enhancement, often with many releases (Raymond states, "release early, release often"). At this point, the software is often valuable for productive use within the project community.
Commercial productive use—Some projects (with widespread itch appeal) are released for general use and become part of mainstream, commercial solutions. A standard "open" license is applied to the product, it is adopted by commercial ISVs or OEMs, and support is provided. Responsible commercial organizations join the community, often contributing back to the project in terms of manpower, hosting services, support, leadership, and enhancements, or by contributing related technologies to the effort.
Several key principles facilitate the development of open source software. Without these key elements, it would be much more difficult to completely evolve a project to productive reality. These elements include code modularization, peer review, developers as users, and, of course, an effective communication infrastructure:
Code modularization—In the Linux example, you will see that the Linux kernel project is modularized with subprojects for the kernel state, security, device I/O, networking, file system, process management, memory management, and more. In addition, many more projects for device drivers, functions, utilities, and applications are available. The elegance of Linux is due in part to its architecture, which supports intercommunication among these separate modules. Packages (distinct collections of functional code or objects that can be loaded or unloaded without affecting the kernel or other packages) are a reflection of this modular architecture.
"It is suggested that mindful implementation of the principles of modularity may improve the rate of success of many Free/Open Source software projects." This assertion is developed in detail in the paper, "Free/Open Source Software as a Complex Modular System" by Narduzzo and Rossi.
Peer review—The term peer review might not be completely descriptive of this element, but it encompasses the idea that a person’s contribution is subject to observation and analysis by the entire open community. One’s contribution is not shielded by proprietary binaries. The process can include kudos as well as flames, acceptance or rejection, reputation, and responsibility. As a result, the established norms of peer review tend to produce quality code. Solutions, if not at first, eventually evolve to become effective, simple, and elegant.
Developers as users—Much has been said about the ineffectiveness of the multistep, silo-prone process of gathering marketing requirements from users, feeding that to developers who create code based on perceived need and theory, and then throwing it over the wall to be sold to users. With open source, the majority of the initial users are also developers. They solve their own problems, and in doing so create more effective solutions. This eliminates miscommunication and also dramatically speeds the development cycle. According to Raymond, both developers and users "develop a shared representation grounded in the actual code."
Communication infrastructure—Without the Internet, the open source process would be impossible. Global access, instant communication, shared storage, and open standards are all elements that are key to the development of open source.
And the result? Open source development leads to quality software, and the caliber of open source code can be quantified. Reasoning is a security inspection service that uses automated technology to uncover coding flaws, such as memory leaks, out-of-bounds array access, bad deallocations, and uninitialized variables. In comparing two open source solutions (MySQL and the Linux TCP/IP stack) with their commercially available counterparts, Reasoning found that defect density in the open source solutions was significantly lower than in the commercial versions.
The defect density for MySQL was 0.09 per thousand lines of code, compared with an average of 0.57 for commercial databases, whose densities ranged from 0.36 to 0.71. The defect density for the Linux TCP/IP stack was 0.10 per thousand lines of code, compared with a range from 0.15 for the best of the commercial stacks to more than 1.0 for the worst (http://www.reasoning.com).
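To put these figures side by side, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the densities are the ones quoted above, while the function names and the raw counts in the usage example (9 defects in 100,000 lines) are hypothetical and do not come from the Reasoning study.

```python
# Defect densities (defects per thousand lines of code) as quoted in the text.
MYSQL = 0.09
COMMERCIAL_DB_AVERAGE = 0.57
COMMERCIAL_DB_BEST = 0.36
COMMERCIAL_DB_WORST = 0.71
LINUX_TCPIP = 0.10
COMMERCIAL_STACK_BEST = 0.15


def defect_density(defects: int, lines_of_code: int) -> float:
    """Return defects per thousand lines of code (hypothetical helper)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defects / (lines_of_code / 1000)


def times_higher(commercial: float, open_source: float) -> float:
    """Return how many times higher the commercial density is (hypothetical helper)."""
    return commercial / open_source


if __name__ == "__main__":
    # Hypothetical raw counts chosen only to show how a density is derived.
    print(f"9 defects in 100,000 lines: {defect_density(9, 100_000):.2f} per KLOC")

    comparisons = [
        ("Commercial database average vs. MySQL", COMMERCIAL_DB_AVERAGE, MYSQL),
        ("Best commercial database vs. MySQL", COMMERCIAL_DB_BEST, MYSQL),
        ("Worst commercial database vs. MySQL", COMMERCIAL_DB_WORST, MYSQL),
        ("Best commercial stack vs. Linux TCP/IP", COMMERCIAL_STACK_BEST, LINUX_TCPIP),
    ]
    for label, commercial, open_source in comparisons:
        print(f"{label}: {times_higher(commercial, open_source):.1f}x")
```

By this arithmetic, the average commercial database carried roughly six times the defect density of MySQL, and even the best commercial TCP/IP stack carried about one and a half times that of Linux.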
Linus Torvalds simplifies the concept. "I think, fundamentally, open source does tend to be more stable software. It’s the right way to do things. I compare it to science vs. witchcraft. In science, the whole system builds on people looking at other people’s results and building on top of them. In witchcraft, somebody had a small secret and guarded it—but never allowed others to really understand it and build on it...When problems get serious enough, you can’t have one person or one company guarding their secrets. You have to have everybody share in knowledge."