Item 14: Use Patch

The patch program is considered by some to be the prime enabler behind the success of open source software. To quote Eric Raymond, "The patch program did more than any other single tool to enable collaborative development over the Internet—a method that would revitalize UNIX after 1990" (The Art of UNIX Programming, Addison-Wesley, 2003). Of course, it is hard to imagine patch taking all the credit; after all, what would development be without vi(1)? But still, there is a ring of truth in what he says.

In the open source community, at any given moment, on any given project, there are dozens, if not hundreds, of developers, all working on some derivation of what is currently on the tip (or branch) of some source code repository. All of them are working relatively blind to the changes that their counterparts are making to the same body of source code.

An Example

Integrating (and evaluating) the changes made to a shared body of source code in such an environment can be difficult and error prone without a tool like patch. To see how, consider a team of three developers (A, B, and C) all working from the tip of the repository. Developer B is the team lead, and his job is to perform code reviews for Developers A and C, and integrate their changes into the source tree once an acceptable code review has been obtained. He also does development on the same body of source code, because he owns the overall architecture.

Let's say that Developer A finishes his work and is in need of a code review. To obtain the code review, Developer A needs to communicate his changes to Developer B. I've seen this done a few different ways over the years:

  • Developer A copies and pastes the changes made to the source file(s) into a text file, and sends the result to Developer B. In addition, Developer A adds comments to the text file to describe what the changes are, and where in the original source file the changes were made (or Developer A e-mails this information separately to Developer B). This is perhaps the worst method of all for conducting a code review, for two reasons:
    1. Developer A may make a mistake and not copy and paste all the changes that were made, or miss entire source files that have modifications. The omission of a single changed line can greatly affect the ability of a code reviewer to accurately perform his task. Worse yet, if the code reviewer is responsible for integrating the changes into the repository and changes were missed, the process will surely lead to bugs.
    2. Even if all changes are copied and pasted by Developer A, there is a chance that context will be lost or incorrectly communicated. One way to counter this problem would be for Developer A to include extra lines above and below the code that actually changed, but this is a better job for a tool like cvs diff, which can generate a patch file that contains the needed lines of context.
  • Developer A sends to Developer B copies of all the source files that were changed. This is better than sending a series of hand-constructed diffs, because Developer B can now take the source files and create a patch that correctly captures the changes made by Developer A, along with the context of those changes. If Developer A sends source files that are not being modified by Developer B, Developer B can simply use the diff program (not cvs diff or svn diff) to generate a patch file relative to his current working tree. If Developer A, however, sends changes that do affect files modified by Developer B, Developer B can either diff against his working tree to see the changes in the context of work he is performing, or Developer B can pull a new tree somewhere and generate a patch file from it. The actual method used is usually best determined by the code reviewer. The downsides of this method are as follows:
    1. It is error prone. (Developer A might forget to include source files that contain changes.)
    2. It places a burden on the code reviewer to generate a patch file. The last thing you want to do on a large project is make more work for the code reviewer. A code reviewer is usually struggling to keep up not only with his own development tasks, but also with all the code review requests that are pouring in. Anything you can do to make his job easier will generally be appreciated (and may result in the code reviewer giving your requests a higher priority).
  • Developer A generates a patch file using cvs diff or svn diff, and sends it to the code reviewer. This is the best method because
    1. The changes are relative to Developer A's source tree.
    2. cvs diff won't miss any changes that were made, assuming that cvs diff is run at a high-enough level of the source tree. (There is one exception: new source files that have not yet been committed to the repository are left out of the patch file unless you remember to pass the -N argument to cvs diff. [This is not a problem with svn diff, which automatically includes new source files in its diff output once they have been scheduled for addition with svn add.])
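The second and third approaches both end with a patch file. As a rough sketch of what producing one looks like without a repository, the following uses the plain diff program on two copies of a tree. The directory and file names (orig, dev_a, helper.h, changes.patch) are invented for illustration, and the options assume GNU or BSD diff; with cvs diff, the -N option plays the same role for new files.

```shell
set -eu
work=$(mktemp -d)        # scratch area; every name below is hypothetical
cd "$work"

# A pristine tree and Developer A's modified copy of it.
mkdir -p orig/layout dev_a/layout
printf 'int main(void)\n{\n    return 0;\n}\n' > orig/layout/layout.cpp
printf 'int LayoutMain(void)\n{\n    return 0;\n}\n' > dev_a/layout/layout.cpp
printf 'int helper(void);\n' > dev_a/layout/helper.h    # a brand-new file

# -r recurses, -u emits unified diffs, and -N treats a missing file as empty
# so that new files end up in the patch. diff exits with status 1 when it
# finds differences, which is expected here.
diff -ruN orig dev_a > changes.patch || [ $? -eq 1 ]

# List the files that the patch touches.
grep '^+++ ' changes.patch
```

Running diff from the top of the tree is what keeps a change from being missed; dropping -N would silently leave helper.h out of the patch.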

After the code reviewer (Developer B) receives the patch from Developer A, he or she has a few options:

  • Simply look at the patch file, performing the code review based on its contents alone. Most of the time, this is what I do, especially if the patch file is a unified diff (as it should always be), and if the changes do not intersect any that I am making.
  • Apply the patch file to his local tree, build the result, and then perhaps test it. This can be helpful if Developer B would like to step through an execution of the code in a debugger, or to see that the patch builds correctly and without warnings. If Developer A has made changes to some of the source files that were modified by Developer B, Developer B can either
    1. Pull a new tree and apply the patch to it so that his or her changes are not affected.
    2. Use cvs diff to generate a patch file that contains his own changes, and then attempt to apply the changes from Developer A into his source tree. This allows Developer B not only to see the changes made by Developer A, but also to see them in the context of the changes that he is making. When the code review has been completed, Developer B can continue working on his changes, and check both his and Developer A's changes in at a later time, or Developer B can have Developer A check in the changes, and then do a cvs or svn update to get in sync.

The patch program is the tool used by a code reviewer to apply changes specified in a patch file to a local copy of the repository. In essence, if both you and I have a copy of the same source tree, you can use cvs diff to generate a patch file containing changes you have made to your copy of the sources, and then I can use the patch program, along with your patch file, to apply those changes to my copy of the sources. The patch program tries very hard to do its job accurately, even if the copy of the sources the patch is being applied to has been changed in some unrelated way. The type of diff contained in the patch file affects the accuracy attained by the patch program; patch is generally more successful if it is fed a set of context diffs rather than normal diffs. The cvs diff -u3 syntax (unified diff with three lines of context) is enough to generate a patch file that gives a good result. (SVN by default generates unified diffs with three lines of context.)
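To make the point about unrelated changes concrete, here is a small sketch in which the reviewer's copy of a file has drifted from the copy the patch was made against. It assumes the diff and patch programs found on most UNIX-like systems; the directories base, author, and reviewer and the file contents are made up for the example.

```shell
set -eu
work=$(mktemp -d)        # scratch area; every name below is hypothetical
cd "$work"
mkdir base author reviewer

# Fifteen numbered lines stand in for a source file.
seq 1 15 | sed 's/^/line /' > base/file.txt

# The author changes line 8. The reviewer has added an unrelated line at the top.
sed 's/^line 8$/line 8 (changed by author)/' base/file.txt > author/file.txt
{ echo 'unrelated reviewer change'; cat base/file.txt; } > reviewer/file.txt

# Unified diff with three lines of context. diff exits 1 when the files differ.
diff -U3 base/file.txt author/file.txt > author.patch || [ $? -eq 1 ]

# -p1 strips "base/" and "author/" so that patch looks for file.txt here.
(cd reviewer && patch -p1 < ../author.patch)

head -n 1 reviewer/file.txt
grep -n 'changed by author' reviewer/file.txt
```

The hunk was recorded against line numbers that no longer match the reviewer's file, but the three lines of context on either side of the change let patch find the right spot anyway.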

Patch Options

The patch program has a number of options. (You can refer to the patch man page for more details.) However, in practice, the only option that matters much is the -p argument, which is used to align the paths used in the patch file with the local directory structure containing the sources that the patch is being applied to. When you run cvs diff to create a patch file, it is best to do it from within the source tree, at the highest level in the directory hierarchy necessary to include all the files that have changes. The resulting patch file will, for each file that has changes, identify the file with a relative path, and patch uses this relative path to figure out what file in the target directory to apply changes to. For example:

Index: layout/layout.cpp
===================================================================
RCS file: /usr/cvsroot/crossplatform/layout/layout.cpp,v
retrieving revision 1.33
diff -u -3 -r1.33 layout.cpp
--- layout/layout.cpp   27 May 2006 09:31:47 -0000  1.33
+++ layout/layout.cpp   7 Jun 2006 10:43:22 -0000
@@ -327,7 +327,7 @@
    return document;
 }

-int main(int argc, char *argv[])
+int LayoutMain(int argc, char *argv[])
 {
     int run, parse;
     char *src = NULL;

The first line in the preceding patch (the line prefixed with Index:) specifies the pathname of the file to be patched. Assuming that the patch is contained in a file named patch.txt, and that patch.txt has been copied to the same relative location in the target tree (here, the directory that contains layout), issuing the following command is sufficient for patch to locate the files that are specified in the patch file:

$ patch -p0 < patch.txt

The -p argument will remove the smallest prefix containing the specified number of leading slashes from each filename in the patch file, using the result to locate the file in the local source tree. Because the patch file was copied to the same relative location of the target tree that was used to generate the patch file in the source tree, we must use -p0 because we do not want patch to remove any portion of the path names when searching for files. If -p1 were used (with the same patch file, located in the same place in the target tree), the pathname layout/layout.cpp would be reduced to layout.cpp, and as a result, patch would not be able to locate the file, because the file would not be located in the current directory. Copying the patch file down into the layout directory would fix this, but only if the patch file affected only sources that were located in the layout directory, because the -p value is applied by patch to all sources represented in the patch file.
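The effect of the strip count is easy to try on a throwaway tree. The following is a sketch only: the src and target directories and the file contents are hypothetical, the patch is made with plain diff rather than cvs diff, and the -f and --dry-run options assume GNU patch (-f is used here just to keep patch from prompting for a filename when it cannot find one).

```shell
set -eu
work=$(mktemp -d)        # scratch area; the tree layout here is hypothetical
cd "$work"
mkdir -p src/layout target/layout

printf 'int main(void)\n{\n    return 0;\n}\n' > target/layout/layout.cpp
cp target/layout/layout.cpp src/layout/layout.cpp.orig
sed 's/main/LayoutMain/' src/layout/layout.cpp.orig > src/layout/layout.cpp

# Run diff from the top of the source tree so the patch records layout/... paths.
(cd src && diff -u layout/layout.cpp.orig layout/layout.cpp > ../patch.txt) || [ $? -eq 1 ]

cd target
# -p1 from the top of the tree: layout/layout.cpp is reduced to layout.cpp,
# which does not exist in this directory.
if patch -p1 -f --dry-run < ../patch.txt > /dev/null 2>&1; then
    p1_from_top=found
else
    p1_from_top=not-found
fi

# -p1 from inside layout/: the stripped name now matches a file.
if (cd layout && patch -p1 -f --dry-run < ../../patch.txt > /dev/null 2>&1); then
    p1_from_layout=found
else
    p1_from_layout=not-found
fi

# -p0 from the top of the tree: paths are used as written, and the patch applies.
patch -p0 < ../patch.txt
echo "-p1 at top: $p1_from_top, -p1 in layout/: $p1_from_layout"
```

The two dry runs change nothing on disk; only the final -p0 invocation modifies target/layout/layout.cpp.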

Dealing with Rejects

So much for identifying which files to patch. The second difficulty you may run into is rejects. If patch is unable to perform the patch operation, it will announce this fact, and do one of two things. Either it will generate a reject file, which is a file in the same directory as the file being patched, but with a .rej suffix (for example, bar.cpp.rej), or it will place text inside of the patched file to identify the lines that it was unable to resolve. (The --dry-run option can be used to preview the work performed by patch. As the name implies, it will cause patch to do a "dry run" of the patch operation, to let you know if it will succeed, without actually changing any of the target files.)
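A reject is easy to provoke on purpose, which makes it a good way to see what the preview and the .rej file look like. This sketch assumes GNU patch for the --dry-run option; the file names (before.txt, after.txt, bar.txt) and their contents are invented.

```shell
set -eu
work=$(mktemp -d)        # scratch area; file names are hypothetical
cd "$work"

printf 'alpha\nbeta\ngamma\n' > before.txt
printf 'alpha\nBETA\ngamma\n' > after.txt
diff -u before.txt after.txt > change.patch || [ $? -eq 1 ]

# The local copy has diverged on the very line the patch wants to change.
printf 'alpha\nbeta (edited locally)\ngamma\n' > bar.txt

# Preview only: a nonzero exit status means some hunk would be rejected.
if patch --dry-run bar.txt < change.patch; then preview=clean; else preview=rejects; fi
if [ -e bar.txt.rej ]; then rej_after_preview=yes; else rej_after_preview=no; fi

# The real run keeps the local line in bar.txt and saves the failed hunk
# in bar.txt.rej.
patch bar.txt < change.patch || true
cat bar.txt.rej
```

Checking the exit status of the dry run before applying a patch for real is a cheap way to find out about rejects while the target tree is still untouched.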

If either of these situations happens, there are a few ways to deal with it. The first thing I would do is remove the original source file, re-pull it from the repository using cvs update, and try to reapply the patch, in case I was applying the patch to a file that was not up-to-date with the tip. If this didn't work, I would contact the person who generated the patch and ask that person to verify that his or her source tree was up-to-date at the time the patch file was generated. If it was not, I would ask that person to run cvs update on the file or files and generate a new patch file.

If neither of these strategies works, what happens next depends on the type of output generated by patch. If patch created a .rej file, I would open it and the source file being patched in an editor, and manually copy and paste the contents of the .rej file into the source, consulting with the author of the patch file in case there are situations that are not clear. If, on the other hand, patch inlined the errors instead of generating a .rej file, open the source that was patched and search for lines containing <<<. These lines (and ones containing >>>) delimit the original and new source changes that were in conflict. By careful inspection of the patch output, and perhaps some consultation with the author of the patch, you should be able to identify which portions of the resulting output should stay, and which portions of the result need to go, and perform the appropriate editing in the file to come up with the desired final result.
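On a large patch, it helps to list everything that needs manual attention before opening an editor. The following sketch searches a tree for leftover .rej files and for files that contain conflict markers. The two sample files are written by hand purely so that the search has something to find; they are not real patch output, and the exact layout of inlined conflicts may differ between patch versions.

```shell
set -eu
work=$(mktemp -d)        # scratch area; the sample files are hand-written
cd "$work"
mkdir -p layout

# Hypothetical leftovers from a patch run that did not apply cleanly.
printf '@@ -1 +1 @@\n-old\n+new\n' > layout/bar.cpp.rej
printf 'int a;\n<<<<<<<\nint b;\n=======\nint c;\n>>>>>>>\n' > layout/baz.cpp

# Reject files that still need to be merged by hand.
rejects=$(find . -name '*.rej')

# Files with lines that start with conflict markers.
marked=$(grep -rlE '^(<<<<<<<|>>>>>>>)' . || true)

echo "reject files: $rejects"
echo "files with conflict markers: $marked"
```

Once both lists are empty, and the .rej files have been deleted, the tree is ready to build and test.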

Thankfully, problems like this happen only rarely. The two most common causes of conflict are a developer accepting a patch that affects code that he or she has also modified, and a patch file created against a different baseline. There is little to do to avoid the first case, other than better communication among developers to ensure that they are not modifying the same code at the same time. The second case is usually avoided when developers are conscientious about ensuring that their source trees (and patches) are consistent with the contents of the CVS repository. When this is done, problems are relatively rare, and if they do occur, they are usually slight and easy to deal with.

Patch and Cross-Platform Development

Now that you have an idea of why patch is so important to open source software, and how to use it, I need to describe how patch can make developing cross-platform software easier. At Netscape, each developer had a Mac, PC, and Linux system in his or her cubicle, or was at least encouraged to have one of each platform. (Not all did, in reality.) Each developer, like most of us, tended to specialize in one platform. (There were many Windows developers, a group of Mac developers, and a small handful of Linux developers.) As a result, it would only be natural that each developer did the majority of his or her work on the platform of his or her choice.

At Netscape, to overcome the tendency for the Windows developers to ignore the Linux and Macintosh platforms (I'm not picking on just Windows developers; Macintosh and Linux developers at Netscape were just as likely to avoid the other platforms, too), it was required that each developer ensure that all changes made to the repository correctly built and executed on each of the primary supported platforms, not just the developer's primary platform. To do this, some developers installed Network File System (NFS) clients on their Macintosh and Windows machines, and then pulled sources from the repository only on the Linux machine, to which both Mac and Windows machines had mounts. Effectively, Linux was a file server for the source, and the other platforms simply built off that source remotely. (The build system for Netscape/Mozilla allowed for this by isolating the output of builds; see Item 12.) This allowed, for example, the Windows developers to do all of their work on Windows, walk over to the Mac and Linux boxes, and do the required builds on those platforms, using the same source tree.

But what if NFS (or Samba, these days) was not available? Or, more likely, what if developers did not have all three platforms to build on (or, if they did, lacked the skills needed to make use of them)? In these cases, the patch program would come to the rescue. Developers could create a patch file, for example, on their Windows machine, and then either copy it to the other machines (where a clean source tree was waiting for it to be applied), or they could mail it to a "build buddy" who would build the source for those platforms that the developer was not equipped to build for. (Macintosh build buddies were highly sought after at Netscape because most developers at Netscape did not have the desire, or the necessary skills, to set up a Macintosh development system; it was much easier to ask one of the Macintosh helpers to be a build buddy.)

Netscape's policy was a good one, and patch was an important part of its implementation. The policy was a good one because, by forcing all check-ins to build and run on all three platforms at the same time, it made sure that the feature set of each of the three platforms moved forward at about the same pace (see Item 1). Mozilla, Netscape, and today, Firefox, pretty much work the same on Mac, Windows, and Linux, at the time of release. The combination of cvs diff, which accurately captured changes made to a local tree, and patch, which accurately merged these changes into a second copy of the tree, played a big role in enabling this sort of policy to be carried out, and allows projects such as Mozilla Firefox to continue to achieve cross-platform parity.
