Item 13: Use CVS or Subversion to Manage Source Code

A source code management (SCM) system does exactly what the name implies: it helps you manage source code and related resources. To appreciate why SCM is important, consider for a moment what life without it might be like. Assume that you are a developer working on a new project that will consist of C++ source files, a Makefile, and perhaps some resources such as icons or images. Obviously, these items must live somewhere, and so on day one of the project, you create a directory on your local desktop system where these files will live, and write a few hundred lines of code, storing the results in that directory. After a few weeks of hacking, you come up with version 0.1 of your creation, and after some light testing, you decide that it is ready to be pushed out onto the Web, with the hope of generating some feedback from a small community of users.

After a few weeks, your e-mail inbox has accumulated several feature requests from users, and perhaps a dozen or so bug reports (in addition to a ton of spam because you gave out your e-mail address to the public). You get busy implementing some of these features and fixing the worst of the bugs, and after a few more weeks, you are ready to post version 0.2. This process repeats itself for a few months, and before you know it, you are shipping version 1.0 to an even wider audience.

The 1.0 release is a success, but isn't without its share of problems. First off, users are beginning to report a nasty bug in a feature that was first introduced in version 0.8, and was working flawlessly until version 1.0 was released to the public. You are able to duplicate the bug in a release version of the 1.0 binary, but can't duplicate it in a debugger. In an attempt to understand the problem, you pore over the code in search of clues; but after numerous hours of looking, you realize that you have no idea what might have caused this bug to surface. About the only way you can think of identifying the cause of the problem is to determine what specific changes you made to the codebase that might have led to the bug's manifestation. However, all you have to work with is the 1.0 source code, and you have no way of identifying the changes you made between 0.9 and 1.0.

The second problem before you is a request. It turns out that in version 1.0 you removed a feature that was present in earlier versions, and removing the feature has angered a lot of your users, who are clamoring for it to be reinstated in version 1.1. However, the code for this feature no longer exists, having been deleted from the source code long ago.

It is these two situations that, in my experience, make the use of a source code management system a necessity, cross-platform or not, no matter how large the project is or how many developers are involved. A good source code management system will allow you to re-create, in an instant, a snapshot of the source tree at some point in the past, either in terms of a specific date and time, or in terms of a specific release version. It will also help you to keep track of where and when changes to the source code have been made, so that you can go back and isolate specific changes related to a feature or a bug fix.

Had the developer used a source code management system, he or she could have retrieved versions of the source code starting at 0.9 and used them to isolate exactly what change(s) to the source caused the bug to first surface. And, to recover the feature that was removed in 1.0, the developer could have used the source code management system to retrieve the code associated with the feature, and undo its removal (or reengineer it back into the current version of the source code).

The benefits of a source code management system increase significantly as soon as multiple developers are assigned to a project. The main benefits from the point of view of a multideveloper project are accountability and consistency. To see how these benefits are realized, I need to describe in more detail how a source code management system works. Earlier, I described how a developer will typically manage a body of source code, in the absence of a source code management system, in a directory, which is usually created somewhere on the developer's system where a single copy of the source is stored and edited.

The use of a source code management system changes things dramatically, however. When using a source code management system, the source code for a project is maintained in something called a repository, which you can think of as being a database that stores the master copy of a project's source code and related files. To work with the project source code, a developer retrieves a copy of the source code from the repository. Changes made to the local copy of the source code do not affect the repository. When the developer is done making changes to his or her copy of the source, he or she submits the changes to the source code management system, which will update the master copy maintained in the repository. It is important to realize that the repository records only the changes made to the file, instead of a complete copy of the latest version. By storing only changes, you can easily retrieve earlier versions of files stored in the repository. The date of the change, the name or the ID of the developer who made the change, and any comments provided by the developer along with the change are all stored in the repository along with the change itself. The implications for developer accountability should be obvious; at any time, you can query the source code management system for a log of changes, when they were made, and by whom. This is a great help in locating the source of bugs and who may have caused them.

The ability to attribute a change in the repository to a developer, bug, or feature is directly affected by the granularity of check-ins made by developers on the project. Frequent, small changes to the repository will increase the ability of a developer to use the source code management system to identify and isolate changes, and will also help ensure that other developers on the project gain access to the latest changes in a timely manner. A good rule of thumb is to limit the number of bugs fixed by a check-in to the repository to one (unless there are multiple, related bugs fixed by the same change).

So, now that you know the basic ideas behind using an SCM, let's talk briefly about the implications for portability. First off, using an SCM is not a magic pill that makes your project portable. Portability requires attention to a lot more than just a source code management system. (If that were not the case, this book would not need to be written.) But using a source code management system that is available on each of the platforms that your organization is supporting (or plans to support) is, in my view, a critical part of any successful cross-platform project. It does no one any good if only Windows developers are able to pull source code, but Linux and Macintosh developers are left without a solution, after all. Not only should the SCM software be available everywhere, it should at least support a "lowest common denominator" user interface that behaves the same on all platforms, and to me, that means that the user interface needs to be command line based (both CVS and Subversion [SVN] support a command-line interface).

Because cross-platform availability and a common user interface are requirements, there are only two choices for an SCM system that I can see at the time of writing this book: CVS and SVN. At Netscape, and at countless other places (open source or not), CVS is the SCM of choice. It has stood the test of time, and is capable. It has been ported nearly everywhere, and its user interface is command line based. A very close cousin of CVS is SVN. After using SVN in a professional project, I have come to the conclusion that for the programmers using it, SVN is quite similar to CVS in terms of how one approaches it and the commands that it offers, so either would be a good choice. (It is not without its quirks, however.) In this book, when I refer to an SCM, I am referring to CVS, but I could have easily said the same thing about SVN.

Besides providing a location from which Tinderbox can pull sources (see Item 12) and its support for Windows, Mac OS X, and Linux, perhaps the most important contribution of CVS to cross-platform development is its ability to create diff (or patch) files. The implications to cross-platform development of patch files are detailed in Item 14; in the following paragraphs, I describe what a patch file is and how CVS can be used to create a patch file.

A diff file, or a patch, is created by executing the cvs diff command. For example, assume I have added a method called GetAlignment() to a file named nsLabel.h in the Mozilla source tree. By typing cvs diff, I can easily identify the lines containing changes that I made:

$ cvs diff
cvs server: Diffing .
Index: nsLabel.h
===================================================================
RCS file: /cvsroot/mozilla/widget/src/gtk/nsLabel.h,v
retrieving revision 1.21
diff -r1.21 nsLabel.h
61a62
>   NS_IMETHOD GetAlignment(nsLabelAlignment *aAlignment);
71d71
<   GtkJustification GetNativeAlignment();

The preceding output tells us that a line was added around line 61 of the file nsLabel.h, and one was removed around line 71 of the file. I can take this output, mail it to others on my team, and ask them to review it for errors or comments before checking in the changes. I can also look at this patch and make sure that it contains only those changes that I intended to land in the repository. I can't stress enough how important cvs diff is as a tool for identifying inadvertent check-ins before they are made.
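The change commands in the default format (61a62 and 71d71 in the listing above) follow the notation used by the ordinary diff utility. As a minimal sketch, the following reproduces the same kind of output with plain diff on two scratch files; the file names and contents are made up for illustration and have nothing to do with the Mozilla tree.

```shell
#!/bin/sh
# Sketch: produce a default-format diff from two scratch files
# (hypothetical contents, not the Mozilla sources).
workdir=$(mktemp -d)
printf '%s\n' a b c d > "$workdir/old.txt"
printf '%s\n' a X b c > "$workdir/new.txt"

# diff exits with status 1 when the files differ, hence "|| true".
normal=$(diff "$workdir/old.txt" "$workdir/new.txt" || true)
printf '%s\n' "$normal"

# Keep only the change commands, which are the lines starting with a digit.
commands=$(printf '%s\n' "$normal" | grep '^[0-9]')
rm -rf "$workdir"
```

Here 1a2 means that, after line 1 of the old file, the text that is now line 2 of the new file was added, and 4d4 means that line 4 of the old file was deleted. Lines prefixed with > come from the new file, and lines prefixed with < come from the old one.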

With a lot of changes, the default output format that is shown here can be difficult to understand. A better output would show context lines, and make it more obvious which lines were added to the source, and which lines were deleted. The -u argument to cvs diff causes it to generate a "unified" diff, as follows:

$ cvs diff -u
cvs server: Diffing .
Index: nsLabel.h
===================================================================
RCS file: /cvsroot/mozilla/widget/src/gtk/nsLabel.h,v
retrieving revision 1.21
diff -u -r1.21 nsLabel.h
--- nsLabel.h   28 Sep 2001 20:11:17 -0000      1.21
+++ nsLabel.h   1 Feb 2004 02:47:21 -0000
@@ -59,6 +59,7 @@
   NS_IMETHOD SetLabel(const nsString &aText);
   NS_IMETHOD GetLabel(nsString &aBuffer);
   NS_IMETHOD SetAlignment(nsLabelAlignment aAlignment);
+  NS_IMETHOD GetAlignment(nsLabelAlignment *aAlignment);

   NS_IMETHOD PreCreateWidget(nsWidgetInitData *aInitData);

@@ -68,7 +69,6 @@

 protected:
   NS_METHOD CreateNative(GtkObject *parentWindow);
-  GtkJustification GetNativeAlignment();

   nsLabelAlignment mAlignment;

The differences in the output are the inclusion of context lines before and after the affected lines, and the use of + and - to indicate lines that have been added, or removed, respectively, from the source. This format is generally much easier on everyone who must read the patch, and it is the format that I recommend you use. You can change the number of lines of context generated by cvs diff by appending a count after the -u argument. For example, to generate only one line of context, issue the following command:

$ cvs diff -u1
cvs server: Diffing .
Index: nsLabel.h
===================================================================
RCS file: /cvsroot/mozilla/widget/src/gtk/nsLabel.h,v
retrieving revision 1.21
diff -u -1 -r1.21 nsLabel.h
--- nsLabel.h   28 Sep 2001 20:11:17 -0000      1.21
+++ nsLabel.h   1 Feb 2004 02:50:45 -0000
@@ -61,2 +61,3 @@
   NS_IMETHOD SetAlignment(nsLabelAlignment aAlignment);
+  NS_IMETHOD GetAlignment(nsLabelAlignment *aAlignment);

@@ -70,3 +71,2 @@
   NS_METHOD CreateNative(GtkObject *parentWindow);
-  GtkJustification GetNativeAlignment();

Generally, you'll want to generate between three and five lines of context for patches of moderate complexity. I use -u3 almost religiously, and three is also the default number of context lines for svn diff (which produces unified diffs by default, too). However, don't be surprised if developers working with your patch files ask for more lines of context.
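To get a feel for how the context count changes the size of a patch without needing a CVS checkout, the effect can be sketched with the plain diff utility, whose -U option takes the number of context lines. The scratch files below are hypothetical, and this is only an illustration of the idea, not a substitute for cvs diff.

```shell
#!/bin/sh
# Sketch: compare three lines of context with one line of context using
# plain diff(1) on two scratch files (hypothetical contents).
workdir=$(mktemp -d)
printf '%s\n' a b c d e f g h i > "$workdir/old.txt"
printf '%s\n' a b c d X e f g h i > "$workdir/new.txt"

# diff exits with status 1 when the files differ, so guard it with "|| true".
three=$(diff -U3 "$workdir/old.txt" "$workdir/new.txt" || true)
one=$(diff -U1 "$workdir/old.txt" "$workdir/new.txt" || true)

# Context lines are the hunk lines that start with a single space.
ctx3=$(printf '%s\n' "$three" | grep -c '^ ' || true)
ctx1=$(printf '%s\n' "$one" | grep -c '^ ' || true)
echo "context lines with -U3: $ctx3"
echo "context lines with -U1: $ctx1"
rm -rf "$workdir"
```

With a single added line in the middle of the file, the first diff carries three lines of context on each side of the change and the second carries one on each side.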

Setting Up and Using CVS

For those of you who are interested, I describe the steps it takes to set up a CVS server on a Red Hat-based system. The steps I provide here are almost certainly going to be the same when executed on non-Red Hat Linux systems, and may differ in certain ways on other UNIX-based systems, including Mac OS X. The work involved in getting a CVS server up and running is not terribly difficult, and can be done in a relatively short amount of time. You will need root access to the system upon which you are installing the server, and it will help to have a second system with a CVS client so that you can test the result.

That said, if doing system administration makes you nervous, or site policy disallows it, or you do not have root access, check with a local guru or your system administrator for help.

To start the process of getting a CVS server running, you need to download the source for CVS from the Internet, build it, and install it. I retrieved a nonstable version of CVS by downloading the file cvs-1.11.22.tar.gz from http://ftp.gnu.org/non-gnu/cvs/source/stable/1.1.22. You are probably best off, however, grabbing the latest stable version you can find.

After you have unpacked the file, cd into the top-level directory (in my case, cvs-1.11.22), and enter the following to build and install the source:

$ ./configure
$ make
$ su
# make install

Next, while still logged in as root, you need to do some work so that the CVS server daemon executes each time the system is rebooted. The first step is to check to see whether entries like the following are located in /etc/services:

cvspserver 2401/tcp     # CVS client/server operations
cvspserver 2401/udp     # CVS client/server operations

If these lines don't exist, add them to /etc/services as shown. Next, you need to create a file named cvspserver in /etc/xinetd.d that contains the following:

service cvspserver
{
   socket_type = stream
   protocol    = tcp
   wait        = no
   user        = root
   passenv     = PATH
   server      = /usr/bin/cvs
   server_args = -f --allow-root=/usr/cvsroot pserver
}

Make sure the permissions of this file are -rw-r--r--, and that its group and owner are root. This is probably the default, but it doesn't hurt to check.
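The two checks just described (the cvspserver entry in /etc/services, and the permissions and ownership of the xinetd file) can be scripted. What follows is a minimal sketch; the function names and the SERVICES_FILE and XINETD_FILE variables are my own inventions for illustration and are not part of CVS or xinetd.

```shell
#!/bin/sh
# Sketch of the two checks described above. The function and variable
# names are illustrative only.

# Succeeds if the given services file lists cvspserver on 2401/tcp.
has_cvspserver() {
    grep -Eq '^cvspserver[[:space:]]+2401/tcp' "$1"
}

# Prints the ten-character permission string of a file, e.g. -rw-r--r--.
# Any trailing ACL or SELinux marker that ls may append is cut off.
file_mode() {
    ls -ld "$1" | cut -c1-10
}

# Prints the numeric "uid gid" of a file; root-owned files print "0 0".
file_owner() {
    ls -ldn "$1" | awk '{print $3, $4}'
}

services=${SERVICES_FILE:-/etc/services}
xinetd_file=${XINETD_FILE:-/etc/xinetd.d/cvspserver}

if [ -r "$services" ] && has_cvspserver "$services"; then
    echo "cvspserver entry found in $services"
else
    echo "cvspserver entry missing from $services"
fi

if [ -e "$xinetd_file" ]; then
    echo "$xinetd_file: mode $(file_mode "$xinetd_file"), owner $(file_owner "$xinetd_file")"
else
    echo "$xinetd_file does not exist"
fi
```

The script only reports what it finds; it does not modify either file, so it is safe to run before and after making the changes by hand.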

If you are not yet running the desktop graphical user interface (GUI), fire it up, and from the Red Hat Start menu, select System Settings, Users and Groups to launch the Red Hat User Manager.

In the dialog that is displayed, click the Add Group button and add a group named cvsadmin. Next, click the Add User button, and add a user named cvsuser. You will be asked to provide a password; enter in something you can remember, and when you are done, exit the Red Hat User Manager.

Back in a terminal, and still as root, enter the following:

# cd /usr
# mkdir cvsroot
# chmod 775 cvsroot
# chown cvsuser cvsroot
# chgrp cvsadmin cvsroot

The preceding commands create the root directory for the CVS server. The path /usr/cvsroot corresponds to the value used in the server_args field of the service aggregate that was created earlier in /etc/xinetd.d. The following commands create a locks directory below cvsroot:

# cd cvsroot
# mkdir locks
# chown cvsuser locks
# chgrp cvsadmin locks

Now that the directory exists for the repository, it is time to create the repository. You can do this by executing the cvs init command, as follows:

# cvs -d /usr/cvsroot init

The -d argument specifies the location of the repository.

Now that the repository has been created, change to your home directory (for example, /home/syd), and execute the following command, which will check out the CVSROOT module from the repository that was just created:

# cvs -d /usr/cvsroot checkout CVSROOT
cvs checkout: Updating CVSROOT
U CVSROOT/checkoutlist
U CVSROOT/commitinfo
U CVSROOT/config
U CVSROOT/cvswrappers
U CVSROOT/editinfo
U CVSROOT/loginfo
U CVSROOT/modules
U CVSROOT/notify
U CVSROOT/rcsinfo
U CVSROOT/taginfo
U CVSROOT/verifymsg

Next, cd into the CVSROOT directory that was created by the preceding command, and open up the file named config using your favorite editor. Make the contents of this file consistent with the following:

# Set this to "no" if pserver shouldn't check system
# users/passwords.
SystemAuth=no

# Put CVS lock files in this directory rather than
# directly in the repository.
#LockDir=/var/lock/cvs

# Set 'TopLevelAdmin' to 'yes' to create a CVS
# directory at the top level of the new working
# directory when using the 'cvs checkout' command.
TopLevelAdmin=yes

# Set 'LogHistory' to 'all' or 'TOFEWGCMAR' to log all
# transactions to the history file, or a subset as
# needed (ie 'TMAR' logs all write operations)
#LogHistory=TOFEWGCMAR

# Set 'RereadLogAfterVerify' to 'always' (the default)
# to allow the verifymsg script to change the log
# message. Set it to 'stat' to force CVS to verify
# that the file has changed before reading it. This can
# take up to an extra second per directory being
# committed, so it is not recommended for large
# repositories. Set it to 'never' (the previous CVS
# behavior) to prevent verifymsg scripts from changing
# the log message.
#RereadLogAfterVerify=always

After you have made changes to the config file, check it into CVS as follows:

# cvs commit
cvs commit: Examining .
Checking in config;
/usr/cvsroot/CVSROOT/config,v  <--  config
new revision: 1.2; previous revision: 1.1
done
cvs commit: Rebuilding administrative file database

In the same directory, run the following command to create a password for each user to whom you want to grant access to the repository. Every time you add a new developer to the project, you need to update the passwd file as I am about to describe, and check the changes into the repository:

# htpasswd passwd syd
New password:
Re-type new password:
Adding password for user syd

Now, open the file passwd (which was just created). At the end of the password, append :cvsuser. The result should look something like this:

syd:B9TyxNZ11EKb6:cvsuser
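If you have more than a handful of developers, appending the system user by hand gets tedious. As a minimal sketch, the edit can be done with awk; the add_system_user function below is a hypothetical helper of my own, not something that ships with CVS.

```shell
#!/bin/sh
# add_system_user SYSTEM_USER: read passwd lines on standard input and
# append ":SYSTEM_USER" to entries that contain only "name:hash".
# Lines that already have a third field are passed through unchanged.
add_system_user() {
    awk -F: -v u="$1" 'NF == 2 { print $0 ":" u; next } { print }'
}

# Example using the entry shown above.
result=$(printf '%s\n' 'syd:B9TyxNZ11EKb6' | add_system_user cvsuser)
echo "$result"
```

To apply this to the real file, you could run add_system_user cvsuser < passwd > passwd.new and, after inspecting passwd.new, move it over passwd before committing.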

Next, you must add the password file to the repository, and commit the result:

# cvs -d /usr/cvsroot add passwd
cvs add: scheduling file 'passwd' for addition
cvs add: use 'cvs commit' to add this file permanently
# cvs -d /usr/cvsroot commit
RCS file: /usr/cvsroot//CVSROOT/passwd,v
done
Checking in passwd;
/usr/cvsroot//CVSROOT/passwd,v  <--  passwd
initial revision: 1.1
done
cvs commit: Rebuilding administrative file database

This should result in two files in /usr/cvsroot/CVSROOT, one named passwd,v and the other named passwd. If there is not a file named passwd in /usr/cvsroot/CVSROOT (this could happen because of a bug in CVS), return to the checked-out version of CVSROOT (for example, the one in your home directory), edit the file named checkoutlist, and add a line to the end of the file that contains the text passwd. Then, doing a cvs commit on the checkoutlist file will cause the passwd file in /usr/cvsroot/CVSROOT to appear.

Now all that is left is to make the modules. Each directory you create under /usr/cvsroot is, logically, a project that is maintained in the repository. You can organize the hierarchy as you see fit. Here, I create a project named client:

# cd /usr/cvsroot
# mkdir client
# chown cvsuser client
# chgrp cvsadmin client

Now that we have created the repository, added a project, and set up some users, we can start the CVS server daemon by kicking xinetd:

# /etc/init.d/xinetd restart
Stopping xinetd:                             [  OK  ]
Starting xinetd:                             [  OK  ]

To ensure that the CVS server is running, run the following command:

# netstat -a | grep cvs

If you see output like the following, everything is in order, and you can use the repository:

tcp        0      0 *:cvspserver   *:*     LISTEN

To test out the new server and repository, find another machine, open up a shell (or a GUI CVS client if you prefer), and then check out the project named client. In the following example, I am using a command-line CVS client, and the server is located on my local network at the IP address 192.168.1.102:

$ cvs -d :pserver:syd@192.168.1.102:/usr/cvsroot login
(Logging in to syd@192.168.1.102)
CVS password:
$ cvs -d :pserver:syd@192.168.1.102:/usr/cvsroot co client
cvs server: Updating client

There now should be a directory named client in the current directory. If you cd into the client directory, you should see the following contents:

$ cd client
$ ls
CVS

At this point, you can add files and directories to the project with cvs add, and commit them to the repository using cvs commit.
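Typing the full -d argument for every command quickly becomes tiresome. CVS also reads the repository location from the CVSROOT environment variable, so a minimal sketch of a shell setup might look like the following. The cvs_user, cvs_host, and cvs_path variable names are my own, and the values are the ones from the example above; substitute your own.

```shell
#!/bin/sh
# Sketch: build the pserver repository string once and export it, so that
# later cvs commands can omit the -d argument.
cvs_user=syd
cvs_host=192.168.1.102
cvs_path=/usr/cvsroot

CVSROOT=":pserver:${cvs_user}@${cvs_host}:${cvs_path}"
export CVSROOT
echo "$CVSROOT"

# With CVSROOT exported, the earlier commands shorten to:
#   cvs login
#   cvs co client
```

Placing the export in your shell startup file makes the setting available in every new terminal.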
