Git Essentials

By William "Bo" Rothwell
May 6, 2017



Learn the concepts of Git, a version control system (VCS) for tracking changes in computer files and coordinating work on those files among multiple people. Topics include revision control concepts, Git installation, and Git features.

This chapter is from the book Linux for Developers: Jumpstart Your Linux Programming Skills

Learn More Buy

This chapter introduces you to Git, including how to install the necessary software to access Git servers where your software project will be stored.

Version Control Concepts

To understand Git and the concept of version control, it helps to look at version control from a historical perspective. There have been three generations of version control software.

The First Generation

The first generation was very simple. Developers worked on the same physical system and “checked out” one file at a time.

This generation of version control software made use of a technique called file locking. When a developer checked out a file, it was locked and no other developer could edit the file. Figure 12.1 illustrates the concept of this type of version control.

Figure 12.1 First-generation version control software

Examples of first-generation version control software include Revision Control System (RCS) and Source Code Control System (SCCS).
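
The file-locking technique described above can be sketched in a few lines of Python. This is only an illustrative toy, not how RCS or SCCS are implemented; the class name `LockingRepository` and its methods are hypothetical names chosen for this example.

```python
class LockingRepository:
    """Toy first-generation store: one developer per file at a time."""

    def __init__(self):
        self.files = {}   # file name -> list of saved versions
        self.locks = {}   # file name -> developer holding the lock

    def check_out(self, name, developer):
        """Lock the file for `developer` and return its latest content."""
        holder = self.locks.get(name)
        if holder is not None:
            raise RuntimeError(f"{name} is locked by {holder}")
        self.locks[name] = developer
        versions = self.files.setdefault(name, [""])  # version 0 is empty
        return versions[-1]

    def check_in(self, name, developer, content):
        """Save a new version and release the lock."""
        if self.locks.get(name) != developer:
            raise RuntimeError(f"{developer} does not hold the lock on {name}")
        self.files[name].append(content)
        del self.locks[name]
        return len(self.files[name]) - 1  # number of the new version


repo = LockingRepository()
repo.check_out("abc.txt", "bob")

# While Bob holds the lock, Sue cannot check the file out.
try:
    repo.check_out("abc.txt", "sue")
    sue_blocked = False
except RuntimeError:
    sue_blocked = True

version = repo.check_in("abc.txt", "bob", "hello\n")
```

Sue can check the file out only after Bob checks it in, which is the bottleneck that the second generation set out to remove.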

The Second Generation

The problems with the first generation included the following:

Only one developer can work on a file at a time. This results in a bottleneck in the development process.
Developers have to log in directly to the system that contains the version control software.

These problems were solved in the second generation of version control software. In the second generation, files are stored on a centralized server in a repository. Developers can check out separate copies of a file. When the developer completes work on a file, the file is checked in to the repository. Figure 12.2 illustrates the concept of this type of version control.

Figure 12.2 Second-generation version control software

If two developers check out the same version of a file and both change it, their changes can conflict. This is handled by a process called a merge.

What Is a Merge?

Suppose two developers, Bob and Sue, check out version 5 of a file named abc.txt. After Bob completes his work, he checks the file back in. Typically, this results in a new version of the file, version 6.

Sometime later, Sue checks in her file. This new file must incorporate her changes and Bob’s changes. This is accomplished through the process of a merge.

Depending on the version control software that you use, there could be different ways to handle this merge. In some cases, such as when Bob and Sue have worked on completely different parts of the file, the merge process is very simple. However, in cases in which Sue and Bob worked on the same lines of code in the file, the merge process can be more complex. In those cases, Sue will have to make decisions, such as whether Bob’s code or her code will be in the new version of the file.

After the merge process completes, the process of committing the file to the repository takes place. To commit a file essentially means to create a new version in the repository; in this case, version 7 of the file.
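
As a rough illustration of the merge described above, the following Python sketch compares both developers' copies against the version they started from, one line at a time. This comparison against a common base is often called a three-way merge; it is offered here as an assumption about how a tool might work, not as the algorithm used by CVS or Subversion. The function name `three_way_merge` is hypothetical, and the sketch assumes that neither developer adds or removes lines.

```python
def three_way_merge(base, ours, theirs):
    """Merge two edits of `base`, line by line.

    Simplifying assumption: neither edit adds or removes lines, so the
    three versions can be compared position by position.
    Returns (merged_lines, conflicts); conflicts holds the line indexes
    that both sides changed in different ways.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles same-length versions")
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:            # both sides agree
            merged.append(o)
        elif o == b:          # only "theirs" changed this line
            merged.append(t)
        elif t == b:          # only "ours" changed this line
            merged.append(o)
        else:                 # both changed it: a person must decide
            merged.append(o)  # keep "ours" as a placeholder
            conflicts.append(i)
    return merged, conflicts


version5 = ["title", "intro", "body"]
bob = ["title", "intro by Bob", "body"]   # already checked in as version 6
sue = ["title", "intro", "body by Sue"]   # Sue's working copy

# Bob and Sue touched different lines, so the merge is simple.
merged, conflicts = three_way_merge(version5, sue, bob)

# If Sue had edited the same line as Bob, she would have to choose.
sue_overlap = ["title", "intro by Sue", "body"]
_, overlap_conflicts = three_way_merge(version5, sue_overlap, bob)
```

In the first call every change is kept and no decision is needed. In the second call line 1 is reported as a conflict, which corresponds to the case in which Sue must decide whose code goes into the new version.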

Examples of second-generation version control software include Concurrent Versions System (CVS) and Subversion.

The Third Generation

The third generation is referred to as Distributed Version Control Systems (DVCSs). As with the second generation, a central repository server contains all the files for the project. However, developers don’t check out individual files from the repository. Instead, the entire project is checked out, allowing the developer to work on the complete set of files rather than just individual files. Figure 12.3 illustrates the concept of this type of version control.

Figure 12.3 Third-generation version control software

Another (very big) difference between the second and third generations of version control software has to do with how the merge and commit process works. As previously mentioned, the steps in the second generation are to perform a merge and then commit the new version to the repository.

With third-generation version control software, files are checked in and then they are merged. To understand the difference between these two techniques, first look at Figure 12.4.

Figure 12.4 Second-generation merge and commit

In phase 1 of Figure 12.4, two developers check out a file that is based on the third version. In phase 2, one developer checks that file in, resulting in a version 4 of the file.

In phase 3 the second developer must first merge the changes from his checked-out copy with the changes of version 4 (and, potentially, other versions). After the merge is complete, the new version can be committed to the repository as version 5.
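
The merge-then-commit rule in these three phases can be modeled with a small Python sketch. It is a toy under stated assumptions: the names `CentralRepository` and `StaleCheckIn` are hypothetical, the repository holds a single file, and the merge is done by hand with string concatenation.

```python
class StaleCheckIn(Exception):
    """Raised when a check-in is not based on the latest version."""


class CentralRepository:
    """Toy second-generation repository holding one file."""

    def __init__(self, initial):
        self.versions = [initial]      # versions[0] is ver1

    def latest(self):
        return len(self.versions)      # 1-based version number

    def check_out(self):
        return self.latest(), self.versions[-1]

    def check_in(self, based_on, content):
        # A check-in is accepted only if it builds on the newest version.
        if based_on != self.latest():
            raise StaleCheckIn(
                f"based on ver{based_on}, repository is at "
                f"ver{self.latest()}; merge first"
            )
        self.versions.append(content)
        return self.latest()


repo = CentralRepository("ver1 text")
repo.check_in(1, "ver2 text")
repo.check_in(2, "ver3 text")

# Phase 1: both developers check out version 3.
base_a, text_a = repo.check_out()
base_b, text_b = repo.check_out()

# Phase 2: the first developer checks in, creating version 4.
ver4 = repo.check_in(base_a, text_a + " + A's change")

# Phase 3: the second check-in is refused until a merge is done.
try:
    repo.check_in(base_b, text_b + " + B's change")
    refused = False
except StaleCheckIn:
    refused = True

# The second developer merges by hand, then commits version 5.
latest, latest_text = repo.check_out()
ver5 = repo.check_in(latest, latest_text + " + B's change")
```

Because every accepted check-in must build on the newest version, the list of versions can only grow in a single line.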

If you focus on what is in the repository (the center part of each phase), you will see that there is a very straight line of development (ver1, ver2, ver3, ver4, ver5, and so on). This simple approach to software development poses some potential problems:

Requiring a developer to merge before committing often discourages developers from committing their changes on a regular basis. The merge process can be tedious, so developers might decide to wait and do one large merge later rather than many small, regular merges. This has a negative impact on software development because huge chunks of code are suddenly added to a file. Additionally, you want to encourage developers to commit changes to the repository, just as you want to encourage someone who is writing a document to save on a regular basis.
Very important: Version 5 in this example is not necessarily the work that the developer originally completed. During the merging process, the developer might discard some of his work to complete the merge process. This isn’t ideal because it results in the loss of potentially good code.

A better, although arguably more complex, technique can be employed. It is based on a structure called a Directed Acyclic Graph (DAG), and you can see an example of how it works in Figure 12.5.

Figure 12.5 Third-generation commit and merge

Phases 1 and 2 are the same as shown in Figure 12.4. However, note that in phase 3 the second “check in” process results in a version 5 file that is not based on version 4, but rather independent of version 4. In phase 4 of the process, versions 4 and 5 of the file have been merged to create a version 6.
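
The four phases above can be represented as a small graph in which each version records the version or versions it was built from. The Python sketch below is a minimal model for illustration only; the `Commit` class and `history` function are hypothetical names, and this is not how Git stores its data internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    name: str
    parents: tuple = ()   # the commits this one was built from


def history(commit):
    """Return the names of `commit` and every commit it descends from."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current.name in seen:
            continue
        seen.add(current.name)
        stack.extend(current.parents)
    return seen


ver3 = Commit("ver3")
ver4 = Commit("ver4", (ver3,))         # first developer's check-in
ver5 = Commit("ver5", (ver3,))         # second check-in, independent of ver4
ver6 = Commit("ver6", (ver4, ver5))    # merge of ver4 and ver5

ver5_history = history(ver5)
ver6_history = history(ver6)
```

In this model `ver5_history` does not contain `ver4`, which reflects that the second check-in is independent of the first, while `ver6` has two parents and so brings both lines of work together. Links only point from a version back to the versions it was built from, so the graph never loops back on itself, hence "acyclic."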

Although this process is more complex (and, potentially, much more complex if you have a large number of developers), it does provide some advantages over a “single line” of development:

Developers can commit their changes on a regular basis and not have to worry about merging until a later time.
The merging process could be delegated to a specific developer who has a better idea of the entire project or code than the other developers have.
At any time, the project manager can go back and see exactly what work each individual developer created.

Certainly an argument exists for both methods. However, keep in mind that this book focuses on Git, which uses the Directed Acyclic Graph method of third-generation version control systems.
