Why Distributed Version Control Systems?

Version Control System (VCS). Are you using one? Of course you are; who wouldn’t want a means of keeping track of changes to files in an organized manner? This is especially true when a team of people collaborates on a project. For years, version control was handled by software such as CVS (Concurrent Versions System) and SVN (Subversion). These applications maintain a single repository of files that users read from and write to, an approach known as a Centralized Version Control System (CVCS). Several years ago this model was challenged by a new paradigm known as the Distributed Version Control System (DVCS). Two of the most popular DVCS are Mercurial and Git, with Git being especially well known because of its use on GitHub.com.

What is a Distributed Version Control System?

Where CVCS has a single repository of files to be accessed by users, DVCS replicates the repository of files onto each user’s machine. Each replicated repository has a full history of the project with all of the metadata of the original. This makes version control with DVCS self-contained in that a user doesn’t need to be connected to the central repository in order to perform version control tasks locally.
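To make the idea of replication concrete, here is a minimal sketch in Python. It is a toy model, not how Git or Mercurial are implemented: the Repository class, its commit and clone methods, and the repository names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Toy repository (hypothetical): an ordered list of changeset messages."""
    name: str
    history: list = field(default_factory=list)

    def commit(self, message):
        # Committing only touches this repository's own history.
        self.history.append(message)

    def clone(self, name):
        # A clone receives a full, independent copy of the history.
        return Repository(name, list(self.history))


central = Repository("central")
central.commit("initial import")
central.commit("add README")

# The clone carries the whole history, so it can be used on its own.
laptop = central.clone("laptop")
laptop.commit("fix typo")  # no connection to "central" is involved

print(laptop.history)   # full history plus the new local changeset
print(central.history)  # unchanged until the change is pushed
```

The point of the sketch is that the clone is a complete repository in its own right, which is why local version control tasks do not depend on the original.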

Why DVCS over CVCS?

With that said, why consider DVCS for a project in place of CVCS? A few key advantages really shine when DVCS is added to a workflow.

Advantages

  • Actions other than pushing and pulling changesets are extremely fast because DVCS only needs to access the local hard drive, not a remote server.
  • New changesets can be committed locally without anyone else seeing them. Once a group of changesets is complete, they can all be pushed at once.
  • Everything except pushing and pulling can be done without an Internet connection. For example, work done on an airplane can still be committed in small steps, rather than being lumped into one large changeset once a connection is available.
  • Since each user has a full copy of the project repository, individuals can share changes with one another as they see fit, without pushing those changes for everyone to see.
  • Context switching using branches is simple and quick. This works well when alternating between bug fixes and new features in software development. Because the system is distributed, individuals can create personal branches that are not readily accessible to others.
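The commit-locally, push-later workflow and the peer-to-peer sharing described above can be sketched with another toy model in Python. Everything here is an assumption made for illustration: the Repo class, its push method, and the names alice, bob, and shared are hypothetical, and histories are kept strictly linear so that no merging is needed. Real systems handle diverging histories with merges.

```python
class Repo:
    """Toy repository (hypothetical) with a linear history of changesets."""

    def __init__(self, name, history=None):
        self.name = name
        self.history = list(history or [])

    def commit(self, message):
        # Local operation: nobody else sees this changeset yet.
        self.history.append(message)

    def push(self, other):
        """Send every changeset `other` lacks and return how many were sent.

        Simplifying assumption: `other` must hold a prefix of our history.
        """
        if other.history != self.history[:len(other.history)]:
            raise ValueError("histories diverged; a real DVCS would need a merge")
        missing = self.history[len(other.history):]
        other.history.extend(missing)
        return len(missing)


shared = Repo("shared", ["initial import"])
alice = Repo("alice", shared.history)
bob = Repo("bob", shared.history)

# Alice commits twice locally, for example while offline.
alice.commit("start feature")
alice.commit("finish feature")

# She shares directly with Bob without touching the shared repository.
sent_to_bob = alice.push(bob)
shared_size_before = len(shared.history)

# Later, both changesets go to the shared repository in a single push.
sent_to_shared = alice.push(shared)

print(sent_to_bob, shared_size_before, sent_to_shared)
```

In this sketch the shared repository is just another peer. Treating one repository as "central" is a team convention, not something the model requires.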

Although DVCS provides new capabilities to users, it does have some inherent disadvantages. 

Disadvantages

  • If a project contains many large binary files that cannot be easily compressed, the space needed to store every version of those files can grow quickly.
  • If a project has a very long history (50,000 changesets or more), downloading the entire history can take an impractical amount of time and disk space.
  • The ramp-up time for DVCS is also a disadvantage, although for many people only a temporary one. The concepts and terminology used with DVCS are new and can be unfamiliar. Even more confusing are the terms and concepts that appear to match those used in CVCS but behave differently, so actual results can differ vastly from what a user expects.

Outie

Version control is a great asset to most workflows where it is important to track changes and to provide a repository of files for others to access. DVCS has gained a great deal of attention in recent years, given the explosion of open source projects. Two of the most popular DVCS to get started with are Git and Mercurial. Either CVCS or DVCS is a good idea for any project; however, DVCS provides benefits that align well with modern web projects and workflows.

A personal blog by Jason Hill about coding, technology, hobbies, and much more
Copyright © 2022 Jason Hill Rocks. All Rights Reserved.